Issue writing to HDP 2.0 from Talend 5.4.1
Tags: [bug, development, error, HDP 2.0, Hortonworks]
I downloaded the most recent Hortonworks HDP Sandbox (v2.0) and the latest Talend Open Studio for Big Data (v5.4.1) on 2/10/14. I can interact with the HDP VM and upload data to it through the Hue interface. However, when I try to upload data via Talend by following the tutorial at http://hortonworks.com/kb/how-to-connectwrite-a-file-to-hortonworks-sandbox-from-talend-studio/ I receive the following error (excerpted due to length):
[ERROR]: org.apache.hadoop.hdfs.DFSClient – Failed to close file /user/root/testfilez
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/root/testfilez could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
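With the sandbox, a "could only be replicated to 0 nodes" failure often means the NameNode RPC port is reachable from the Talend machine but the datanode's data-transfer port is not (so the client excludes the only datanode). A quick way to check is a plain TCP probe from the machine running Talend. This is a diagnostic sketch, not part of the job: the port numbers are the Hadoop 2 defaults (8020 NameNode RPC, 50010 datanode transfer, 50070 NameNode web UI) and may differ in a given setup.

```python
# Probe the Hadoop 2 default ports on the sandbox from the Talend machine.
# If 8020 is open but 50010 is closed, the client can talk to the NameNode
# but cannot stream blocks to the datanode, matching the error above.
import socket

def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for port in (8020, 50010, 50070):
    state = "open" if is_port_open("127.0.0.1", port) else "closed"
    print(f"127.0.0.1:{port} {state}")
```

If 50010 is closed, the VM's port forwarding (or the datanode hostname the NameNode hands back to clients) is the likely culprit rather than the Talend job itself.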
The file is created in HDFS, but it is 0 bytes and therefore empty.
Based on online research, I have verified that the datanode is running and is not full (see the attached cluster summary image). I have seen similar issues with other versions of HDP/Talend in online forums, but none of them provided a solution. The job uses a tRowGenerator to generate 100 rows and a tHDFSOutput to write them to HDFS, and every time I run it I get the error above. I have the following configured in Talend:
Hadoop Version: Hortonworks Data Platform V2.0.0 (BigWheel)
NameNode URI: "hdfs://127.0.0.1:8020/"
User name: "root" (have also tried "sandbox" and "hue")
File Name: "/user/root/testfile" (have tried "/" and "/user/hue/testfile")
Configured to generate 100 rows, each with two string columns and one int column.
The tHDFSConnection is connected to the tRowGenerator via an OnComponentOk trigger, and the tRowGenerator is connected to the tHDFSOutput via a Row Main.
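To separate a Talend misconfiguration from a cluster-side problem, the same write can be attempted outside Talend over WebHDFS, Hadoop 2's REST interface on the NameNode web port (50070 by default). The sketch below only builds the CREATE request URL; the host, port, user, and path mirror the values above and are assumptions about the local setup, not confirmed working values.

```python
# Build the WebHDFS CREATE URL for the same path the Talend job targets.
# WebHDFS CREATE is a two-step PUT: the NameNode replies with a 307 redirect
# to a datanode, and the data goes to that redirect target. If the redirect
# points at a hostname the Talend machine cannot resolve or reach, the write
# fails the same way the Talend job does.
from urllib.parse import urlencode

def webhdfs_create_url(host, port, path, user):
    """Return the WebHDFS v1 CREATE URL for an HDFS path (Hadoop 2 REST API)."""
    qs = urlencode({"op": "CREATE", "user.name": user, "overwrite": "true"})
    return f"http://{host}:{port}/webhdfs/v1{path}?{qs}"

url = webhdfs_create_url("127.0.0.1", 50070, "/user/root/testfile", "root")
print(url)
# Try it with e.g.:  curl -i -X PUT "<url>"  and inspect the Location header
# of the 307 response before sending the data in a second PUT.
```

If the redirect's Location header names a host such as the sandbox VM's internal hostname, the client machine needs to be able to resolve it (or the cluster needs to hand back an address the client can reach) before any HDFS write, Talend's included, can succeed.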
Can anyone please suggest a solution to this connectivity issue? Thanks in advance for your advice!