HOW TO: Connect/Write a File to Hortonworks Sandbox from Talend Studio

I recently needed to build some test data quickly for my Hadoop environment and was looking for a tool to help me out. It turns out this is a very simple process within Talend Studio (you can get the latest Talend Studio from the Talend site).

Here is how…

Step 1 – Generating Test Data within Talend Studio

  • Create a New Job within the Job Designer
  • Drag a tRowGenerator onto the Designer
  • Double-click on your tRowGenerator component and add the fields you want to generate
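
For readers who want to see what this step amounts to outside the Studio, here is a minimal Python sketch of the same idea: producing a fixed number of rows with randomly filled fields. It is not Talend's implementation, and the three columns (id, name, amount) and the function name generate_rows are hypothetical, chosen only for illustration.

```python
import random
import string


def generate_rows(count, seed=None):
    """Return `count` rows of made-up test data as (id, name, amount) tuples.

    The column layout is an assumption for illustration; in Talend you
    define your own fields in the tRowGenerator editor.
    """
    rng = random.Random(seed)  # a seed makes the generated data repeatable
    rows = []
    for row_id in range(1, count + 1):
        # Random 8-letter lowercase string for the name column
        name = "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
        # Random amount between 0 and 1000, rounded to two decimals
        amount = round(rng.uniform(0, 1000), 2)
        rows.append((row_id, name, amount))
    return rows


if __name__ == "__main__":
    for row in generate_rows(5, seed=42):
        print(row)
```

Passing a seed is a deliberate choice here: repeatable test data makes it easier to compare what ends up in HDFS with what was generated.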

Step 2 – Connecting to HDFS from Talend

  • Drag a tHDFSConnection onto the Designer
  • Change the “Name Node URI” property to point to your Hortonworks Sandbox on port 8020.
  • Change the connection’s user name to “sandbox”.
  • Right-click on the tHDFSConnection and add an OK trigger (such as On Subjob Ok) that connects the tHDFSConnection to the tRowGenerator
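
If the connection fails, it can help to confirm that the NameNode port is reachable before debugging the job itself. The sketch below builds the “Name Node URI” string in the form used in this post and does a plain TCP reachability check. The helper names and the example IP address are hypothetical, and an open port only shows that something is listening there, not that HDFS is healthy.

```python
import socket


def namenode_uri(host, port=8020):
    """Build the "Name Node URI" value for a given sandbox host."""
    return f"hdfs://{host}:{port}/"


def port_is_open(host, port=8020, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names
        return False


# Example usage (192.168.56.101 is a placeholder; use your sandbox IP):
#   print(namenode_uri("192.168.56.101"))
#   print(port_is_open("192.168.56.101"))
```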

Step 3 – Writing to HDFS

  • Drag a tHDFSOutput onto the Designer
  • Change the “Name Node URI” property to point to your Hortonworks Sandbox on port 8020. Example: “hdfs://<YOUR SANDBOX IP>:8020/”
  • Change the connection’s user name to “sandbox”.
  • Set the name of the output file in the File Name field
  • Right-click on the tRowGenerator and add a Row > Main link that connects the tRowGenerator to the tHDFSOutput
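
For comparison, here is a rough sketch of writing the same kind of delimited file to HDFS without Talend, using Hadoop's WebHDFS REST API from Python's standard library. Several things here are assumptions rather than part of the steps above: that WebHDFS is enabled on the sandbox, that the NameNode web port is the Hadoop 2.x default of 50070, and that “;” and a newline are the field and row separators. Talend itself talks to the NameNode on port 8020 and does not go through WebHDFS, and the function names are hypothetical.

```python
import http.client
from urllib.parse import quote, urlencode, urlsplit


def rows_to_text(rows, field_sep=";", row_sep="\n"):
    """Serialize rows to delimited text, one row per line."""
    return "".join(field_sep.join(str(v) for v in row) + row_sep for row in rows)


def webhdfs_create_path(hdfs_path, user):
    """Build the WebHDFS request path for creating (or overwriting) a file."""
    query = urlencode({"op": "CREATE", "user.name": user, "overwrite": "true"})
    return "/webhdfs/v1" + quote(hdfs_path) + "?" + query


def write_file(host, hdfs_path, data, user="sandbox", port=50070):
    """Two-step WebHDFS upload of `data` (bytes) to `hdfs_path`."""
    # Step 1: ask the NameNode where to write; it answers with a 307 redirect.
    conn = http.client.HTTPConnection(host, port, timeout=10)
    conn.request("PUT", webhdfs_create_path(hdfs_path, user))
    resp = conn.getresponse()
    resp.read()
    conn.close()
    if resp.status != 307:
        raise RuntimeError(f"expected a 307 redirect, got {resp.status}")

    # Step 2: send the file content to the DataNode named in the Location header.
    target = urlsplit(resp.getheader("Location"))
    conn = http.client.HTTPConnection(target.hostname, target.port, timeout=10)
    conn.request("PUT", target.path + "?" + target.query, body=data)
    resp = conn.getresponse()
    resp.read()
    conn.close()
    if resp.status != 201:
        raise RuntimeError(f"upload failed with status {resp.status}")


# Example usage (placeholder IP and file name):
#   text = rows_to_text([(1, "alpha", 10.5), (2, "beta", 20.0)])
#   write_file("192.168.56.101", "/user/sandbox/test.csv", text.encode("utf-8"))
```

One thing to watch for with this approach: the redirect in step 2 usually points at the DataNode's own host name, so that name has to resolve from the machine running the script.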

Step 4 – Running the Job from Talend

  • Click on the “Run” tab and press the “Run” button

Step 5 – Viewing the file in the Hortonworks Sandbox

  • Open your web browser and enter the URL: http://<YOUR SANDBOX IP>:8000
  • Click on the File Browser icon in the top bar
  • Your file should appear in the sandbox user’s home directory
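
The same check can be scripted instead of done in the browser. The sketch below lists a directory through the WebHDFS LISTSTATUS operation and looks for the file name in the response. As before, it assumes WebHDFS is enabled on the sandbox and reachable on port 50070 (the Hadoop 2.x default), which the steps above do not require; the helper names, the example IP and the file name are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def liststatus_url(host, hdfs_dir, user, port=50070):
    """Build the WebHDFS URL that lists the contents of `hdfs_dir`."""
    query = urlencode({"op": "LISTSTATUS", "user.name": user})
    return f"http://{host}:{port}/webhdfs/v1{hdfs_dir}?{query}"


def file_names(liststatus_json):
    """Extract the names of plain files from a LISTSTATUS response body."""
    statuses = json.loads(liststatus_json)["FileStatuses"]["FileStatus"]
    return [s["pathSuffix"] for s in statuses if s["type"] == "FILE"]


def file_exists(host, hdfs_dir, name, user="sandbox"):
    """Return True if `name` is a file directly inside `hdfs_dir`."""
    with urlopen(liststatus_url(host, hdfs_dir, user), timeout=10) as resp:
        return name in file_names(resp.read().decode("utf-8"))


# Example usage (placeholder IP and file name):
#   print(file_exists("192.168.56.101", "/user/sandbox", "test.csv"))
```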

VOILA! 

You can explore Hadoop with many more tutorials in the Hortonworks Sandbox.
