Data Integration with Talend
Note: this tutorial was validated with Sandbox 1.2
Data integration is a key step in a Hadoop solution architecture. Hortonworks has partnered with Talend to offer an open source integration tool that connects Apache Hadoop to hundreds of data systems without requiring you to write code. Talend Open Studio for Big Data is an open source big data integration tool that natively supports Apache Hadoop, with connectors for the Hadoop Distributed File System (HDFS), HBase, Pig, Sqoop, and Hive.
Talend Open Studio for Big Data builds on Apache Hadoop's MapReduce architecture for distributed data processing: it generates native Hadoop code and runs data transformations directly inside Hadoop, so jobs scale with the cluster. Its graphical development environment lets you design data integration jobs visually rather than by hand-coding them.
- Get the Hortonworks Sandbox
- Download: Talend Open Studio for Big Data
- Review the Tutorial: Connect/Write a file to Sandbox from Talend Studio
Alternatively, Talend has created a fully integrated demo with the Sandbox.
These tutorials are designed to work with the Sandbox, a simple and easy way to get started with Hadoop. The Sandbox offers a full HDP environment that runs in a virtual machine.