Data Integration with Talend

Note: This tutorial was validated with Sandbox 1.2.

Introduction

Data integration is a key step in a Hadoop solution architecture. Hortonworks has partnered with Talend to provide an open source integration tool that connects Apache Hadoop to hundreds of data systems without requiring you to write code. Talend Open Studio for Big Data is an open source solution for big data integration that natively supports Apache Hadoop, including connectors for the Hadoop Distributed File System (HDFS), HBase, Pig, Sqoop, and Hive.

By leveraging Apache Hadoop’s MapReduce architecture for highly distributed data processing, Talend Open Studio for Big Data generates native Hadoop code and runs data transformations directly inside Hadoop for maximum scalability. Its easy-to-use graphical development environment dramatically improves the efficiency of data integration job design.

Get Started

  1. Get the Hortonworks Sandbox
  2. Download Talend Open Studio for Big Data
  3. Review the Tutorial: Connect/Write a file to Sandbox from Talend Studio

Try this tutorial with the Sandbox

These tutorials are designed to work with the Sandbox, a simple and easy way to get started with Hadoop. The Sandbox offers a full HDP environment that runs in a virtual machine.