HOWTO: Using Apache Sqoop for Data Import from Relational DBs

ISSUE

How do I use Apache Sqoop for importing data from a relational DB?

SOLUTION

Apache Sqoop can import data from any relational DB that provides a JDBC driver into HDFS, Hive or HBase.

To import data into HDFS, use the sqoop import command and specify the relational DB table and connection parameters:

sqoop import --connect <JDBC connection string> --table <tablename> --username <username> --password <password>

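This imports the table and stores it as comma-delimited text files (one file per map task) in an HDFS directory. By default, the directory is named after the table and is created under your HDFS home directory.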
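
As a rough sketch of the HDFS import, the script below fills in the placeholders with hypothetical values: the MySQL host, the database "sales", the table "customers" and the user "sqoop_user" are all assumptions, not values from this article. It swaps --password for -P, which prompts for the password on the console so it does not end up in the shell history, and sets --target-dir explicitly instead of relying on the default location. By default the script only prints what it would do; set RUN_IMPORT=1 on a host where sqoop and hdfs are installed to run the import.

```shell
# Hypothetical connection details; replace them with your own.
JDBC_URL="jdbc:mysql://dbhost.example.com:3306/sales"
TABLE="customers"
DB_USER="sqoop_user"

# Explicit HDFS output directory (mirrors the usual default location).
TARGET_DIR="/user/${DB_USER}/${TABLE}"

run_import() {
  # -P prompts for the database password instead of taking it as an argument.
  # --num-mappers 1 uses a single map task, so one output file is written.
  sqoop import \
    --connect "$JDBC_URL" \
    --table "$TABLE" \
    --username "$DB_USER" \
    -P \
    --target-dir "$TARGET_DIR" \
    --num-mappers 1
}

if [ "${RUN_IMPORT:-0}" = "1" ]; then
  # Run the import, then list the files it wrote to HDFS.
  run_import && hdfs dfs -ls "$TARGET_DIR"
else
  echo "Would import ${TABLE} from ${JDBC_URL} into ${TARGET_DIR}"
fi
```

Listing the target directory afterwards is a quick way to confirm that the import wrote its output files where you expected.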

To import data into Hive, use the sqoop import command and add the --hive-import option:

sqoop import --connect <JDBC connection string> --table <tablename> --username <username> --password <password> --hive-import

This will import the data into a Hive table with the appropriate data types for each column.
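
A Hive import could look something like the sketch below. The connection details and the Hive table name "sales_customers" are hypothetical. The --hive-table option names the destination table explicitly, and -P prompts for the password instead of passing it on the command line. As written, the script only prints its plan; set RUN_IMPORT=1 on a host where sqoop and hive are installed to run it.

```shell
# Hypothetical connection details; replace them with your own.
JDBC_URL="jdbc:mysql://dbhost.example.com:3306/sales"
TABLE="customers"
DB_USER="sqoop_user"

# Hypothetical name for the destination Hive table.
HIVE_TABLE="sales_customers"

run_hive_import() {
  # --hive-import loads the imported data into Hive.
  # --hive-table sets the Hive table name explicitly.
  sqoop import \
    --connect "$JDBC_URL" \
    --table "$TABLE" \
    --username "$DB_USER" \
    -P \
    --hive-import \
    --hive-table "$HIVE_TABLE" \
    --num-mappers 1
}

if [ "${RUN_IMPORT:-0}" = "1" ]; then
  # Run the import, then show the column types Hive ended up with.
  run_hive_import && hive -e "DESCRIBE ${HIVE_TABLE};"
else
  echo "Would import ${TABLE} into Hive table ${HIVE_TABLE}"
fi
```

Running DESCRIBE on the new table lets you check how the source column types were mapped to Hive types.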

Reference:

https://blogs.apache.org/sqoop/entry/apache_sqoop_overview
