HOWTO: Using Apache Sqoop for Data Import from Relational DBs

ISSUE

How do I use Apache Sqoop for importing data from a relational DB?

SOLUTION

Apache Sqoop can be used to import data from any relational DB that provides a JDBC driver into HDFS, Hive or HBase.

To import data into HDFS, use the sqoop import command and specify the relational DB table and connection parameters:

sqoop import --connect <JDBC connection string> --table <tablename> --username <username> --password <password>

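By default, this stores the data as comma-delimited text files in an HDFS directory named after the table, under your HDFS home directory. Each map task of the import writes one file.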
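
The output directory and the number of output files can also be set explicitly. As a minimal sketch, assuming a hypothetical MySQL database, table, user and HDFS path (none of these come from this article), the script below builds an import command that prompts for the password with -P, writes to a chosen directory with --target-dir, and runs a single map task with --num-mappers 1. By default it only prints the command; setting SQOOP=sqoop in the environment would run it on a host where Sqoop is installed.

```shell
#!/bin/sh
# Sketch only: every value below is a hypothetical placeholder.
set -eu

# Dry run by default: "echo sqoop" prints the command instead of running it.
# Export SQOOP=sqoop to execute the import for real.
SQOOP="${SQOOP:-echo sqoop}"

JDBC_URL="jdbc:mysql://db.example.com:3306/sales"  # hypothetical JDBC connection string
TABLE="orders"                                     # hypothetical source table
DB_USER="etl_user"                                 # hypothetical database user
TARGET_DIR="/user/etl_user/orders"                 # hypothetical HDFS output directory

hdfs_import() {
  # -P              prompt for the password instead of passing it as an argument
  # --target-dir    HDFS directory that receives the imported files
  # --num-mappers 1 use one map task, so a single output file is written
  $SQOOP import \
    --connect "$JDBC_URL" \
    --table "$TABLE" \
    --username "$DB_USER" \
    -P \
    --target-dir "$TARGET_DIR" \
    --num-mappers 1
}

hdfs_import
```

Prompting with -P keeps the password out of the shell history and the process list, which is why the sketch uses it in place of --password.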

To import data into Hive, use the sqoop import command and add the --hive-import option:

sqoop import --connect <JDBC connection string> --table <tablename> --username <username> --password <password> --hive-import

This will import the data into a Hive table with the appropriate data types for each column.
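
The Hive table takes the name of the source table unless another name is given with --hive-table. A minimal sketch of that variant follows, again assuming hypothetical connection details and a hypothetical Hive table name. It prints the command by default; setting SQOOP=sqoop in the environment would run it on a host where Sqoop and Hive are installed.

```shell
#!/bin/sh
# Sketch only: every value below is a hypothetical placeholder.
set -eu

# Dry run by default: "echo sqoop" prints the command instead of running it.
# Export SQOOP=sqoop to execute the import for real.
SQOOP="${SQOOP:-echo sqoop}"

JDBC_URL="jdbc:mysql://db.example.com:3306/sales"  # hypothetical JDBC connection string
TABLE="orders"                                     # hypothetical source table
DB_USER="etl_user"                                 # hypothetical database user
HIVE_TABLE="orders_staging"                        # hypothetical Hive table name

hive_import() {
  # -P            prompt for the password instead of passing it as an argument
  # --hive-import load the imported data into Hive
  # --hive-table  name the Hive table instead of reusing the source table name
  $SQOOP import \
    --connect "$JDBC_URL" \
    --table "$TABLE" \
    --username "$DB_USER" \
    -P \
    --hive-import \
    --hive-table "$HIVE_TABLE"
}

hive_import
```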

Reference:

https://blogs.apache.org/sqoop/entry/apache_sqoop_overview
