Sqoop Forum

Sqoop import with no primary key

  • #47530
    Swapnil Patil
    Participant

    I am importing a database from MS SQL Server 2008. I have 100 tables in the database, but only 10 of them have primary keys defined. So when I migrate the entire database with the import-all-tables option, it only migrates the tables that have a primary key defined.

    ./sqoop import-all-tables --connect "jdbc:sqlserver://w.x.y.z:1433;database=db_name" --username user_name --password **** --hive-import --hive-database db_name -m 1

    This is the error I am getting:
    ERROR tool.ImportAllTablesTool: Error during import: No primary key could be found for table xyz. Please specify one with --split-by or perform a sequential import with '-m 1'

    I don't want to migrate the tables individually. Please help.

    Thanks in advance. :)


  • #49591

    Hi Swapnil,

    Unfortunately, for Sqoop to import all tables in one command, every table needs to have a primary key.

    From the Sqoop User Guide (http://sqoop.apache.org/docs/1.4.1-incubating/SqoopUserGuide.html):

    For the import-all-tables tool to be useful, the following conditions must be met:

    Each table must have a single-column primary key.
    You must intend to import all columns of each table.
    You must not intend to use non-default splitting column, nor impose any conditions via a WHERE clause
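    If you don't want to type out each table by hand, one workaround is to script the per-table imports, falling back to a sequential import (`-m 1`) as the error message suggests. This is only a sketch, not from the thread: the `tables.txt` file holding one table name per line is a hypothetical input you would have to produce yourself (for example by querying `INFORMATION_SCHEMA.TABLES` on the SQL Server side).

    ```shell
    #!/bin/sh
    # Sketch: import every table listed in tables.txt (hypothetical file,
    # one table name per line) with a single mapper, so tables without a
    # primary key are still imported sequentially.
    while read -r table; do
      sqoop import \
        --connect "jdbc:sqlserver://w.x.y.z:1433;database=db_name" \
        --username user_name --password '****' \
        --table "$table" \
        --hive-import --hive-database db_name \
        -m 1
    done < tables.txt
    ```

    Later Sqoop releases (1.4.5 and up, if I recall correctly) also added an `--autoreset-to-one-mapper` option that makes `import-all-tables` fall back to a single mapper for tables without a primary key; it may be worth checking whether your Sqoop version supports it.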

    -Mahesh

The topic ‘Sqoop import with no primary key’ is closed to new replies.
