Sqoop Forum

Sqoop Import – Data File is imported in some binary Format

  • #51073
    Narayana Ganta


    While importing data from MySQL to HDFS, we are seeing data in some binary format in HDFS.

    Below is the format in which data is seen:
    0000000: 78 9c bc bd 7d 6f e3 48 92 3e f8 f7 dd a7 20 fa x…}o.H.>…. .
    0000010: 80 da ea 5b ca 66 66 f2 75 71 38 80 a2 28 9b 5d …[.ff.uq8..(.]
    0000020: 7a 2b 52 b2 ab fa 9f 81 a7 4a dd e3 1d b7 5d b0 z+R……J….].
    0000030: ab 66 ba 7f 87 fb ee 17 91 49 32 23 c5 24 45 c9 .f…….I2#.$E.

    Sqoop import command used:
    sqoop import --connect jdbc:mysql://myserver.com/test --username user1 --password hdp123 --table table1 --target-dir /user/test6/ --as-textfile

    Sqoop Version: 1.4.4
    HDP: 2.0

    Can you please help us get the data from MySQL into HDFS in a readable text format?

    Thank You.


  • #51083
    Narayana Ganta

    We tried importing the data into Hive, and that works fine. The data shows up correctly in the Hive table.

    Command Used:
    sqoop import --connect jdbc:mysql://civlsldhadoop1.val.cummins.com:3306/sqooptest --table intcol --username root --password hadoop --hive-import 2> error.log


    Hi Narayana,

    Can you check whether your MapReduce configuration has mapreduce.output.fileoutputformat.compress set to true? If it does, that is most likely what is causing the issue.

    You can try adding -D mapreduce.output.fileoutputformat.compress=false to the sqoop command, as shown below, and see if this helps:
    sqoop import -D mapreduce.output.fileoutputformat.compress=false --connect jdbc:mysql://myserver.com/test --username user1 --password hdp123 --table table1 --target-dir /user/test6/ --as-textfile
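
The dump in the original post starts with the bytes 78 9c, which is the usual header of a zlib stream, so it is consistent with compressed job output rather than a corrupted import. As a rough check, a part file that has been copied to the local disk can be tested for that header. This is only a sketch: the function name looks_like_zlib is made up for illustration, and it assumes the standard od and tr utilities are available.

```shell
# Hypothetical helper (not part of Sqoop or Hadoop): succeeds if the given
# local file starts with a common zlib header (78 01, 78 5e, 78 9c or 78 da).
looks_like_zlib() {
    # Read the first two bytes and print them as hex, e.g. "789c".
    header=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    case "$header" in
        7801|785e|789c|78da) return 0 ;;  # typical zlib headers
        *) return 1 ;;                    # anything else, e.g. plain text
    esac
}
```

After fetching one of the part files from /user/test6/ with hdfs dfs -get, running looks_like_zlib on the local copy should succeed if the output was compressed. In that case hdfs dfs -text, unlike hdfs dfs -cat, should be able to print the file as readable text when it carries a codec extension Hadoop recognizes.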

