Sqoop Import – Data File is imported in some binary Format

This topic contains 2 replies, has 2 voices, and was last updated by Mahesh Balakrishnan 1 year ago.

  • Creator
  • #51073

    Narayana Ganta


    While importing data from MySQL to HDFS, we are seeing data in some binary format in HDFS.

    Below is the format in which data is seen:
    0000000: 78 9c bc bd 7d 6f e3 48 92 3e f8 f7 dd a7 20 fa x…}o.H.>…. .
    0000010: 80 da ea 5b ca 66 66 f2 75 71 38 80 a2 28 9b 5d …[.ff.uq8..(.]
    0000020: 7a 2b 52 b2 ab fa 9f 81 a7 4a dd e3 1d b7 5d b0 z+R……J….].
    0000030: ab 66 ba 7f 87 fb ee 17 91 49 32 23 c5 24 45 c9 .f…….I2#.$E.
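    For what it's worth, the first two bytes of the dump above, 78 9c, are the standard zlib stream header, which is what Hadoop's DefaultCodec emits when job output compression is on. A minimal Python sketch to recognize and unpack such data (the sample record bytes are made up for illustration):

```python
import zlib

def looks_like_zlib(data: bytes) -> bool:
    # A zlib stream starts with CMF byte 0x78 (deflate, 32K window),
    # and the 16-bit header (CMF*256 + FLG) must be divisible by 31.
    return len(data) >= 2 and data[0] == 0x78 and (data[0] * 256 + data[1]) % 31 == 0

# Simulate what the import produced: zlib-compressed text rows.
compressed = zlib.compress(b"1,alice\n2,bob\n")
print(compressed[:2].hex())         # "789c", matching the hex dump above
print(looks_like_zlib(compressed))  # True
print(zlib.decompress(compressed))  # the original readable rows
```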

    Sqoop import command used:
    sqoop import --connect jdbc:mysql://myserver.com/test --username user1 --password hdp123 --table table1 --target-dir /user/test6/ --as-textfile

    Sqoop Version: 1.4.4
    HDP: 2.0

    Can you please help us get the data from MySQL in a readable text format?

    Thank You.


  • Author
  • #57758

    Mahesh Balakrishnan

    Hi Narayana,

    Can you check whether mapreduce.output.fileoutputformat.compress is enabled (set to true) in your MapReduce configuration? If it is, that is most likely what is causing the issue.

    What you can try is passing -D mapreduce.output.fileoutputformat.compress=false in the sqoop command, like below, and see if this helps:
    sqoop import -D mapreduce.output.fileoutputformat.compress=false --connect jdbc:mysql://myserver.com/test --username user1 --password hdp123 --table table1 --target-dir /user/test6/ --as-textfile
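    If you want to confirm the cluster-wide default before re-running the import, you can check mapred-site.xml for that property. A minimal Python sketch, using a made-up XML snippet in place of your real /etc/hadoop/conf/mapred-site.xml:

```python
import xml.etree.ElementTree as ET

# Stand-in for the cluster's mapred-site.xml (assumption for illustration).
MAPRED_SITE = """
<configuration>
  <property>
    <name>mapreduce.output.fileoutputformat.compress</name>
    <value>true</value>
  </property>
</configuration>
"""

def output_compression_enabled(xml_text: str) -> bool:
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == "mapreduce.output.fileoutputformat.compress":
            return prop.findtext("value", "").strip().lower() == "true"
    # Hadoop's shipped default for this property is false.
    return False

print(output_compression_enabled(MAPRED_SITE))  # True
```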


    Narayana Ganta

    We tried importing the data into Hive and it is working fine. The data shows up correctly in the Hive table.

    Command Used:
    sqoop import --connect jdbc:mysql://civlsldhadoop1.val.cummins.com:3306/sqooptest --table intcol --username root --password hadoop --hive-import 2> error.log
