importtsv issue


This topic contains 1 reply, has 2 voices, and was last updated by Robert 1 year, 8 months ago.

  • Creator
    Topic
  • #30516

    yan kun
    Member

    I use importtsv like this:
    hadoop jar /opt/module/hbase-0.94.6-cdh4.3.0/hbase-0.94.6-cdh4.3.0-security.jar importtsv -Dimporttsv.bulk.output=/data_hfile/output -Dimporttsv.columns=HBASE_ROW_KEY,s:STATION,s:YEAR,s:MONTH,s:DAY,s:HOUR,s:MINUTE,s:ODATE,s:LDATE,s:LTIME,s:CCCC,s:LATITUDE,s:LONGITUDE,s:ELEVATION………… -Dimporttsv.separator=, data_rk /data_split/data_rk_2_4bw
    The input file data_rk_2_4bw is about 600 MB,
    but the result is many files totaling about 15 GB.

Viewing 1 replies (of 1 total)


  • Author
    Replies
  • #30588

    Robert
    Participant

    Hi Yan,
    Which Hadoop distribution are you using? Also, can you clarify what the issue is?

    Regards,
    Robert
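
    One possible contributor to the size difference described above, offered as a rough sketch rather than a diagnosis: HBase stores every cell as a separate KeyValue that repeats the row key, column family, and qualifier and adds a timestamp and length fields, so a wide CSV row can expand several times over once it is written as HFiles. The Python below estimates that expansion for a single line. The function names, the sample row, and the shortened column list are hypothetical (the real column list in the post is truncated), and the estimate ignores HFile indexes, block encoding, and compression.

```python
# Fixed bytes in one HBase KeyValue, excluding the variable-length parts:
# key length (4) + value length (4) + row length (2) + family length (1)
# + timestamp (8) + key type (1)
KEYVALUE_FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1


def keyvalue_size(row_key: bytes, family: bytes, qualifier: bytes, value: bytes) -> int:
    """Approximate serialized size in bytes of a single cell."""
    return (KEYVALUE_FIXED_OVERHEAD + len(row_key) + len(family)
            + len(qualifier) + len(value))


def estimate_blowup(line: str, columns: list, separator: str = ",") -> float:
    """Ratio of estimated KeyValue bytes to the bytes of the input line."""
    fields = line.split(separator)
    row_key = fields[columns.index("HBASE_ROW_KEY")].encode()
    total = 0
    for spec, value in zip(columns, fields):
        if spec == "HBASE_ROW_KEY":
            continue  # the row key is not stored as a cell of its own
        family, qualifier = spec.split(":", 1)
        total += keyvalue_size(row_key, family.encode(),
                               qualifier.encode(), value.encode())
    # Count the trailing newline as part of the input line.
    return total / len((line + "\n").encode())


if __name__ == "__main__":
    # Hypothetical sample: only the first three columns from the post.
    columns = ["HBASE_ROW_KEY", "s:STATION", "s:YEAR", "s:MONTH"]
    line = "54511_201301010000,54511,2013,01"
    print(f"estimated expansion: {estimate_blowup(line, columns):.1f}x")
```

    If the estimate for a real input line is large, shorter row keys and qualifiers, or compression on the column family, are the usual ways to reduce the output size.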
