Hortonworks Sandbox Forum

Hadoop fs

  • #55985
    Jan Leonhard


    I’ve created a Hive Table EDW_RETAIL…

    I added some data with:

    LOAD DATA INPATH '/user/hue/BACK_130530_0809_EDW_RETAIL_BEWEGUNGEN_PART53.DEL' INTO TABLE default.edw_retail_bewegungen;
    LOAD DATA INPATH '/user/hue/BACK_130530_0809_EDW_RETAIL_BEWEGUNGEN_PART54.DEL' INTO TABLE default.edw_retail_bewegungen;
    LOAD DATA INPATH '/user/hue/BACK_130530_0809_EDW_RETAIL_BEWEGUNGEN_PART55.DEL' INTO TABLE default.edw_retail_bewegungen;
    … etc.

    This worked fine: the table grows with every file added.

    So I tried to find out how HDFS/Hive manages the table in the Hadoop file system.

    But when I try this on the console:

    hadoop fs -ls

    the command executes successfully, but I receive no output.

    Any advice on how I can find out whether the table is stored in one Hadoop file?

    best regards



  • #56614

    Hi Jan,
    Can you try the following: hadoop fs -ls /user/hue


    Jan Leonhard

    Hi Ian!

    That worked fine! Thank you very much!

    Why don't I get a result without giving a path? Shouldn't I get the contents of the highest level?

    best wishes



    Hi Jan,
    Running hadoop fs -ls with no path assumes the HDFS home directory of the user you are logged in as. For example, if I am logged in as root, it would show me the contents of /user/root.

    To get the highest level, you would need to execute hadoop fs -ls /
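
The path rule described above can be sketched in a few lines of Python. This is only an illustration, not Hadoop's own code: the function name `resolve_hdfs_path` is hypothetical, and it assumes the default home directory prefix `/user`.

```python
import posixpath


def resolve_hdfs_path(path: str = "", user: str = "root") -> str:
    """Sketch of how `hadoop fs` interprets a path argument.

    Assumption: the HDFS home directory is /user/<user> (the default prefix).
    """
    home = posixpath.join("/user", user)
    if not path:
        # No path given: the command falls back to the user's home directory.
        return home
    # Absolute paths are used as given; relative paths start at the home directory.
    return posixpath.normpath(posixpath.join(home, path))


print(resolve_hdfs_path())             # hadoop fs -ls            (as root)
print(resolve_hdfs_path("/user/hue"))  # hadoop fs -ls /user/hue
print(resolve_hdfs_path("/"))          # hadoop fs -ls /
```

Under these assumptions the three calls resolve to /user/root, /user/hue and / respectively, which is why listing with no path shows nothing when the files actually live under /user/hue.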


    Jan Leonhard

    Hello again :)

    Thanks again! That worked fine, too.

    I have now found that a Hive table does not get a single Hadoop file of its own; my table is stored like this:

    Permission  Owner  Group  Size      Replication  Block Size  Name
    -rwxrwxrwx  hue    hdfs   17.78 KB  1            32.63 MB    BACK_130530_0756_EDW_RETAIL_BEWEGUNGEN_PART51.DEL
    -rwxr-xr-x  hue    hue    3.21 MB   1            32.63 MB    BACK_130530_0804_EDW_RETAIL_BEWEGUNGEN_PART52.DEL
    -rwxr-xr-x  hue    hue    30.06 MB  1            32.63 MB    BACK_130530_0809_EDW_RETAIL_BEWEGUNGEN_PART53.DEL
    -rwxr-xr-x  hue    hue    39.41 MB  1            32.63 MB    BACK_130530_0813_EDW_RETAIL_BEWEGUNGEN_PART54.DEL
    -rwxr-xr-x  hue    hue    23.78 MB  1            32.63 MB    BACK_130530_0819_EDW_RETAIL_BEWEGUNGEN_PART55.DEL

    So the data is stored in 6 blocks (the 39.41 MB file needs two), although 3 would be enough for the roughly 96.5 MB in total.
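
The block count can be checked with a short calculation. This is a minimal sketch using the sizes and the 32.63 MB block size from the listing above; the helper name `blocks_needed` is made up for illustration, and the KB value is converted assuming 1 MB = 1024 KB.

```python
import math

BLOCK_SIZE_MB = 32.63  # block size shown in the listing

# File sizes from the listing, in MB (PART51 is 17.78 KB).
file_sizes_mb = [17.78 / 1024, 3.21, 30.06, 39.41, 23.78]


def blocks_needed(size_mb: float, block_mb: float = BLOCK_SIZE_MB) -> int:
    """Number of blocks a file of the given size spans (at least one)."""
    return max(1, math.ceil(size_mb / block_mb))


per_file = [blocks_needed(size) for size in file_sizes_mb]
total_blocks = sum(per_file)                        # blocks used by separate files
merged_blocks = blocks_needed(sum(file_sizes_mb))   # blocks if it were one file

print(per_file, total_blocks, merged_blocks)  # prints: [1, 1, 1, 2, 1] 6 3
```

Note that a partially filled HDFS block only takes up as much disk space as the data it holds, so the extra blocks cost additional metadata rather than wasted storage.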

    Why doesn't a Hive table create one new Hadoop file for itself and append new data to the end of that file? I think I read that appending is possible in Hadoop 2.0. Is that right?

    highest regards

