user directory in sandbox


This topic contains 1 reply, has 2 voices, and was last updated by  Pramod Thangali 2 years, 6 months ago.

  • Creator
  • #14332

    Kar Son

    Hi, I am trying to find where the MapReduce input files are located in the sandbox. Is there a way to find out where the /user/… directory shown in the sandbox UI maps to in the actual sandbox? I also couldn’t find what $output_directory maps to. Any help is really appreciated.

Viewing 1 reply (of 1 total)


  • Author
  • #14363

    Pramod Thangali


    Can you provide more details on what you are trying to do? All file/directory references in a job specification refer to files and directories on HDFS.

    Assuming you are referring to the job specification under Job Designer:
    – mapred.input.dir points to /user/hue/jobsub/sample_data on HDFS.
    – $output_directory is interpreted as a parameter that you are prompted for when you submit the job. The current user (in this case, sandbox) must have write permission on this directory.

    As an example:
    – Go to Job Designer.
    – Clone streaming_wordcount.
    – Change the name of the job to, say, my_streaming_wordcount (or any other name of your choice).
    – Click Save at the bottom of the screen.
    – Once you are back on the Job Designer page, select my_streaming_wordcount.
    – Click Submit; it will prompt you for the output directory.
    – Type /tmp/output1, for example.
    When the job is finished, you can open this directory in the File Browser and see the results.
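The streaming wordcount job cloned in the steps above is a standard Hadoop Streaming map/reduce pair. A minimal illustrative sketch in Python (this is not the sandbox's actual sample script; `mapper` and `reducer` are assumed names, and Hadoop would run each phase over stdin/stdout):

```python
from itertools import groupby

def mapper(lines):
    # Map phase: emit one tab-separated (word, 1) pair per word,
    # the key/value line format Hadoop Streaming expects on stdout.
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(lines):
    # Reduce phase: records arrive sorted by key, so consecutive
    # lines for the same word can be summed with groupby.
    pairs = (line.rsplit("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

if __name__ == "__main__":
    # Hadoop inserts a shuffle/sort between the two phases;
    # sorted() simulates that step here.
    sample = ["hello world", "hello hadoop"]
    for record in reducer(sorted(mapper(sample))):
        print(record)
```

In a real run, the framework would write the reducer's output into the directory given for $output_directory, which is what the File Browser then shows.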

    Hope this helps….
