HDP on Linux – Installation Forum

Hdfs command goes to cluster, Pig command goes to local filesystem ??

  • #53725
    Niels Basjes


    I have just installed HDP 2.1.2 on my systems and set up the environment almost the same way as I had it with CDH 5.0.0.
    That is, environment settings like $HADOOP_CONF_DIR and $HADOOP_MAPRED_HOME have been set up correctly (as far as I can tell).
    So if I do:
    hdfs dfs -ls -R /
    I see the files that are on my HDFS.

    But when I start Pig:
    $ pig
    2014-05-15 10:41:55,810 [main] INFO org.apache.pig.Main - Apache Pig version (rexported) compiled Apr 27 2014, 16:49:54

    2014-05-15 10:41:56,284 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///

    it connects to the local filesystem (file:///) instead of HDFS. When I do this:

    $ pig -printCmdDebug
    Find hadoop at /usr/bin/hadoop
    dry run:
    HADOOP_CLASSPATH: /home/nbasjes/hadoop-environment-configs/kluster-hdp21:/usr/lib/pig/bin/../conf:/usr/java/default/lib/tools.jar:/etc/hadoop/conf:/usr/lib/pig/bin/../lib/avro-1.7.4.jar:/usr/lib/pig/bin/../lib/avro-ipc-1.7.4-tests.jar:/usr/lib/pig/bin/../lib/avro-mapred-1.7.4.jar:/usr/lib/pig/bin/../lib/jruby-complete-1.6.7.jar:/usr/lib/pig/bin/../lib/json-simple-1.1.jar:/usr/lib/pig/bin/../lib/jython-standalone-2.5.3.jar:/usr/lib/pig/bin/../lib/piggybank.jar:/usr/lib/pig/bin/../pig-
    HADOOP_CLIENT_OPTS: -Xmx1000m -Dpig.log.dir=/usr/lib/pig/bin/../logs -Dpig.log.file=pig.log -Dpig.home.dir=/usr/lib/pig/bin/..
    /usr/bin/hadoop jar /usr/lib/pig/bin/../pig-

    OK, so I manually set these two variables (HADOOP_CLASSPATH and HADOOP_CLIENT_OPTS) and ran the command at the bottom:
    $ /usr/bin/hadoop jar /usr/lib/pig/bin/../pig-

    2014-05-15 10:36:00,329 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://node1.kluster.basjes.lan/

    What am I doing wrong???
    Why doesn’t Pig go to my HDFS when I run the normal script??

  • #53729
    Niels Basjes

    The best guess I have right now is that BOTH my HADOOP_CONF_DIR and /etc/hadoop/conf are on the classpath.
    After I removed all files from /etc/hadoop/conf, it suddenly worked.
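
For anyone who wants to check for the same situation, here is a minimal sketch of how to see which classpath entries carry their own core-site.xml. The function name list_conf_dirs is hypothetical (it is not part of Pig or Hadoop), and it assumes the filesystem setting comes from a core-site.xml sitting directly inside a classpath directory. It only lists candidates; it does not prove which file wins.

```shell
# Hypothetical helper, not part of Pig or Hadoop.
# Prints, in classpath order, every entry that contains a core-site.xml.
# Runs in a subshell so the IFS and globbing changes do not leak out.
list_conf_dirs() (
    IFS=':'     # split the argument on colons
    set -f      # do not expand wildcard entries such as /some/lib/*
    for entry in $1; do
        if [ -f "$entry/core-site.xml" ]; then
            printf '%s\n' "$entry"
        fi
    done
)
```

To use it, pass the value of the HADOOP_CLASSPATH line printed by `pig -printCmdDebug` as the single argument, for example `list_conf_dirs "$HADOOP_CLASSPATH"` after setting that variable by hand. If more than one directory is printed, there is more than one client configuration on the classpath, which matches the guess above.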
