
HDP on Windows – Other Forum

Getting Hue running with a Windows cluster

  • #50487
    Steve D

    I’m trying to install Hue on a Linux server, interacting with an HDP on Windows cluster.
    The FileBrowser, HCatalog and Beeswax apps seem to be running, but I can’t get Pig running.

    1. This is for Hue 2.3, which comes with Hortonworks HDP 2
    2. On the cluster nodes, configure webhdfs, webhcat and Oozie as per the Hortonworks docs
    a. Additionally, in oozie-site.xml on the cluster nodes, change the oozie.service.WorkflowAppService.system.libpath
    value. Set it to /user/oozie/share/lib rather than /user/${}/share/lib, since on Windows the services run as the “hadoop” user (otherwise Hue complains that the Oozie Share Lib is not in the default location)
    b. On one of the cluster nodes, install the Oozie Shared libs into the hdfs filesystem
    Command line is:
    C:\hdp\oozie-\oozie-win-distro\share\lib>hdfs dfs -put * /user/oozie/share/lib/
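
    The oozie-site.xml change from step 2a, as a sketch of the property block (values as described above):

    ```xml
    <!-- oozie-site.xml on each cluster node (step 2a) -->
    <property>
      <name>oozie.service.WorkflowAppService.system.libpath</name>
      <value>/user/oozie/share/lib</value>
    </property>
    ```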

    3. On the Hue server, install Hue, Pig, HCatalog, Hive and HBase (Pig needs HCatalog and Hive; Hue needs Pig and HBase)
    a. Configure Hue webserver as per the Hortonworks docs (skipped SSL in our case)
    b. Configure Hadoop/Yarn/Beeswax as per Hortonworks docs
    Note: The Hue server will also be the Beeswax server, so it should bind to its FQDN DNS name
    d. Configure Pig as per the Hortonworks docs. Make sure JAVA_HOME is set in /etc/pig/conf/
    In environment variables, set HCAT_HOME to the same value as HCATALOG_HOME.
    e. Also, copy the hive-site.xml from one cluster node to /etc/hive/conf/ on the Hue server
    Edit hive_conf_dir in hue.ini to point to /etc/hive/conf/ so it can find hive-site.xml
    f. Configure Oozie, UserAdmin and WebHCat for Hue as per the Hortonworks docs
    g. Create a /etc/hadoop/conf/core-site.xml on the hue server
    Set the hadoop.tmp.dir (should be a directory on the hue server) and fs.defaultFS (should be set to hdfs://<namenode>:8020)
    h. Create /etc/hadoop/conf/hdfs-site.xml on the hue server
    Set dfs.namenode.http-address to point to the HTTP address of the namenode (<namenode>:50070)
    i. Edit /etc/hadoop/conf/ on the Hue server
    Set the JAVA_HOME to point to your jvm installation (e.g. /usr/lib/jvm/java-1.7.0-openjdk.x86_64)
    j. Patch /usr/lib/hue/desktop/libs/hadoop/src/hadoop/fs/ to set the superuser to ‘hadoop’, as per the hue-user group thread IR76qQnYQB4
    (make sure the cluster is started on windows using the hadoop user!)
    4. Make sure the hdfs permissions are set to allow hadoop user to be the owner at root level
    5. Start Hue
    6. When you first log into Hue, name the admin user ‘hadoop’ so it has write access to the Hive metastore (since the ‘hadoop’ user in the Windows cluster has write access to /hive/warehouse in HDFS).
    7. At this point, Hive + HCatalog are working.
    8. Pig on the Linux box runs from a script or from the grunt console
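
    For steps 3g and 3h above, a minimal sketch of the two files on the Hue server might look like this (the namenode hostname and tmp directory are placeholders for your own values):

    ```xml
    <!-- /etc/hadoop/conf/core-site.xml on the Hue server (sketch) -->
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://namenode.example.com:8020</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/tmp/hue-hadoop</value>
      </property>
    </configuration>
    ```

    ```xml
    <!-- /etc/hadoop/conf/hdfs-site.xml on the Hue server (sketch) -->
    <configuration>
      <property>
        <name>dfs.namenode.http-address</name>
        <value>namenode.example.com:50070</value>
      </property>
    </configuration>
    ```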

    Pig jobs started from Hue simply output the “pig --help” output, and the JobBrowser can’t find the job. <urlopen error [Errno -2] Name or service not known>

  • #50493
    Steve D

    Pig works when run on one of the datanodes (or master node)

    In the Hue server logs I can see the pig job being invoked
    [24/Mar/2014 21:11:19 +0000] views DEBUG User hadoop started pig job via webhcat: curl -s -d file=/tmp/.pigjobs/hadoop/tut1_sd_1395655879/script.pig -d statusdir=/tmp/.pigjobs/hadoop/tut1_sd_1395655879 -d callback=$jobId/ -d arg=-useHCatalog

    And then if I look in the statusdir, the ‘exit’ file just has ‘7’ in it, the stderr file is empty, and the stdout has the “pig --help” output as described above.

    So how can I debug why invoking Pig from Hue isn’t working, when it works from within the Windows cluster and also when called from the console on the Hue server?

    Steve D

    I’ve made some progress.
    I can see the Pig job being queued and executed as a YARN application, but it still fails with exit code 7.
    Within the log I can see the call to pig.cmd, but I’m not 100% sure what might be wrong with it.

    The log of the Map phase is here

    Steve D

    Success!

    In the map step syslog I could see:
    2014-03-25 22:14:07,331 INFO [main] org.apache.hive.hcatalog.templeton.tool.TrivialExecService: Starting cmd: [cmd, /c, call, C:\hdp\\pig-, -D"mapreduce.job.credentials.binary=/c:/hadoop/data/hadoop/local/usercache/hadoop/appcache/application_1395717154689_0002/container_1395717154689_0002_01_000002/container_tokens"="-useHCatalog", -file, script.pig]

    The = just before -useHCatalog looked suspicious.
    If I manually ran that command line on the datanode (replacing the commas between args with spaces), it wouldn’t work.
    But it did work once I removed the ="-useHCatalog" portion.
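
    As an aside, reconstructing a pasteable command line from the TrivialExecService syslog line (by joining the comma-separated argv items with spaces, as described above) can be sketched like this; the helper name and the truncated example path are my own:

    ```python
    def log_args_to_cmdline(logged: str) -> str:
        """Turn a templeton 'Starting cmd: [a, b, c]' argv list into 'a b c'.

        Note: this naive split breaks if an argument itself contains ', '.
        """
        inner = logged.strip().lstrip('[').rstrip(']')
        return ' '.join(part for part in inner.split(', ') if part)

    # Hypothetical example in the shape of the syslog line above:
    print(log_args_to_cmdline('[cmd, /c, call, C:\\hdp\\pig\\bin\\pig.cmd, -file, script.pig]'))
    ```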

    So, the final workaround is this:
    On the Windows nodes, edit C:\hdp\pig-\bin\pig.cmd so that Pig always uses HCatalog.
    Do this by copying the set HCAT_FLAG="true" line to immediately after the line set PIGARGS=
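
    A sketch of what the edited section of pig.cmd might look like afterwards (line placement per the description above; the exact version in the path will vary, and the rest of the file is unchanged):

    ```bat
    :: In C:\hdp\pig-<version>\bin\pig.cmd (sketch of the workaround)
    :: Immediately after PIGARGS is initialised, force HCatalog on:
    set PIGARGS=
    set HCAT_FLAG="true"
    ```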

    Then, when running pig scripts from hue, remove the -useHCatalog from the Pig argument when submitting the script (just below the script window)

    Pig works in Hue now, but I suspect that submitting a Pig job with other arguments may also fail.
    So it appears the problem may lie in how Templeton concatenates the Pig arguments together with the credentials argument.

    Comments from Hortonworks on whether this is a correct fix (or has other ramifications) would be appreciated,
    and whether future HDP on Windows releases will address this.


    Hi Steve,

    I’m glad you got this working; it has been in the pipeline for me to test out, so I’m keen to give it a go.



    Steve D


    In the other thread you said a mixed Win/Linux cluster wouldn’t be a supported case.

    In this instance the Hue server is not an active part of the cluster; would that be supported?

    Any thoughts on why the Pig cmd-line args are getting messed up?
    That feels like a bug somewhere.



    Hi Steve,

    Yes that’s correct.
    Basically, if you have any issues with Hue (as it hasn’t been tested on Windows) we would not be able to raise bugs or create fixes.
    That’s not to say it won’t work, but the cluster would fall under support as a Windows cluster, if you are a paying support customer.



The topic ‘Getting Hue running with a Windows cluster’ is closed to new replies.
