Hue Forum

Pig script got stuck @ dump

  • #49831
    Stanley Nguyen
    Participant

    I think this has something to do with running the Pig script from Hue. In the sandbox environment it works fine, but with a multi-host configuration it gets stuck at the dump statement. If I comment out “dump”, it works. The same script runs flawlessly from Grunt. I couldn’t find any errors in the logs. I would appreciate it if someone could point me to where to look.

    Thanks,

    Stan

  • #49842
    Dave
    Moderator

    Hi Stan,

    Could you paste your script here so I can run it against my cluster?
    How big is the data you are dumping?

    Thanks

    Dave

    #49849
    Stanley Nguyen
    Participant

    Thanks, Dave. I simply followed the tutorials (both the stock and the baseball statistics ones). Anyway, this is the script:

    a = LOAD 'stocks' using org.apache.hcatalog.pig.HCatLoader();
    b = filter a by stock_symbol == 'IBM';
    c = group b all;
    d = foreach c generate AVG(b.stock_volume);
    dump d;

    In the Job Browser, the job stops at 5%. Thanks.

    #49858
    Dave
    Moderator

    Hi Stan,

    What version of Hue are you running?
    rpm -qa | grep hue

    I’m seeing the same issue from Hue, but in the Grunt shell it works fine.

    Thanks

    Dave

    #49868
    Stanley Nguyen
    Participant

    Hi Dave,

    These are the versions on my three-machine setup:

    -bash-4.1$ rpm -qa | grep hue
    hue-2.3.0.2.0.6.0-102.el6.x86_64
    hue-server-2.3.0.2.0.6.0-102.el6.x86_64
    hue-beeswax-2.3.0.2.0.6.0-102.el6.x86_64
    hue-oozie-2.3.0.2.0.6.0-102.el6.x86_64
    hue-hcatalog-2.3.0.2.0.6.0-102.el6.x86_64
    hue-shell-2.3.0.2.0.6.0-102.el6.x86_64
    hue-pig-2.3.0.2.0.6.0-102.el6.x86_64
    hue-common-2.3.0.2.0.6.0-102.el6.x86_64

    On the sandbox VM that I downloaded, the script works fine, and it has a newer build of Hue (171 versus 102):
    [root@sandbox ~]# rpm -qa | grep -i hue
    hue-common-2.3.0.2.0.6.0-171.el6.x86_64
    hue-oozie-2.3.0.2.0.6.0-171.el6.x86_64
    hue-sandbox-1.2.1-174.noarch
    hue-tutorials-1.2.1-174.noarch
    hue-beeswax-2.3.0.2.0.6.0-171.el6.x86_64
    hue-pig-2.3.0.2.0.6.0-171.el6.x86_64
    hue-server-2.3.0.2.0.6.0-171.el6.x86_64
    hue-2.3.0.2.0.6.0-171.el6.x86_64
    hue-hcatalog-2.3.0.2.0.6.0-171.el6.x86_64
    hue-shell-2.3.0.2.0.6.0-171.el6.x86_64

    #49912
    Stanley Nguyen
    Participant

    Hi Dave,

    I made it work, though I’m not sure why :). I’m fairly new to HDP. As I mentioned previously, I have a three-machine cluster, but only one NodeManager was running, on host #3. I added a NodeManager to the other two hosts and it works now. Hopefully you or someone else can explain the root cause. Technically it should have worked.

    Thanks
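
A possible explanation, offered as an assumption rather than something confirmed in this thread: unlike Grunt, Hue typically hands the script to a job-submission service that runs it inside a launcher job, and that launcher occupies a YARN container of its own. Pig only starts a MapReduce job when it reaches a dump or store, so with a single NodeManager and limited memory the launcher may hold the capacity that the spawned job is waiting for. That would fit both the stall at a low percentage and the script finishing once "dump" is commented out. A minimal sketch for checking how many NodeManagers are actually running is below; the helper name count_running_nodes is hypothetical, and the column layout it parses is assumed from typical `yarn node -list` output.

```shell
# Hypothetical helper: counts nodes reported in the RUNNING state.
# Assumes the node state is the second whitespace-separated column,
# as in typical `yarn node -list` output.
count_running_nodes() {
  awk '$2 == "RUNNING" { n++ } END { print n + 0 }'
}

# Usage on a cluster host (not executed here):
#   yarn node -list 2>/dev/null | count_running_nodes
```

If this prints 1 on a multi-host cluster, only one host can run containers, and comparing that node's memory with the container sizes requested by the launcher and the MapReduce job would be a reasonable next step.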
