Sandbox - Pig Basic Tutorial example is not working

This topic contains 25 replies, has 14 voices, and was last updated by Larry Liu 2 weeks ago.

  • Creator
    Topic
  • #17798

    Sankarg
    Member

    Hi, I just tried the following Pig Basic Tutorial script, which is not working:

    a = LOAD 'nyse_stocks' USING org.apache.hcatalog.pig.HCatLoader();
    b = FILTER a BY stock_symbol == 'IBM';
    c = group b all;
    d = FOREACH c GENERATE AVG(b.stock_volume);
    dump d;

    When I tried the syntax check, the following log was captured:

    2013-03-17 14:35:28,456 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.1.21 (rexported) compiled Jan 10 2013, 04:00:42
    2013-03-17 14:35:28,459 [main] INFO org.apache.pig.Main - Logging error messages to: /home/sandbox/hue/pig_1363556128447.log
    2013-03-17 14:35:41,945 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
    2013-03-17 14:35:45,555 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    Details at logfile: /home/sandbox/hue/pig_1363556128447.log

    Please help me resolve this issue. Thank you!

    Regards,
    Sankar

Viewing 15 replies - 1 through 15 (of 25 total)


  • Author
    Replies
  • #24682

    Larry Liu
    Moderator

    Hi Craig,

    Thanks for trying it and letting us know it worked.

    Please let us know about any issues you run into.

    Larry

    #24582

    Craig Rowley
    Member

    I was receiving the error on a newly downloaded HDP Sandbox (downloaded today) when trying to execute tutorial 1, check its syntax, or explain it.
    (Using Mac OS X 10.6.8, VirtualBox 4.2.4.)

    I used Terminal to ssh into the VM (using the IP from the URL and root/hadoop as credentials).

    I applied the fix to /usr/bin/pig (adding -useHCatalog to the last line).

    It worked, as far as I can tell.

    #23758

    yi zhang
    Member

    Hi Adam,

    There is no /etc/bin/pig. Do you mean /usr/lib/pig/bin/pig or /usr/bin/pig?

    Try Peter’s suggestion:

    sed -i '49s/.*/includeHCatalog=true;/' /usr/lib/pig/bin/pig

    This will load HCatalog automatically when pig is started.

    Thanks,
    Yi

    #23576

    Has anyone found a resolution to this issue? It looks like we get it first with HCat, then with Pig. It must be a config problem, but I can’t even get to the directory recommended above. My /etc/bin/ does not have a pig in it on the newest Sandbox (downloaded on 4/26/13). I’m happy to try recommendations if folks can reply. I’ll post the results here.

    #21866

    Larry Liu
    Moderator

    Hi Mehran,

    Great to hear that you got it working.

    Happy learning, everyone ;)

    Larry

    #21775

    I had IBM in double quotes instead of single quotes. After changing to single quotes, I do get a result. Thanks to Ron at Hortonworks, who shared his code with me.

    Thanks everyone!
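
Pig Latin string constants are written in straight single quotes, which is why a double-quoted literal triggers ERROR 1200 in this thread. The shell sketch below is one way to spot such lines before running a script. It is only a plain text search, so it will also flag quotes inside comments. `find_bad_quotes` and `script.pig` are hypothetical names, and the check for typographic quotes rests on the assumption that scripts pasted from a web page may contain them.

```shell
#!/bin/sh
# find_bad_quotes: hypothetical helper, not part of Pig or the Sandbox.
# Prints (with line numbers) every line of a Pig script that contains a
# straight double quote or a typographic quote. Pig expects string
# constants in straight single quotes, e.g. 'IBM'.
find_bad_quotes() {
    # -F treats each pattern as a fixed string; -n adds line numbers.
    grep -n -F -e '"' -e '“' -e '”' -e '‘' -e '’' "$1"
}

# Example (script.pig is a placeholder file name):
#   find_bad_quotes script.pig
```

If the function prints nothing, no suspicious quotes were found; grep then exits with a non-zero status.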

    #21763

    Larry,
    Thanks a lot for your response. I actually did that, as also mentioned in another post, and restarted the host, but nothing happens when executing, and I get the error below when checking the syntax. Any idea what’s going on?
    I appreciate your help,
    Mehran

    2013-04-15 12:12:28,030 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.1.21 (rexported) compiled Jan 10 2013, 04:00:42
    2013-04-15 12:12:28,031 [main] INFO org.apache.pig.Main - Logging error messages to: /home/sandbox/hue/pig_1366053148027.log
    2013-04-15 12:12:32,601 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
    2013-04-15 12:12:35,471 [main] WARN org.apache.hadoop.hive.conf.HiveConf - DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
    2013-04-15 12:12:35,897 [main] INFO hive.metastore - Trying to connect to metastore with URI thrift://sandbox:9083
    2013-04-15 12:12:36,485 [main] INFO hive.metastore - Waiting 1 seconds before next connection attempt.
    2013-04-15 12:12:37,485 [main] INFO hive.metastore - Connected to metastore.
    Unexpected character '"'
    2013-04-15 12:12:41,153 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: Unexpected character '"'
    Details at logfile: /home/sandbox/hue/pig_1366053148027.log

    Here is the code I am running:
    a = LOAD 'nyse_stocks' USING org.apache.hcatalog.pig.HCatLoader();
    b = FILTER a BY stock_symbol=="IBM";
    c = group b all;
    d = FOREACH c GENERATE AVG(b.stock_volume);
    dump d;

    #21740

    Larry Liu
    Moderator

    Hi, Mehran

    Here is the workaround:

    1. Log in to the Sandbox as root.
    2. Edit the file /usr/bin/pig.
    3. Replace the last line with the following line:
    exec /usr/lib/pig/bin/pig -useHCatalog "$@"

    Larry
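
The workaround in the post above can also be scripted. This is a minimal sketch, assuming a POSIX-style shell with `mktemp` available and that the last line of the wrapper is the one that launches Pig, as the workaround implies. `add_hcatalog_flag` is a hypothetical helper name, not something shipped with the Sandbox. It keeps a backup with a `.bak` suffix, and it is worth trying on a copy of the file first.

```shell
#!/bin/sh
# add_hcatalog_flag: hypothetical helper. Replaces the last line of a
# Pig wrapper script with an exec line that passes -useHCatalog.
add_hcatalog_flag() {
    wrapper="$1"
    tmp="$(mktemp)"
    # Keep a backup next to the original before changing anything.
    cp "$wrapper" "$wrapper.bak"
    # Copy everything except the last line, then append the new exec line.
    sed '$d' "$wrapper" > "$tmp"
    printf '%s\n' 'exec /usr/lib/pig/bin/pig -useHCatalog "$@"' >> "$tmp"
    # Write back through redirection so the wrapper keeps its mode and owner.
    cat "$tmp" > "$wrapper"
    rm -f "$tmp"
}

# Example (run as root on the Sandbox):
#   add_hcatalog_flag /usr/bin/pig
```

Restoring the original is then a matter of copying the `.bak` file back over the wrapper.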

    #21572

    What was the resolution for this problem? I just encountered the same issue.

    Thanks

    #19759

    tedr
    Member

    Hi Kirill,

    Thanks for letting us know.

    Ted.

    #19560

    Kirill Wood
    Member

    So I found out that UDF names are case-sensitive, which means the correct syntax is all caps:
    MAX(runs.runs), not max
    AVG(b.stock_volume), not avg
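
Built-in function names such as AVG and MAX must be written exactly as defined, whereas Pig keywords such as FOREACH or group are not case-sensitive. A quick text search can catch lowercase calls before Pig reports ERROR 1070. The sketch below is only a heuristic: `find_lowercase_builtins` is a hypothetical helper, and the function list (AVG, MAX, MIN, SUM, COUNT) is an illustrative subset of Pig's built-ins, not a complete one.

```shell
#!/bin/sh
# find_lowercase_builtins: hypothetical helper, not part of Pig.
# Prints (with line numbers) lines that call a common built-in function
# in lowercase, e.g. avg(...) instead of AVG(...).
find_lowercase_builtins() {
    # The leading group keeps identifiers such as my_max( from matching.
    grep -n -E '(^|[^A-Za-z0-9_])(avg|max|min|sum|count)[[:space:]]*\(' "$1"
}

# Example (script.pig is a placeholder file name):
#   find_lowercase_builtins script.pig
```

Only all-lowercase spellings are flagged; mixed-case variants such as Avg( would need an extra pattern.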

    #19414

    tedr
    Member

    Hi Stephen,

    Have you tried any of the fixes mentioned earlier in this thread?

    Thanks,
    Ted.

    #19300

    I’ve downloaded the Sandbox today, and followed the instructions on the left-hand side of the Web browser. I have exactly the same problem as the one documented in this thread.

    2013-03-28 09:30:42,524 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.1.21 (rexported) compiled Jan 10 2013, 04:00:42
    2013-03-28 09:30:42,524 [main] INFO org.apache.pig.Main - Logging error messages to: /home/sandbox/hue/pig_1364488242522.log
    2013-03-28 09:30:42,785 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: file:///
    2013-03-28 09:30:43,040 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    Details at logfile: /home/sandbox/hue/pig_1364488242522.log

    #18975

    Sasha J
    Moderator

    Kirill,
    It looks like your classpath is broken somehow.
    I just ran this tutorial on a freshly loaded Sandbox, and it works perfectly fine.
    I suggest you reload the Sandbox and then try again.

    Thank you!
    Sasha

    #18928

    Kirill Wood
    Member

    Hi Larry,
    I haven’t tried Peter’s way, but I tried your fix. Basically:
    1. Log in to the Sandbox as root.
    2. Edit the file /usr/bin/pig.
    3. Replace the last line with the following line:
    exec /usr/lib/pig/bin/pig -useHCatalog "$@"

    That fixed the original error on HCatLoader, which let me proceed with the script:
    ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

    But now, it throws a new error when I use the ‘avg’ function below:
    FOREACH c GENERATE avg(b.stock_volume);

    ERROR 1070: Could not resolve avg using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

    I get the same error even in Pig tutorial 2 when I try to use the ‘max’ function in this line:
    FOREACH grp_data GENERATE group as grp, max(runs.runs) as max_runs;

    log file says:
    Failed to parse: Pig script failed to parse:
    Failed to generate logical plan. Nested exception: java.lang.RuntimeException: Cannot instantiate: avg at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:184)

    Thanks!
