Sandbox – Pig Basic Tutorial example is not working


This topic contains 62 replies, has 42 voices, and was last updated by  Rajeev Trikha 6 days, 17 hours ago.

  • Creator
    Topic
  • #17798

    Sankarg
    Member

    Hi, I just tried the following script from the Pig Basic Tutorial, and it is not working:

    a = LOAD 'nyse_stocks' USING org.apache.hcatalog.pig.HCatLoader();
    b = FILTER a BY stock_symbol == 'IBM';
    c = group b all;
    d = FOREACH c GENERATE AVG(b.stock_volume);
    dump d;

    When I ran the syntax check, the following log was captured:

    2013-03-17 14:35:28,456 [main] INFO org.apache.pig.Main – Apache Pig version 0.10.1.21 (rexported) compiled Jan 10 2013, 04:00:42
    2013-03-17 14:35:28,459 [main] INFO org.apache.pig.Main – Logging error messages to: /home/sandbox/hue/pig_1363556128447.log
    2013-03-17 14:35:41,945 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: file:///
    2013-03-17 14:35:45,555 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    Details at logfile: /home/sandbox/hue/pig_1363556128447.log

    Please help me resolve this issue. Thank you!

    Regards,
    Sankar

Viewing 30 replies - 31 through 60 (of 62 total)


  • Author
    Replies
  • #28615

    tedr
    Moderator

    Hi Nilesh,

    Looking at this line from your post:

    2013-06-27 11:12:28,167 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1070: Could not resolve org.apache.hcatalog.ping.HCataloader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

    It looks like the problem is a misspelling of the class name: it should be ‘HCatLoader’, not ‘HCataloader’. Note the extra ‘a’ in what you entered and the need for an uppercase ‘L’. The package name is also misspelled in that line: ‘org.apache.hcatalog.ping’ should be ‘org.apache.hcatalog.pig’.

    Thanks,
    Ted.

    #28492

    Hello Ted/Sasha/Larry, I followed the workaround suggestions. Now receiving a different error. Please help.

    2013-06-27 11:12:27,054 [main] INFO org.apache.pig.Main – Apache Pig version 0.11.1.1.3.0.0-107 (rexported) compiled May 20 2013, 03:04:35
    2013-06-27 11:12:27,059 [main] INFO org.apache.pig.Main – Logging error messages to: /hadoop/mapred/taskTracker/hue/jobcache/job_201306271056_0001/attempt_201306271056_0001_m_000000_0/work/pig_1372356747052.log
    2013-06-27 11:12:27,334 [main] INFO org.apache.pig.impl.util.Utils – Default bootup file /usr/lib/hadoop/.pigbootup not found
    2013-06-27 11:12:27,447 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: hdfs://sandbox:8020
    2013-06-27 11:12:27,597 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to map-reduce job tracker at: sandbox:50300
    2013-06-27 11:12:28,167 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1070: Could not resolve org.apache.hcatalog.ping.HCataloader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    Details at logfile: /hadoop/mapred/taskTracker/hue/jobcache/job_201306271056_0001/attempt_201306271056_0001_m_000000_0/work/pig_1372356747052.log

    #26974

    tedr
    Moderator

    Hi Doop,

    Thanks for letting us know that you figured it out.

    Ted.

    #26941

    ETL Doop
    Member

    The issue is resolved. I logged in and replaced the last line in the /usr/bin/pig file with:

    exec /usr/lib/pig/bin/pig -useHCatalog "$@"

    I followed Kirill Wood’s steps and it worked. Thank you.

    #26894

    ETL Doop
    Member

    Thanks, Sasha, for the login password. I logged in as root.

    I am unable to access the pig folder. How do I resolve the HCatLoader issue? Please let me know.

    #26892

    Sasha J
    Moderator

    Dear Doop,
    Do you mean to say “what is the username and password for the Sandbox”?
    Or did you mean something else?
    The username is root.
    The password is hadoop.

    Hope this helps!
    Thank you!
    Sasha

    #26891

    ETL Doop
    Member

    2013-06-03 16:43:59,751 [main] INFO org.apache.pig.Main – Apache Pig version 0.10.1.21 (rexported) compiled Jan 10 2013, 04:00:42
    2013-06-03 16:43:59,752 [main] INFO org.apache.pig.Main – Logging error messages to: /home/sandbox/hue/pig_1370303039747.log
    2013-06-03 16:43:59,998 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: hdfs://sandbox:8020
    2013-06-03 16:44:00,412 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to map-reduce job tracker at: sandbox:50300
    2013-06-03 16:44:00,896 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    Details at logfile: /home/sandbox/hue/pig_1370303039747.log

    1. I downloaded the version today.
    2. What are the username and password for the virtual version?

    Please assist.

    #24682

    Larry Liu
    Moderator

    Hi, Craig

    Thanks for trying it and letting us know it worked.

    Please let us know if you run into any issues.

    Larry

    #24582

    Craig Rowley
    Member

    I was receiving the error on a newly downloaded HDP Sandbox (downloaded today) when trying to execute tutorial 1, check syntax, or explain
    (using Mac OS X 10.6.8, VirtualBox 4.2.4).

    I used Terminal and ssh’d into the VM (using the IP from the URL and root/hadoop as auth).

    I applied the fix to /usr/bin/pig (adding -useHCatalog to the last line).

    It worked, as far as I can tell.

    #23758

    Yi Zhang
    Moderator

    Hi Adam,

    There is no /etc/bin/pig. Do you mean /usr/lib/pig/bin/pig or /usr/bin/pig?

    Try Peter’s suggestion:

    sed -i '49s/.*/includeHCatalog=true;/' /usr/lib/pig/bin/pig

    This will load HCatalog automatically when pig is started.
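
A note on what that one-liner does: `49s/.*/.../` overwrites the whole of line 49, whatever it contains, so it is only safe if line 49 of your copy of the script really is the includeHCatalog assignment. The sketch below checks that before replacing anything. It runs against a scratch file whose contents (including the stand-in `includeHCatalog="";` on line 49) are assumptions made up for the demo, not the real /usr/lib/pig/bin/pig.

```shell
#!/bin/sh
# Scratch file standing in for /usr/lib/pig/bin/pig. Its contents are invented
# for this demo; what the real line 49 holds is an assumption.
script="$(mktemp)"
n=1
while [ "$n" -le 60 ]; do
    if [ "$n" -eq 49 ]; then
        echo 'includeHCatalog="";'
    else
        echo "# filler line $n"
    fi
    n=$((n + 1))
done > "$script"

# Look at line 49 before touching it.
before="$(sed -n '49p' "$script")"
echo "line 49 before: $before"

# Only overwrite the line if it really is the includeHCatalog assignment.
case "$before" in
    includeHCatalog=*)
        sed '49s/.*/includeHCatalog=true;/' "$script" > "$script.new" && mv "$script.new" "$script"
        ;;
    *)
        echo "line 49 is not the includeHCatalog assignment; leaving the file alone" >&2
        ;;
esac

echo "line 49 after:  $(sed -n '49p' "$script")"
```

To try it on the Sandbox, take a backup of /usr/lib/pig/bin/pig first and point `script` at the real file instead of the scratch one.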

    Thanks,
    Yi

    #23576

    Has anyone found a resolution to this issue? It looks like we get it first with HCat, then with Pig. It must be a config problem, but I can’t even get to the directory recommended above. My /etc/bin/ does not have a pig in it on the newest Sandbox, downloaded on 4/26/13. I’m happy to try recommendations if folks can reply. I’ll post the results here.

    #21866

    Larry Liu
    Moderator

    Hi, Mehran

    It is great that you got it working.

    We are all happy learning 😉

    Larry

    #21775

    I had IBM in double quotes instead of single quotes. After changing to single quotes, I do get a result. Thanks to Ron at Hortonworks, who shared his code with me.

    Thanks everyone!
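
For anyone hitting the same ERROR 1200, here is a small shell sketch for spotting double-quoted strings in a Pig script before submitting it. The scratch file and the variable names are made up for the illustration, and the `tr` step is a blunt fix that assumes no string in the script contains an apostrophe.

```shell
#!/bin/sh
# Scratch script reproducing the double-quoted literal from this thread.
pigscript="$(mktemp)"
cat > "$pigscript" <<'EOF'
a = LOAD 'nyse_stocks' USING org.apache.hcatalog.pig.HCatLoader();
b = FILTER a BY stock_symbol=="IBM";
EOF

# Report lines that contain a straight double quote.
grep -n '"' "$pigscript"

# Blunt fix: turn every double quote into a single quote.
# Assumption: no string literal in the script contains an apostrophe.
fixed="$(tr '"' "'" < "$pigscript")"
echo "$fixed"
```

The grep line is the useful part for a quick check; the rewritten text in `fixed` should still be reviewed by eye before it replaces the original script.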

    #21763

    Larry,
    Thanks a lot for your response. I actually did that, as also mentioned in another post, and restarted the host, but nothing happens when executing, and I get the error below when checking the syntax. Any idea what’s going on?
    I appreciate your help,
    Mehran

    2013-04-15 12:12:28,030 [main] INFO org.apache.pig.Main – Apache Pig version 0.10.1.21 (rexported) compiled Jan 10 2013, 04:00:42
    2013-04-15 12:12:28,031 [main] INFO org.apache.pig.Main – Logging error messages to: /home/sandbox/hue/pig_1366053148027.log
    2013-04-15 12:12:32,601 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: file:///
    2013-04-15 12:12:35,471 [main] WARN org.apache.hadoop.hive.conf.HiveConf – DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
    2013-04-15 12:12:35,897 [main] INFO hive.metastore – Trying to connect to metastore with URI thrift://sandbox:9083
    2013-04-15 12:12:36,485 [main] INFO hive.metastore – Waiting 1 seconds before next connection attempt.
    2013-04-15 12:12:37,485 [main] INFO hive.metastore – Connected to metastore.
    Unexpected character '"'
    2013-04-15 12:12:41,153 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1200: Unexpected character '"'
    Details at logfile: /home/sandbox/hue/pig_1366053148027.log

    Here is the code I am running:
    a = LOAD 'nyse_stocks' USING org.apache.hcatalog.pig.HCatLoader();
    b = FILTER a BY stock_symbol=="IBM";
    c = group b all;
    d = FOREACH c GENERATE AVG(b.stock_volume);
    dump d;

    #21740

    Larry Liu
    Moderator

    Hi, Mehran

    Here is the workaround:

    1. Log in to the Sandbox as root.
    2. Edit the file /usr/bin/pig.
    3. Replace the last line with the following line:
    exec /usr/lib/pig/bin/pig -useHCatalog "$@"
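
The steps above can also be scripted. The following is a minimal sketch only: it works on a scratch copy of the wrapper, and both the `wrapper` variable and the stand-in file contents are assumptions for the demo, not the real /usr/bin/pig from the Sandbox.

```shell
#!/bin/sh
# Scratch file standing in for /usr/bin/pig; the two lines written here are
# an assumption for the demo, not the wrapper shipped with the Sandbox.
wrapper="$(mktemp)"
printf '%s\n' '#!/bin/sh' 'exec /usr/lib/pig/bin/pig "$@"' > "$wrapper"

# Keep a backup, then rewrite only the last line ("$" addresses the last line in sed).
cp "$wrapper" "$wrapper.bak"
sed '$s|.*|exec /usr/lib/pig/bin/pig -useHCatalog "$@"|' "$wrapper.bak" > "$wrapper"

# Print the new last line so the change can be checked by eye.
tail -n 1 "$wrapper"
```

Writing the sed output to the original path from the backup copy avoids depending on `sed -i`, and leaves `$wrapper.bak` in place in case the change needs to be rolled back.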

    Larry

    #21572

    Wondering what was the resolution for this problem? I just encountered the same problem.

    Thanks

    #19759

    tedr
    Member

    Hi Kirill,

    Thanks for letting us know.

    Ted.

    #19560

    Kirill Wood
    Member

    I found out that function (UDF) names in Pig are case sensitive, so the built-in functions must be written in all caps:
    MAX(runs.runs), not max
    AVG(b.stock_volume), not avg
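
A quick way to catch this before running a script is to grep for lowercase calls. The sketch below is only an illustration: the scratch file is a copy of the tutorial script with the lowercase call, and the list of function names in the pattern is a small sample chosen for the demo, not a complete list of Pig built-ins.

```shell
#!/bin/sh
# Scratch copy of the tutorial script with the lowercase call from this thread.
pigscript="$(mktemp)"
cat > "$pigscript" <<'EOF'
a = LOAD 'nyse_stocks' USING org.apache.hcatalog.pig.HCatLoader();
b = FILTER a BY stock_symbol == 'IBM';
c = GROUP b ALL;
d = FOREACH c GENERATE avg(b.stock_volume);
DUMP d;
EOF

# List lines that call a few common built-ins in lowercase.
# The name list is illustrative, not exhaustive.
hits="$(grep -nE '(^|[^A-Za-z_.])(avg|max|min|sum|count)\(' "$pigscript")"
echo "$hits"
```

Any line it reports is a candidate for switching the function name to upper case, for example AVG(b.stock_volume).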

    #19414

    tedr
    Member

    Hi Stephen,

    Have you tried any of the fixes mentioned earlier in this thread?

    Thanks,
    Ted.

    #19300

    I’ve downloaded the Sandbox today, and followed the instructions on the left-hand side of the Web browser. I have exactly the same problem as the one documented in this thread.

    2013-03-28 09:30:42,524 [main] INFO org.apache.pig.Main – Apache Pig version 0.10.1.21 (rexported) compiled Jan 10 2013, 04:00:42
    2013-03-28 09:30:42,524 [main] INFO org.apache.pig.Main – Logging error messages to: /home/sandbox/hue/pig_1364488242522.log
    2013-03-28 09:30:42,785 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: file:///
    2013-03-28 09:30:43,040 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    Details at logfile: /home/sandbox/hue/pig_1364488242522.log

    #18975

    Sasha J
    Moderator

    Kirill,
    It looks like your classpath is broken somehow.
    I just ran this tutorial on a freshly loaded Sandbox, and it works perfectly fine.
    I suggest you reload the Sandbox and then try again.

    Thank you!
    Sasha

    #18928

    Kirill Wood
    Member

    Hi Larry,
    I haven’t tried Peter’s way, but I tried your fix:
    1. Log in to the Sandbox as root.
    2. Edit the file /usr/bin/pig.
    3. Replace the last line with the following line:
    exec /usr/lib/pig/bin/pig -useHCatalog "$@"

    That fixed the original error on HCatLoader which let me proceed with the script:
    ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

    But now, it throws a new error when I use the ‘avg’ function below:
    FOREACH c GENERATE avg(b.stock_volume);

    ERROR 1070: Could not resolve avg using imports: [, org.apache.pig.builtin., org.apache.pig.impl..builtin.]

    I get the same error even in Pig tutorial 2 when I try to use the ‘max’ function in this line:
    FOREACH grp_data GENERATE group as grp, max(runs.runs) as max_runs;

    log file says:
    Failed to parse: Pig script failed to parse:
    Failed to generate logical plan. Nested exception: java.lang.RuntimeException: Cannot instantiate: avg at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:184)

    Thanks!

    #18836

    Larry Liu
    Moderator

    Hi, Kirill,

    Are you using the new version of the Sandbox? Have you been able to try what Peter suggested?

    sed -i '49s/.*/includeHCatalog=true;/' /usr/lib/pig/bin/pig

    Larry

    #18800

    Kirill Wood
    Member

    I tried this basic pig example too and got the same error as Sankarg:

    a = LOAD 'nyse_stocks' USING org.apache.hcatalog.pig.HCatLoader();
    b = FILTER a BY stock_symbol == 'IBM';
    c = GROUP b all;
    d = FOREACH c GENERATE avg(b.stock_volume);
    dump d;

    Syntax check gave me this error:
    ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

    As Larry suggested, I edited the last line of the file /usr/bin/pig to exec /usr/lib/pig/bin/pig -useHCatalog "$@".

    The previous error is gone but now it throws another error:
    2013-03-23 20:46:30,499 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: file:///
    2013-03-23 20:46:32,452 [main] WARN org.apache.hadoop.hive.conf.HiveConf – DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
    2013-03-23 20:46:32,739 [main] INFO hive.metastore – Trying to connect to metastore with URI thrift://sandbox:9083
    2013-03-23 20:46:33,160 [main] INFO hive.metastore – Waiting 1 seconds before next connection attempt.
    2013-03-23 20:46:34,160 [main] INFO hive.metastore – Connected to metastore.
    2013-03-23 20:46:36,073 [main] WARN org.apache.hadoop.hive.conf.HiveConf – DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
    2013-03-23 20:46:39,105 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1070: Could not resolve avg using imports: [, org.apache.pig.builtin., org.apache.pig.impl..builtin.]
    Details at logfile: /home/sandbox/hue/pig_1364096789805.log

    log file:
    Failed to parse: Pig script failed to parse:
    Failed to generate logical plan. Nested exception: java.lang.RuntimeException: Cannot instantiate: avg at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:184)

    How can I fix this?

    Thank you

    #18774

    Hi Sankarg, you can fix it by running the following line on the VM:
    sed -i '49s/.*/includeHCatalog=true;/' /usr/lib/pig/bin/pig
    or by following the instructions that Larry suggested.

    #18625

    Seth Lyubich
    Keymaster

    Hi Rahul,

    Can you please clarify which error you are referring to?

    Thanks,
    Seth

    #18605

    Rahul Jolly
    Member

    Hi Ted – what would be the fix for this error we’ve been seeing below. I am getting the same.
    Would you please let us know the steps we can take to resolve this.

    Thank you.
    Rahul

    #18168

    Larry Liu
    Moderator

    Hi, Sankar,

    Here is a workaround you could try to see if it works:

    1. Log in to the Sandbox as root.
    2. Edit the file /usr/bin/pig.
    3. Replace the last line with the following line:
    exec /usr/lib/pig/bin/pig -useHCatalog "$@"

    I will continue looking into other solutions.

    Thanks
    Larry

    #18104

    Larry Liu
    Moderator

    Hi, Sankar,

    Can you please provide the log file named in your error message: /home/sandbox/hue/pig_1363556128447.log

    From my initial look, the error said “2013-03-17 14:35:45,555 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]”. Most likely, HCatLoader is not on the Pig classpath.

    Larry

    #17959

    Sankarg
    Member

    I removed all the lines in the Pig code example except the very first line and ran the syntax check, and I noticed the message “Detected Tx hang” in VirtualBox.
