Sandbox - Pig Basic Tutorial example is not working

This topic contains 40 replies, has 24 voices, and was last updated by  Don Estes 1 week, 2 days ago.

  • Creator
    Topic
  • #17798

    Sankarg
    Member

    Hi, I just tried the following Pig Basic Tutorial script, which is not working:

    a = LOAD 'nyse_stocks' USING org.apache.hcatalog.pig.HCatLoader();
    b = FILTER a BY stock_symbol == 'IBM';
    c = group b all;
    d = FOREACH c GENERATE AVG(b.stock_volume);
    dump d;

    When I ran the syntax check, the following log output was captured:

    2013-03-17 14:35:28,456 [main] INFO org.apache.pig.Main – Apache Pig version 0.10.1.21 (rexported) compiled Jan 10 2013, 04:00:42
    2013-03-17 14:35:28,459 [main] INFO org.apache.pig.Main – Logging error messages to: /home/sandbox/hue/pig_1363556128447.log
    2013-03-17 14:35:41,945 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: file:///
    2013-03-17 14:35:45,555 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    Details at logfile: /home/sandbox/hue/pig_1363556128447.log

    Please help me resolve this issue. Thank you!

    Regards,
    Sankar

Viewing 30 replies - 1 through 30 (of 40 total)

  • Author
    Replies
  • #51537

    Don Estes
    Participant

    Thanks Nicolas! That worked for me when all the other workarounds didn’t.

    #50613

    suaroman
    Participant

    Just checking to see if there are any updates.
    I have the latest 2.x sandbox and am experiencing the same problem.

    None of the recommended solutions has worked.

    This command (from Larry):
    sed -i '49s/.*/includeHCatalog=true;/' /usr/lib/pig/bin/pig

    didn’t resolve the problem,

    nor did manually editing the pig file and making the adjustments.

    I don’t think this problem is related to the VM. I’m using VMWare workstation 10.

    If you need anything, I’ll be happy to send you my files or any of the configurations I have.
    Thanks,

    #49734

    Maddy Techie
    Participant

    Hi, I tried the basic tutorial, but it’s not working. On Execute it completes successfully, but the output is blank.
    Below are my code and log file.

    a = LOAD 'nyse_stocks' USING org.apache.hcatalog.pig.HCatLoader();
    b = filter a by stock_symbol == 'IBM';
    c = group b all;
    d = foreach c generate AVG(b.stock_volume);
    dump d;
    Log file:

    2014-03-06 16:56:17,294 [main] INFO org.apache.pig.Main – Apache Pig version 0.12.0.2.0.6.0-76 (rexported) compiled Oct 17 2013, 20:44:07
    2014-03-06 16:56:17,322 [main] INFO org.apache.pig.Main – Logging error messages to: /hadoop/yarn/local/usercache/hue/appcache/application_1394153146773_0001/container_1394153146773_0001_01_000002/pig_1394153777285.log
    2014-03-06 16:56:35,564 [main] INFO org.apache.pig.impl.util.Utils – Default bootup file /home/yarn/.pigbootup not found
    2014-03-06 16:56:37,061 [main] INFO org.apache.hadoop.conf.Configuration.deprecation – mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
    2014-03-06 16:56:37,062 [main] INFO org.apache.hadoop.conf.Configuration.deprecation – fs.default.name is deprecated. Instead, use fs.defaultFS
    2014-03-06 16:56:37,063 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: hdfs://sandbox.hortonworks.com:8020
    2014-03-06 16:56:37,113 [main] INFO org.apache.hadoop.conf.Configuration.deprecation – mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
    2014-03-06 16:56:44,048 [main] INFO org.apache.hadoop.conf.Configuration.deprecation – dfs.df.interval is deprecated. Instead, use fs.df.interval
    2014-03-06 16:56:44,049 [main] INFO org.apache.hadoop.conf.Configuration.deprecation – mapred.task.tracker.http.address is deprecated. Instead, use mapreduce.tasktracker.http.address
    2014-03-06 16:56:44,049 [main] INFO org.apache.hadoop.conf.Configuration.deprecation – dfs.max.objects is deprecated. Instead, use dfs.namenode.max.objects
    2014-03-06 16:56:44,050 [main] INFO org.apache.hadoop.conf.Configuration.deprecation – mapred.userlog.retain.hours is deprecated. Instead, use mapreduce.job.userlog.retain.hours
    2014-03-06 16:56:44,051 [main] INFO org.apache.hadoop.conf.Configuration.deprecation – hadoop.native.lib is deprecated. Instead, use io.native.lib.available
    2014-03-06 16:56:44,053 [main] INFO org.apache.hadoop.conf.Configuration.deprecation – mapred.local.dir.minspacestart is deprecated. Instead, use mapreduce.tasktracker.local.dir.minspacestart
    2014-03-06 16:56:44,053 [main] INFO org.apache.hadoop.conf.Configuration.deprecation – mapred.shuffle.read.timeout is deprecated. Instead, use mapreduce.reduce.shuffle.read.timeout
    2014-03-06 16:56:44,055 [main] INFO org.apache.hadoop.conf.Configuration.deprecation – io.sort.spill.percent is deprecated. Instead, use mapreduce.map.sort.spill.percent
    2014-03-06 16:56:44,056 [main] INFO org.apache.hadoop.conf.Configuration.deprecation – mapred.reduce.parallel.copi

    #48729

    Nicolas MASSART
    Participant

    I had the same issue, and I found out what is happening!
    The problem occurs if the parameter “-useHCatalog” is not set.
    If you only type the parameter into the text field of the HDP Pig interface, it is not taken into account; you need to press the Enter key.

    #48396

    one2go
    Participant

    I encountered this problem (“ERROR 1070: Could not resolve org.apache.hcatalog.ping.HCataloader”) after downloading Hortonworks+Sandbox+2.0+VMware.ova on Feb 8, 2014. The fixes below did not solve the problem when using the browser-based Hue editor (they did solve it when running Pig from the command line on the sandbox).

    I was able to get it working by patching the following file.

    sed -i 's/^includeHCatalog=.*/includeHCatalog=true;/' /hadoop/yarn/local/filecache/10/pig.tar.gz/pig/bin/pig
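For anyone comparing this with the line-number fix quoted elsewhere in the thread (`49s/.*/.../`): that version only works if the `includeHCatalog` assignment happens to sit on line 49 of the particular pig script, while the pattern-based version above rewrites the assignment wherever it is. A minimal sketch on a throwaway stand-in file; the file contents are made up for illustration and are not the real pig launcher, and `sed -i` is used in its GNU form as found on the sandbox:

```shell
# Stand-in for the pig launcher script; contents are hypothetical.
tmp=$(mktemp)
printf '%s\n' '#!/bin/bash' 'debug=false' 'includeHCatalog="";' 'echo done' > "$tmp"

# Pattern-based edit: rewrites whichever line starts with includeHCatalog=
sed -i 's/^includeHCatalog=.*/includeHCatalog=true;/' "$tmp"

# Show the line number and new value to confirm the change took effect.
result=$(grep -n '^includeHCatalog=' "$tmp")
echo "$result"
rm -f "$tmp"
```

Running the same `grep -n '^includeHCatalog='` against the real file first is a cheap way to confirm which line, if any, the edit will touch.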

    #44734

    I’m also having trouble with the tutorial1 pig script. When I execute it, I get no results back. The job reports success, but the box remains empty. The syntax check passes, while the explain plan fails without presenting any logs. I’ve tried this with Chrome and with Internet Explorer.

    I am running a fresh Hortonworks+Sandbox+2.0+VMware instance with the latest VMware Player. Here is the code I’m running:

    a = LOAD 'nyse_stocks' USING org.apache.hcatalog.pig.HCatLoader();
    b = FILTER a BY stock_symbol == 'IBM';
    c = GROUP b ALL;
    d = FOREACH c GENERATE SUM(b.stock_volume);
    DUMP d;

    Any thoughts? Thanks!

    Kevin

    #43682

    Hi,

    I am trying to run Tutorial #2. I followed each step very carefully, but when I click Execute I get the following error:

    Job job_1384288052635_0017 was failed

    Here is the script I am trying to run:

    batting = load 'Batting.csv' using PigStorage(',');
    runs = FOREACH batting GENERATE $0 as playerID, $1 as year, $8 as runs;
    grp_data = GROUP runs by (year);
    max_runs = FOREACH grp_data GENERATE group as grp,MAX(runs.runs) as max_runs;
    join_max_run = JOIN max_runs by ($0, max_runs), runs by (year,runs);
    join_data = FOREACH join_max_run GENERATE $0 as year, $2 as playerID, $1 as runs;
    dump join_data;

    When I click on Explain I get this message:

    #-----------------------------------------------
    # New Logical Plan:
    #-----------------------------------------------
    Logical plan is empty.

    When I run syntax check, everything seems normal.

    Please help.

    Thanks

    #41085

    I think VirtualBox does cause errors on occasion when importing the 2.0 beta sandbox. I had the same error and applied the fixes as described by Larry and Peter, to no effect. After I re-imported the VM, the query started working without needing the fixes. Thanks Oracle, not.

    #28615

    tedr
    Moderator

    Hi Nilesh,

    In looking at this line from your post:

    2013-06-27 11:12:28,167 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1070: Could not resolve org.apache.hcatalog.ping.HCataloader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

    It looks like the problem is a misspelling of the class name: it should be ‘HCatLoader’, not ‘HCataloader’. Note the extra ‘a’ in what you entered and the need for an uppercase ‘L’.

    Thanks,
    Ted.

    #28492

    Hello Ted/Sasha/Larry, I followed the workaround suggestions. Now receiving a different error. Please help.

    2013-06-27 11:12:27,054 [main] INFO org.apache.pig.Main – Apache Pig version 0.11.1.1.3.0.0-107 (rexported) compiled May 20 2013, 03:04:35
    2013-06-27 11:12:27,059 [main] INFO org.apache.pig.Main – Logging error messages to: /hadoop/mapred/taskTracker/hue/jobcache/job_201306271056_0001/attempt_201306271056_0001_m_000000_0/work/pig_1372356747052.log
    2013-06-27 11:12:27,334 [main] INFO org.apache.pig.impl.util.Utils – Default bootup file /usr/lib/hadoop/.pigbootup not found
    2013-06-27 11:12:27,447 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: hdfs://sandbox:8020
    2013-06-27 11:12:27,597 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to map-reduce job tracker at: sandbox:50300
    2013-06-27 11:12:28,167 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1070: Could not resolve org.apache.hcatalog.ping.HCataloader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    Details at logfile: /hadoop/mapred/taskTracker/hue/jobcache/job_201306271056_0001/attempt_201306271056_0001_m_000000_0/work/pig_1372356747052.log

    #26974

    tedr
    Moderator

    Hi Doop,

    Thanks for letting us know that you figured it out.

    Ted.

    #26941

    ETL Doop
    Member

    The issue is resolved. I logged in and replaced the last line in the /usr/bin/pig file with:

    exec /usr/lib/pig/bin/pig -useHCatalog "$@"

    I followed Kirill Wood’s steps and it worked. Thank you.

    #26894

    ETL Doop
    Member

    Thanks, Sasha, for the login password. I logged in as root.

    I am unable to access the pig folder. How do I resolve the HCatLoader issue? Please let me know.

    #26892

    Sasha J
    Moderator

    Dear Doop,
    Do you mean to say “what is the username and password for the Sandbox”?
    Or did you mean something else?
    The username is root.
    The password is hadoop.

    Hope this helps!
    Thank you!
    Sasha

    #26891

    ETL Doop
    Member

    2013-06-03 16:43:59,751 [main] INFO org.apache.pig.Main – Apache Pig version 0.10.1.21 (rexported) compiled Jan 10 2013, 04:00:42
    2013-06-03 16:43:59,752 [main] INFO org.apache.pig.Main – Logging error messages to: /home/sandbox/hue/pig_1370303039747.log
    2013-06-03 16:43:59,998 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: hdfs://sandbox:8020
    2013-06-03 16:44:00,412 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to map-reduce job tracker at: sandbox:50300
    2013-06-03 16:44:00,896 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    Details at logfile: /home/sandbox/hue/pig_1370303039747.log

    1. I downloaded the version today.
    2. What is the username / password for the virtual version?

    Please assist.

    #24682

    Larry Liu
    Moderator

    Hi, Craig

    Thanks for trying it and letting us know it worked.

    Please let us know about any issues you run into.

    Larry

    #24582

    Craig Rowley
    Member

    I was receiving the error on a newly downloaded HDP Sandbox (downloaded today) when trying to execute tutorial1, check syntax, or explain.
    (using Mac OSX 10.6.8, Virtualbox 4.2.4)

    I used terminal and ssh’d into the VM (using IP from url and root/hadoop as auth)

    I applied the fix to /usr/bin/pig (to add -useHCatalog to last line)

    It worked, as far as I can tell.

    #23758

    Yi Zhang
    Moderator

    Hi Adam,

    There is no /etc/bin/pig, do you mean /usr/lib/pig/bin/pig or /usr/bin/pig?

    Try Peter’s suggestion:

    sed -i '49s/.*/includeHCatalog=true;/' /usr/lib/pig/bin/pig

    This will load HCatalog automatically when pig is started.

    Thanks,
    Yi

    #23576

    Has anyone found the resolution to this issue? It looks like we get it first with HCat, then with Pig. It must be a config problem, but I can’t even get to the directory recommended above. My /etc/bin/ does not have a pig in it on the newest sandbox, downloaded on 4/26/13. I’m happy to try recommendations if folks can reply. I’ll post the results here.

    #21866

    Larry Liu
    Moderator

    Hi, Mehran

    It is great that you got it working.

    Happy learning ;)

    Larry

    #21775

    I had IBM in double quotes instead of single quotes. After changing to single quotes, I do get a result. Thanks to Ron at Hortonworks, who shared his code with me.

    Thanks everyone!
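Since Pig rejected the double-quoted string literal here (ERROR 1200), and typographic quotes pasted from a web page or word processor are also not plain single quotes, a quick grep before running can catch both. A rough sketch; the script file and its contents are just an example, and the check is crude (it would also flag a double quote inside a comment):

```shell
# Example script with one bad line; the contents are illustrative only.
script=$(mktemp)
cat > "$script" <<'EOF'
a = LOAD 'nyse_stocks' USING org.apache.hcatalog.pig.HCatLoader();
b = FILTER a BY stock_symbol == "IBM";
EOF

# Report lines containing a straight double quote or any typographic quote.
bad=$(grep -n -e '"' -e '‘' -e '’' -e '“' -e '”' "$script")
echo "$bad"
rm -f "$script"
```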

    #21763

    Larry,
    Thanks a lot for your response. I actually did that as also mentioned in another post and also restarted the host but nothing happens when executing and I get the error below when checking the syntax. Any idea what’s going on?
    I appreciate your help,
    Mehran

    2013-04-15 12:12:28,030 [main] INFO org.apache.pig.Main – Apache Pig version 0.10.1.21 (rexported) compiled Jan 10 2013, 04:00:42
    2013-04-15 12:12:28,031 [main] INFO org.apache.pig.Main – Logging error messages to: /home/sandbox/hue/pig_1366053148027.log
    2013-04-15 12:12:32,601 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: file:///
    2013-04-15 12:12:35,471 [main] WARN org.apache.hadoop.hive.conf.HiveConf – DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
    2013-04-15 12:12:35,897 [main] INFO hive.metastore – Trying to connect to metastore with URI thrift://sandbox:9083
    2013-04-15 12:12:36,485 [main] INFO hive.metastore – Waiting 1 seconds before next connection attempt.
    2013-04-15 12:12:37,485 [main] INFO hive.metastore – Connected to metastore.
    Unexpected character '"'
    2013-04-15 12:12:41,153 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1200: Unexpected character '"'
    Details at logfile: /home/sandbox/hue/pig_1366053148027.log

    Here is the code I am running:
    a = LOAD 'nyse_stocks' USING org.apache.hcatalog.pig.HCatLoader();
    b = FILTER a BY stock_symbol=="IBM";
    c = group b all;
    d = FOREACH c GENERATE AVG(b.stock_volume);
    dump d;

    #21740

    Larry Liu
    Moderator

    Hi, Mehran

    Here is the workaround:

    1. Log in to the sandbox as root.
    2. Edit the file /usr/bin/pig.
    3. Replace the last line with the following line:
    exec /usr/lib/pig/bin/pig -useHCatalog "$@"

    Larry
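The manual edit described in this workaround can also be scripted. A minimal sketch, tried here on a temporary stand-in file rather than the real /usr/bin/pig; the original last line shown is an assumption about what the wrapper contains, and `sed -i` is used in its GNU form:

```shell
# Stand-in for /usr/bin/pig; the original last line is assumed, not verified.
wrapper=$(mktemp)
printf '%s\n' '#!/bin/sh' 'exec /usr/lib/pig/bin/pig "$@"' > "$wrapper"

# '$' addresses the last line; '|' is the delimiter so the path slashes need no escaping.
sed -i '$ s|.*|exec /usr/lib/pig/bin/pig -useHCatalog "$@"|' "$wrapper"

# Print the resulting last line to confirm the replacement.
last_line=$(tail -n 1 "$wrapper")
echo "$last_line"
rm -f "$wrapper"
```

Before pointing the sed at the real file, it is worth keeping a backup copy (for example `cp /usr/bin/pig /usr/bin/pig.bak`).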

    #21572

    Wondering what was the resolution for this problem? I just encountered the same problem.

    Thanks

    #19759

    tedr
    Member

    Hi Kirill,

    Thanks for letting us know.

    Ted.

    #19560

    Kirill Wood
    Member

    So I found out that UDF names are case sensitive, which means the correct syntax is all caps:
    MAX(runs.runs), not max
    AVG(b.stock_volume), not avg
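A crude way to spot lower-case built-in calls in a script before running it. This is only a sketch: the script contents are an example, and the list of names checked (avg, max, min, sum, count) is a small sample rather than every Pig built-in:

```shell
# Example script: one lower-case call (avg) and one correct call (MAX).
script=$(mktemp)
cat > "$script" <<'EOF'
c = GROUP b ALL;
d = FOREACH c GENERATE avg(b.stock_volume);
e = FOREACH c GENERATE MAX(b.stock_volume);
EOF

# Match a lower-case function name at a word start, followed by '('.
# grep is case sensitive by default, so MAX( is not reported.
hits=$(grep -nE '(^|[^A-Za-z0-9_])(avg|max|min|sum|count)\(' "$script")
echo "$hits"
rm -f "$script"
```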

    #19414

    tedr
    Member

    Hi Stephen,

    Have you tried any of the fixes mentioned earlier in this thread?

    Thanks,
    Ted.

    #19300

    I’ve downloaded the Sandbox today, and followed the instructions on the left-hand side of the Web browser. I have exactly the same problem as the one documented in this thread.

    2013-03-28 09:30:42,524 [main] INFO org.apache.pig.Main – Apache Pig version 0.10.1.21 (rexported) compiled Jan 10 2013, 04:00:42
    2013-03-28 09:30:42,524 [main] INFO org.apache.pig.Main – Logging error messages to: /home/sandbox/hue/pig_1364488242522.log
    2013-03-28 09:30:42,785 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: file:///
    2013-03-28 09:30:43,040 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
    Details at logfile: /home/sandbox/hue/pig_1364488242522.log

    #18975

    Sasha J
    Moderator

    Kirill,
    It looks like your classpath is broken somehow.
    I just ran this tutorial on a freshly loaded Sandbox, and it works perfectly fine.
    I suggest you reload the Sandbox and then try again.

    Thank you!
    Sasha

    #18928

    Kirill Wood
    Member

    Hi Larry,
    I haven’t tried Peter’s way, but I tried your fix. Basically:
    1. Log in to the sandbox as root.
    2. Edit the file /usr/bin/pig.
    3. Replace the last line with the following line:
    exec /usr/lib/pig/bin/pig -useHCatalog "$@"

    That fixed the original error on HCatLoader which let me proceed with the script:
    ERROR 1070: Could not resolve org.apache.hcatalog.pig.HCatLoader using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

    But now, it throws a new error when I use the ‘avg’ function below:
    FOREACH c GENERATE avg(b.stock_volume);

    ERROR 1070: Could not resolve avg using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

    I get the same error even in Pig tutorial 2 when I try to use the ‘max’ function in this line:
    FOREACH grp_data GENERATE group as grp, max(runs.runs) as max_runs;

    log file says:
    Failed to parse: Pig script failed to parse:
    Failed to generate logical plan. Nested exception: java.lang.RuntimeException: Cannot instantiate: avg at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:184)

    Thanks!
