Sandbox PIG Tutorials

This topic contains 7 replies, has 8 voices, and was last updated by  teknetik 8 months, 2 weeks ago.

  • Creator
    Topic
  • #49362

    Vijay Phagura
    Participant

    First of all, thanks for providing the sandbox and the tutorials in it. This has been a very nice learning experience.
    I was trying tutorial2, and when I execute this in PIG:
    batting = LOAD 'Batting.csv' USING PigStorage(',');
    runs = FOREACH batting GENERATE $0 as playerID, $1 as year, $8 as runs;
    grp_data = GROUP runs BY year;
    max_runs = FOREACH grp_data GENERATE group as grp,MAX(runs.runs) as max_runs;
    join_max_run = JOIN max_runs by ($0, max_runs), runs by (year,runs);
    join_data = FOREACH join_max_run GENERATE $0 as year, $2 as playerID, $1 as runs;
    dump join_data;

    I get this error:
    {"error":"org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1393376950997_0012' doesn't exist in RM.\n\tat org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:247)\n\tat org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:120)\n\tat org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:396)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)\n"} (error 500)

    I did not really get what it is complaining about. Any ideas how I can get rid of this error and see the results?

    Also, I tried Tutorial1; it executes fine, but I do not see the results shown at the end of the tutorial. It just sits there with a green bar.

    Thx

Viewing 7 replies - 1 through 7 (of 7 total)


  • Author
    Replies
  • #51295

    teknetik
    Participant

    I also came across this issue and here is how I got this working:

    First, I used the script suggested by Dave below. However, I did not copy the file to /tmp, so I had to use the full path of the file as it resides on HDFS:

    batting = LOAD '/user/hue/lahman591-csv/lahman591-csv/Batting.csv' USING PigStorage(',');

    #50756

    Dave
    Moderator

    Hi All,

    Firstly, can you copy Batting.csv to /tmp/Batting.csv?

    Then modify your PIG script to look like the following:

    batting = LOAD '/tmp/Batting.csv' USING PigStorage(',');
    runs_raw = FOREACH batting GENERATE $0 as playerID, $1 as year, $8 as runs;
    runs = FILTER runs_raw BY runs > 0;
    grp_data = GROUP runs BY year;
    max_runs = FOREACH grp_data GENERATE group as grp,MAX(runs.runs) as max_runs;
    join_max_run = JOIN max_runs by ($0, max_runs), runs by (year,runs);
    join_data = FOREACH join_max_run GENERATE $0 as year, $2 as playerID, $1 as runs;
    dump join_data;

    Note the runs_raw and runs relations. You are receiving an algebraic function error; this is caused by trying to apply a mathematical function to the header row of the data rather than to the data itself. The FILTER step drops that header row before MAX is applied.
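A variation on the same idea, as a rough sketch rather than a tested fix: casting the numeric columns explicitly makes the header handling more visible. It assumes the same file path and column positions as the script above, and that the first row of Batting.csv holds column names. In Pig, a text value that cannot be cast to int becomes null, and the FILTER then discards it.

```pig
-- Sketch only: path and column positions are taken from the script above.
batting = LOAD '/tmp/Batting.csv' USING PigStorage(',');

-- Explicit casts: the header row's text values become null instead of
-- reaching MAX as non-numeric data.
runs_raw = FOREACH batting GENERATE $0 AS playerID, (int)$1 AS year, (int)$8 AS runs;

-- A comparison against null is not true, so the header row (and any row
-- with a missing runs value) is dropped here.
runs = FILTER runs_raw BY runs > 0;

grp_data = GROUP runs BY year;
max_runs = FOREACH grp_data GENERATE group AS grp, MAX(runs.runs) AS max_runs;
```

The JOIN, final FOREACH and DUMP statements from the script above would follow unchanged.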

    Let me know if this resolves your issue.

    Thanks

    Dave

    #50755

    Malek Safa
    Participant

    Hi,

    I got the same error in another tutorial, 'Using Pig Command', with the script 'Pig-Dividend':

    {"error":"org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1396111271816_0009' doesn't exist in RM.\n\tat org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:247)\n\tat org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:120)\n\tat org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:396)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)\n"} (error 500)

    Any idea?

    #50730

    Michael Toland
    Participant

    Same issue here with me and several colleagues. We are curious as to how to fix this, as this is severely limiting our ability to fully learn and understand Hortonworks.

    Any guidance would be greatly appreciated.

    #50721

    Clement Fleury
    Participant

    +1

    {"error":"org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1395912244911_0044' doesn't exist in RM.\n\tat org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:247)\n\tat org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:120)\n\tat org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:241)\n\tat org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)\n\tat org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2053)\n\tat org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)\n\tat java.security.AccessController.doPrivileged(Native Method)\n\tat javax.security.auth.Subject.doAs(Subject.java:396)\n\tat org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)\n\tat org.apache.hadoop.ipc.Server$Handler.run(Server.java:2047)\n"} (error 500)
    
    #50696

    Alejandro Francia
    Participant

    Same issue here…

    #49480

    chris clarke
    Participant

    I’m also experiencing the same issue.
