# of failed Map Tasks exceeded allowed limit. FailedCount: 1.

This topic contains 6 replies, has 6 voices, and was last updated by Antonio Paternina 7 months, 2 weeks ago.

  • Creator
    Topic
  • #27748

    Pavan Bolla
    Participant

    I have followed the instructions given in Tutorial 2: Data Processing with Pig (processing baseball stats with Pig).
    Pig script:
    batting = load 'Batting.csv' using PigStorage(',');
    runs = FOREACH batting GENERATE $0 as playerID, $1 as year, $8 as runs;
    grp_data = GROUP runs BY (year);
    max_runs = FOREACH grp_data GENERATE group as grp, MAX(runs.runs) as max_runs;
    join_max_run = JOIN max_runs by ($0, max_runs), runs by (year,runs);
    join_data = FOREACH join_max_run GENERATE $0 as year, $2 as playerID, $1 as runs;
    dump join_data;

    After executing it, I got the error below.

    # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201306181545_0002_m_000000

    Logs:
    2013-06-18 16:03:46,470 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2106: Error executing an algebraic function
    2013-06-18 16:03:46,470 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
    2013-06-18 16:03:46,513 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

    HadoopVersion PigVersion UserId StartedAt FinishedAt Features
    1.2.0.1.3.0.0-107 0.11.1.1.3.0.0-107 mapred 2013-06-18 16:02:06 2013-06-18 16:03:46 HASH_JOIN,GROUP_BY

    Failed!

    Failed Jobs:
    JobId Alias Feature Message Outputs
    job_201306181545_0002 batting,grp_data,max_runs,runs MULTI_QUERY,COMBINER Message: Job failed! Error - # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201306181545_0002_m_000000

    Input(s):
    Failed to read data from "hdfs://sandbox:8020/user/hue/Batting.csv"

    Output(s):

    Counters:
    Total records written : 0
    Total bytes written : 0
    Spillable Memory Manager spill count : 0
    Total bags proactively spilled: 0
    Total records proactively spilled: 0

    Job DAG:
    job_201306181545_0002 -> null,
    null

    2013-06-18 16:03:46,514 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
    2013-06-18 16:03:46,515 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias join_data
    Details at logfile: /hadoop/mapred/taskTracker/hue/jobcache/job_201306181545_0001/attempt_201306181545_0001_m_000000_0/work/pig_1371596519449.log

    Please suggest how I should proceed.


  • Author
    Replies
  • #47609

    Antonio Paternina
    Participant

    # of failed Map Tasks exceeded allowed limit. FailedCount: 2. LastFailedTask:

    The suggested fix does not work for me; even after increasing the allowed limit I get the same error. Please let me know if there is any other solution.

    #35335

    Jianyong Dai
    Participant

    Is the failed job the first one or the second one? The idea is to reduce the number of mappers. For the first job, you can increase "pig.maxCombinedSplitSize" so that each mapper takes more input files. For the second job, in addition to the previous trick, reducing the number of reducers in the first job will decrease the number of input part files, which also helps.
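
    For example, a minimal sketch (both are standard Pig settings, but the split size value here is illustrative, not a recommendation):

    -- Combine small input files so fewer map tasks are launched.
    SET pig.maxCombinedSplitSize 268435456; -- 256 MB per map task (illustrative)

    -- Fewer reducers in the first job means fewer part files feeding the second job.
    SET default_parallel 1;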

    #34481

    Kevin Knaus
    Member

    I too had the same error. And while your reply, Ted, fixes it, it does not really explain why the presence of a zero in the "runs" field causes the failure. The real issue for me was trying to dig into the log report about the failure, which said "unable to open iterator for alias join_data". Is there a white paper or some other resource that explains what failing to open an iterator might imply? I assume it is a general sort of error that can appear for a variety of issues, not just the unfiltered zero-value records. Thanks too for the fix you posted.

    #29644

    tedr
    Moderator

    Hi Gerald,

    Thanks for letting us know.

    Ted.

    #29639

    gehhrald
    Member

    Hi tedr,

    Thank you for the solution. Works for me too.

    Cheers,
    Gerald

    #27751

    tedr
    Moderator

    Hi Pavan,

    If you modify the script a bit, changing/adding the following, it will work:

    runs_raw = FOREACH batting GENERATE $0 as playerID, $1 as year, $8 as runs;
    runs = FILTER runs_raw BY runs > 0;
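
    For reference, the complete script with that change (the rest is unchanged from your original; the FILTER most likely works because it drops the CSV header row and any records with an empty or zero runs field, which is what MAX() was choking on):

    batting = load 'Batting.csv' using PigStorage(',');
    -- Project the columns we need; $8 is the runs column in Batting.csv.
    runs_raw = FOREACH batting GENERATE $0 as playerID, $1 as year, $8 as runs;
    -- Drop the header row and empty/zero runs values that break MAX().
    runs = FILTER runs_raw BY runs > 0;
    grp_data = GROUP runs BY (year);
    max_runs = FOREACH grp_data GENERATE group as grp, MAX(runs.runs) as max_runs;
    join_max_run = JOIN max_runs by ($0, max_runs), runs by (year, runs);
    join_data = FOREACH join_max_run GENERATE $0 as year, $2 as playerID, $1 as runs;
    dump join_data;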

    thanks,
    Ted.
