mapreduce job not running on hadoop

This topic contains 5 replies, has 2 voices, and was last updated by  Sasha J 1 year, 10 months ago.

  • Creator
    Topic
  • #10395

    I ran a simple wordcount MapReduce job to view the workflow in Oozie, but it fails with the following error message:

    2012-09-27 20:43:51,155 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1348722981947_0034_m_000000_0: Error: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.LongWritable, received org.apache.hadoop.io.Text
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:998)
    at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:550)
    at org.wordcount.WordCount$Map.map(WordCount.java:33)
    at org.wordcount.WordCount$Map.map(WordCount.java:1)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:399)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)

    Please help.

    Below is the logic I have in my class:

    package org.wordcount;

    import java.io.IOException;
    import java.util.Iterator;
    import java.util.StringTokenizer;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.MapReduceBase;
    import org.apache.hadoop.mapred.Mapper;
    import org.apache.hadoop.mapred.OutputCollector;
    import org.apache.hadoop.mapred.Reducer;
    import org.apache.hadoop.mapred.Reporter;
    import org.apache.hadoop.mapred.TextInputFormat;
    import org.apache.hadoop.mapred.TextOutputFormat;

    public class WordCount {

        // Mapper: input is (byte offset, line of text); output is (word, 1).
        public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();

            public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
                String line = value.toString();
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    word.set(tokenizer.nextToken());
                    output.collect(word, one);
                }
            }
        }

        // Reducer (also used as the combiner): sums the counts for each word.
        public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
            public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
                int sum = 0;
                while (values.hasNext()) {
                    sum += values.next().get();
                }
                output.collect(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            JobConf conf = new JobConf(WordCount.class);
            conf.setJobName("wordcount");

            // Types of the final (reducer) output.
            conf.setOutputKeyClass(Text.class);
            conf.setOutputValueClass(IntWritable.class);

            // Types of the intermediate (mapper) output.
            conf.setMapOutputKeyClass(Text.class);
            conf.setMapOutputValueClass(IntWritable.class);

            conf.setMapperClass(Map.class);
            conf.setCombinerClass(Reduce.class);
            conf.setReducerClass(Reduce.class);

            conf.setInputFormat(TextInputFormat.class);
            conf.setOutputFormat(TextOutputFormat.class);

            // args[0] is the input path, args[1] the output path.
            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));

            JobClient.runJob(conf);
        }
    }

Viewing 5 replies - 1 through 5 (of 5 total)

The topic ‘mapreduce job not running on hadoop’ is closed to new replies.

  • Author
    Replies
  • #10442

    Sasha J
    Moderator

    I am somewhat curious about what you mean by “standalone java program.”
    You shouldn’t need any special configuration on the cluster.
    The only thing I can think of is the Hadoop version: does the version running on the cluster support MapReduce v2? Are you submitting your job to a cluster that was installed with HMC? Currently, Hortonworks does not support MapReduce v2, as it is still in alpha.

    #10433

    Thanks for the quick response. The same program runs as a standalone Java program, but it does not run when I submit it as a MapReduce job on the cluster. Do we need any extra settings on the cluster to run it?

    #10400

    Sasha J
    Moderator

    Then it looks like you need to change:

    public void map(LongWritable key, Text value, OutputCollector output, Reporter reporter) throws IOException {

    to

    public void map(Text key, IntWritable value, OutputCollector output, Reporter reporter) throws IOException {

    #10399

    Hi, thanks for the quick response. I need to pass a Text variable as per my requirement. This program executes when I run it as a normal Java program, but it does not work on MapReduce.

    Please help.

    #10397

    Sasha J
    Moderator

    Kishore,

    The stack trace shows that somewhere in your code you are passing a Text variable where the system expects a LongWritable. It looks like the method map in Map is getting a Text value for key instead of a LongWritable.

    Ted.
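
    The message in the stack trace is raised inside the output collector (MapTask$MapOutputBuffer.collect), so it concerns the key the mapper emits, compared against the map output key class the running job was configured with. A minimal sketch of that kind of check follows, in plain Java with no Hadoop dependency. The classes LongWritable and Text here are hypothetical stand-ins, the helper names are invented for illustration, and the fallback to LongWritable when nothing is configured is a simplification of the real lookup. If the driver's setMapOutputKeyClass call is not applied to the job that actually runs (for example, if the job is launched from a workflow definition rather than through main, which is an assumption the thread does not confirm), a check like this would reject a Text key.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class KeyTypeCheckSketch {

    // Hypothetical stand-ins for org.apache.hadoop.io.LongWritable and Text.
    static final class LongWritable { }
    static final class Text { }

    // Property name modelled on the old-API map output key setting (assumption).
    static final String MAP_OUTPUT_KEY_CLASS = "mapred.mapoutput.key.class";

    // Simplified lookup: use the configured class, else fall back to LongWritable.
    static Class<?> expectedKeyClass(Map<String, Class<?>> conf) {
        return conf.getOrDefault(MAP_OUTPUT_KEY_CLASS, LongWritable.class);
    }

    // Rejects any emitted key whose class differs from the configured one.
    static void collect(Object key, Class<?> expected) throws IOException {
        if (key.getClass() != expected) {
            throw new IOException("Type mismatch in key from map: expected "
                    + expected.getName() + ", received " + key.getClass().getName());
        }
    }

    // Returns the error message, or null when the key is accepted.
    static String tryCollect(Map<String, Class<?>> conf, Object key) {
        try {
            collect(key, expectedKeyClass(conf));
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        Map<String, Class<?>> conf = new HashMap<>();
        // Key class never configured: the Text key is rejected.
        System.out.println("unset:      " + tryCollect(conf, new Text()));
        // Key class configured as Text: the same key is accepted (prints null).
        conf.put(MAP_OUTPUT_KEY_CLASS, Text.class);
        System.out.println("configured: " + tryCollect(conf, new Text()));
    }
}
```

    Under these assumptions, the thing to verify is which map output key class the submitted job actually carries, not the signature of the map method.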
