Running WordCount on HDP on CentOS 6.3

This topic contains 2 replies, has 2 voices, and was last updated by Anand M 6 months, 4 weeks ago.

  • Creator
    Topic
  • #47625

    Edward
    Participant

    Hi,

    I have created a 3-node cluster (1 master and 2 slaves) using the Hortonworks distribution, following the step-by-step installation with Ambari. This went quite cleanly, and the dashboard shows the live nodes and indicates "green" for each node and its running services.

    Moving on to MapReduce, I ran a smoke test to verify that the installation is fine, following the link below, and it works fine.

    http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.2.1/bk_installing_manually_book/content/rpm-chap4-4.html

    Now, when I try executing a basic WordCount program, I get this exception:


    [hw_hadoop@hwslave2 WC]$ hadoop jar WordCount.jar WordCount /workspace/wordcount/words.txt /workspace/wordcount/output12
    14/01/29 10:20:49 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    14/01/29 10:20:49 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
    14/01/29 10:20:49 INFO input.FileInputFormat: Total input paths to process : 1
    14/01/29 10:20:49 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
    14/01/29 10:20:49 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev cf4e7cbf8ed0f0622504d008101c2729dc0c9ff3]
    14/01/29 10:20:49 WARN snappy.LoadSnappy: Snappy native library is available
    14/01/29 10:20:49 INFO util.NativeCodeLoader: Loaded the native-hadoop library
    14/01/29 10:20:49 INFO snappy.LoadSnappy: Snappy native library loaded
    14/01/29 10:20:50 INFO mapred.JobClient: Running job: job_201401271605_0014
    14/01/29 10:20:51 INFO mapred.JobClient: map 0% reduce 0%
    14/01/29 10:21:05 INFO mapred.JobClient: Task Id : attempt_201401271605_0014_m_000000_0, Status : FAILED
    java.lang.RuntimeException: java.lang.ClassNotFoundException: WordCount$Map
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:857)
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:199)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:717)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:363)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
    Caused by: java.lang.ClassNotFoundException: WordCount$Map
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:24

Viewing 2 replies - 1 through 2 (of 2 total)


  • Author
    Replies
  • #49285

    Anand M
    Participant

    Can you try adding your jar to the HADOOP_CLASSPATH variable?

    I faced this issue and found a workaround by doing this:
    export HADOOP_CLASSPATH=$(hadoop classpath):<your jar file>

    Please let me know if it works.
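
    For instance, on a setup like the one above, the workaround might look like the sketch below. The jar path is hypothetical (taken from the `WC` working directory shown in the original command); adjust it and the HDFS paths to your own layout.

    ```shell
    # `hadoop classpath` prints the framework classpath; the fallback value is a
    # hypothetical stand-in for when the command is unavailable (e.g. off-cluster).
    FRAMEWORK_CP="$(hadoop classpath 2>/dev/null || echo '/usr/lib/hadoop/*:/usr/lib/hadoop/lib/*')"

    # Append the user jar so the client JVM can resolve WordCount and its inner classes.
    export HADOOP_CLASSPATH="${FRAMEWORK_CP}:/home/hw_hadoop/WC/WordCount.jar"
    echo "$HADOOP_CLASSPATH"

    # Then rerun the job (on the cluster):
    # hadoop jar /home/hw_hadoop/WC/WordCount.jar WordCount \
    #     /workspace/wordcount/words.txt /workspace/wordcount/output13
    ```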

    #47626

    Edward
    Participant

    Further, when I checked online, it said that in a distributed environment I need to add setJarByClass(WordCount.class) to my program. I still get the same exception.

    Below is the code snippet; I have placed the class in the default package.


    import java.io.IOException;
    import java.util.*;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.conf.*;
    import org.apache.hadoop.io.*;
    import org.apache.hadoop.mapreduce.*;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

    public class WordCount {

        public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();

            public void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String line = value.toString();
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    word.set(tokenizer.nextToken());
                    context.write(word, one);
                }
            }
        }

        public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            Job job = new Job(conf, "wordcount");
            job.setJarByClass(WordCount.class);

            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);

            job.setMapperClass(Map.class);
            job.setReducerClass(Reduce.class);

            job.setInputFormatClass(TextInputFormat.class);
            job.setOutputFormatClass(TextOutputFormat.class);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            job.waitForCompletion(true);
        }
    }
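
    For what it's worth, a ClassNotFoundException for WordCount$Map on the task side often means the nested classes never made it into the jar that was shipped with the job (which also matches the "No job jar file set" warning in the log above). A build sketch follows; file and directory names are assumptions, so adjust them to your layout.

    ```shell
    # Compile against the Hadoop client libs and package everything, including
    # the nested classes WordCount$Map.class and WordCount$Reduce.class, into
    # the job jar. `hadoop classpath` must be run on a node with the client libs.
    mkdir -p build
    javac -classpath "$(hadoop classpath)" -d build WordCount.java
    jar cf WordCount.jar -C build .

    # Sanity check: WordCount$Map.class and WordCount$Reduce.class must be listed.
    jar tf WordCount.jar
    ```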
