Running Wordcount on HDP on Centos 6.3


This topic contains 2 replies, has 2 voices, and was last updated by  Anand M 1 year, 1 month ago.

  • Creator
  • #47625



    I have created a 3-node cluster (1 master and 2 slaves) using the Hortonworks distribution. I went through the step-by-step installation using Ambari. This was quite clean, and the dashboard shows the live nodes and indicates “green” for each node and its running services.

    Moving on to MapReduce, I ran a smoke test to verify the installation, following the linked MapReduce test, and it works fine.

    However, when I try executing a basic WordCount program, I get this exception:

    [hw_hadoop@hwslave2 WC]$ hadoop jar WordCount.jar WordCount /workspace/wordcount/words.txt /workspace/wordcount/output12
    14/01/29 10:20:49 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    14/01/29 10:20:49 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
    14/01/29 10:20:49 INFO input.FileInputFormat: Total input paths to process : 1
    14/01/29 10:20:49 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
    14/01/29 10:20:49 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev cf4e7cbf8ed0f0622504d008101c2729dc0c9ff3]
    14/01/29 10:20:49 WARN snappy.LoadSnappy: Snappy native library is available
    14/01/29 10:20:49 INFO util.NativeCodeLoader: Loaded the native-hadoop library
    14/01/29 10:20:49 INFO snappy.LoadSnappy: Snappy native library loaded
    14/01/29 10:20:50 INFO mapred.JobClient: Running job: job_201401271605_0014
    14/01/29 10:20:51 INFO mapred.JobClient: map 0% reduce 0%
    14/01/29 10:21:05 INFO mapred.JobClient: Task Id : attempt_201401271605_0014_m_000000_0, Status : FAILED
    java.lang.RuntimeException: java.lang.ClassNotFoundException: WordCount$Map
    at org.apache.hadoop.conf.Configuration.getClass(
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(
    at org.apache.hadoop.mapred.MapTask.runNewMapper(
    at org.apache.hadoop.mapred.Child$
    at Method)
    at org.apache.hadoop.mapred.Child.main(
    Caused by: java.lang.ClassNotFoundException: WordCount$Map
    at Method)
    at java.lang.ClassLoader.loadClass(
    at sun.misc.Launcher$AppClassLoader.loadClass(
    at java.lang.ClassLoader.loadClass(
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(
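    A note on the error: the JVM refers to a nested class by its binary name, `Outer$Inner`, which is why the trace shows `WordCount$Map`. The task JVM cannot find that class because, per the earlier warning ("No job jar file set"), the job jar was never shipped to the task nodes. A minimal, Hadoop-free sketch of the binary-name convention (the class names here are illustrative, not part of the job):

    ```java
    // BinaryNameDemo.java -- shows that a nested class compiles to Outer$Inner.class,
    // the same form as the missing WordCount$Map in the stack trace above.
    public class BinaryNameDemo {
        public static class Map { }  // stand-in for the nested Mapper class

        public static void main(String[] args) throws Exception {
            // The compiler emits BinaryNameDemo$Map.class for the nested class.
            System.out.println(Map.class.getName());
            // Class.forName succeeds only when that .class file is on the classpath;
            // on a task node that never received the job jar it throws
            // ClassNotFoundException, exactly as in the trace above.
            Class<?> c = Class.forName("BinaryNameDemo$Map");
            System.out.println(c == Map.class);
        }
    }
    ```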

Viewing 2 replies - 1 through 2 (of 2 total)


  • Author
  • #49285

    Anand M

    Can you try adding your jar to the HADOOP_CLASSPATH variable?

    I faced this issue and found a workaround by doing this:
    export HADOOP_CLASSPATH=`hadoop classpath`:<your jar file>

    Please let me know if it works.



    Further, when I checked online, it said that in a distributed environment I need to call setJarByClass(WordCount.class) in my program. I added it, but I still get the same exception.

    Below is the code snippet; I have placed the class in the default package.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();

            public void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String line = value.toString();
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    word.set(tokenizer.nextToken());
                    context.write(word, one);
                }
            }
        }

        public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            Job job = new Job(conf, "wordcount");
            job.setJarByClass(WordCount.class); // ships the jar containing WordCount$Map

            job.setMapperClass(Map.class);
            job.setReducerClass(Reduce.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            job.waitForCompletion(true);
        }
    }
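    Conceptually, the Mapper/Reducer pair computes the same counts as this plain-Java sketch, which can be handy for sanity-checking expected output locally before submitting the job (the class name LocalWordCount is mine, not part of the job):

    ```java
    import java.util.HashMap;
    import java.util.StringTokenizer;

    // Plain-Java equivalent of the map/reduce logic above: the map phase emits
    // (token, 1) pairs and the reduce phase sums the counts per token.
    public class LocalWordCount {
        public static HashMap<String, Integer> count(String text) {
            HashMap<String, Integer> counts = new HashMap<>();
            StringTokenizer tokenizer = new StringTokenizer(text);
            while (tokenizer.hasMoreTokens()) {
                // "map" step: one occurrence of this token...
                String word = tokenizer.nextToken();
                // ..."reduce" step: fold it into the running sum for that token
                counts.merge(word, 1, Integer::sum);
            }
            return counts;
        }

        public static void main(String[] args) {
            System.out.println(count("the quick brown fox jumps over the lazy dog the"));
        }
    }
    ```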

