The legacy Hortonworks Forum is now closed. The site will be taken offline on January 31, 2016.

MapReduce Forum

Running WordCount on HDP on CentOS 6.3

  • #47625


    I have created a 3-node cluster (1 master and 2 slaves) using the Hortonworks distribution, following the step-by-step installation with Ambari. The installation was clean, and the dashboard shows all nodes live, with each node and its services marked “green”.

    Moving on to MapReduce, I ran a smoke test to verify the installation, following the linked instructions for testing MapReduce, and it works fine.

    Now when I try to run a basic WordCount program, I get this exception:

    [hw_hadoop@hwslave2 WC]$ hadoop jar WordCount.jar WordCount /workspace/wordcount/words.txt /workspace/wordcount/output12
    14/01/29 10:20:49 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    14/01/29 10:20:49 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
    14/01/29 10:20:49 INFO input.FileInputFormat: Total input paths to process : 1
    14/01/29 10:20:49 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
    14/01/29 10:20:49 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev cf4e7cbf8ed0f0622504d008101c2729dc0c9ff3]
    14/01/29 10:20:49 WARN snappy.LoadSnappy: Snappy native library is available
    14/01/29 10:20:49 INFO util.NativeCodeLoader: Loaded the native-hadoop library
    14/01/29 10:20:49 INFO snappy.LoadSnappy: Snappy native library loaded
    14/01/29 10:20:50 INFO mapred.JobClient: Running job: job_201401271605_0014
    14/01/29 10:20:51 INFO mapred.JobClient: map 0% reduce 0%
    14/01/29 10:21:05 INFO mapred.JobClient: Task Id : attempt_201401271605_0014_m_000000_0, Status : FAILED
    java.lang.RuntimeException: java.lang.ClassNotFoundException: WordCount$Map
    at org.apache.hadoop.conf.Configuration.getClass(
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(
    at org.apache.hadoop.mapred.MapTask.runNewMapper(
    at org.apache.hadoop.mapred.Child$
    at Method)
    at org.apache.hadoop.mapred.Child.main(
    Caused by: java.lang.ClassNotFoundException: WordCount$Map
    at Method)
    at java.lang.ClassLoader.loadClass(
    at sun.misc.Launcher$AppClassLoader.loadClass(
    at java.lang.ClassLoader.loadClass(
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(
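
The exception above names WordCount$Map, the nested mapper class, which the compiler writes to its own WordCount$Map.class file. One thing worth ruling out is a jar that was built without that file. The following is a minimal sketch using only the JDK; the class name JarClassCheck and its helper methods are hypothetical names chosen for illustration, not part of Hadoop.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.JarFile;

public class JarClassCheck {

    /** Maps a binary class name such as "WordCount$Map" to its jar entry name. */
    static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath contains an entry for the given class. */
    static boolean jarContainsClass(String jarPath, String className) {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName(className)) != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java JarClassCheck <jar file> <class name>");
            System.exit(2);
        }
        boolean found = jarContainsClass(args[0], args[1]);
        System.out.println(entryName(args[1]) + (found ? " is present in " : " is MISSING from ") + args[0]);
    }
}
```

It could be run as java JarClassCheck WordCount.jar 'WordCount$Map' (the single quotes keep the shell from expanding $Map). If the entry is missing, the jar needs to be rebuilt so that it includes every .class file the compiler produced, not only WordCount.class.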

  • Author
  • #47626

    When I looked into this further online, I read that in a distributed environment I need to add setJarByClass(WordCount.class) to my program. I did that, but I still get the same exception.

    Below is the code. The class is in the default package.

    import java.io.IOException;
    import java.util.*;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.conf.*;
    import org.apache.hadoop.io.*;
    import org.apache.hadoop.mapreduce.*;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

    public class WordCount {

        // Mapper: emits (word, 1) for every whitespace-separated token in a line.
        public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();

            @Override
            public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
                String line = value.toString();
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    word.set(tokenizer.nextToken());
                    context.write(word, one);
                }
            }
        }

        // Reducer: sums the counts emitted for each word.
        public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {

            @Override
            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();

            Job job = new Job(conf, "wordcount");
            // Tells Hadoop which jar to ship to the task nodes.
            job.setJarByClass(WordCount.class);

            job.setMapperClass(Map.class);
            job.setReducerClass(Reduce.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            job.setInputFormatClass(TextInputFormat.class);
            job.setOutputFormatClass(TextOutputFormat.class);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            job.waitForCompletion(true);
        }
    }
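
As far as I understand it, setJarByClass picks the job jar by finding the archive that the given class was loaded from. If the client JVM loaded WordCount from a plain directory on the classpath rather than from WordCount.jar, no jar would be set, which would be consistent with the "No job jar file set" warning in the log. That reading is an assumption, not something confirmed in this thread. The sketch below uses only the JDK to show where a class was resolved from; ClassOrigin and its methods are hypothetical names for illustration.

```java
import java.net.URL;

public class ClassOrigin {

    /** Converts a class to its resource path, e.g. WordCount$Map -> "WordCount$Map.class". */
    static String resourceName(Class<?> cls) {
        return cls.getName().replace('.', '/') + ".class";
    }

    /** Returns the URL the class file resolves to, or null if it cannot be found. */
    static URL classResource(Class<?> cls) {
        String name = resourceName(cls);
        ClassLoader loader = cls.getClassLoader();
        // Classes from the bootstrap loader report a null class loader.
        return loader == null ? ClassLoader.getSystemResource(name) : loader.getResource(name);
    }

    /** Returns true only if the class file resolves to an entry inside a jar. */
    static boolean loadedFromJar(Class<?> cls) {
        URL url = classResource(cls);
        return url != null && "jar".equals(url.getProtocol());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // With no argument, inspect this class itself.
        Class<?> cls = args.length > 0 ? Class.forName(args[0]) : ClassOrigin.class;
        System.out.println(cls.getName() + " resolved from: " + classResource(cls));
        System.out.println("inside a jar: " + loadedFromJar(cls));
    }
}
```

Running it with the same classpath as the job client and the argument WordCount shows whether the class comes from the jar or from loose .class files, for example ones left in the current directory after compiling.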


    Anand M

    Can you try adding your jar to the HADOOP_CLASSPATH variable?

    I faced this issue and found a workaround by doing this:
    export HADOOP_CLASSPATH=`hadoop classpath`:<your jar file>

    Please let me know if it works.

    Da Zhou

    Hi, I ran into the same problem.

    Could you tell me how you solved it?

The forum ‘MapReduce’ is closed to new topics and replies.
