MapReduce Forum

How to count a particular word in a file by taking the word as an argument?

  • #32896
    Good Boy

    Hi Friends,
    I am new to Hadoop MapReduce as well as to Java. I am struggling to write a MapReduce program that counts the number of times a particular word occurs in a file. Both the file and the word should be user input, so I am trying to pass the word as an argument to main() along with the input and output paths. After getting the word in main(), I need to pass it to my map function so it can search for occurrences of the word, but I don't know how to do that. Can anyone please help? Here is my code.

    import java.io.IOException;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
    import org.apache.hadoop.util.GenericOptionsParser;

    public class MyWordCount {

        public static class WordCountMap extends Mapper<Text, Text, Text, LongWritable> {
            static String wordToSearch;
            private final static LongWritable ONE = new LongWritable(1L);
            private Text word = new Text();

            public void map(Text key, Text value, Context context)
                    throws IOException, InterruptedException {
                if (value.toString().compareTo(wordToSearch) == 0) {
                    context.write(word, ONE);
                }
            }
        }

        public static class SumReduce extends Reducer<Text, LongWritable, Text, LongWritable> {
            // reduce() must take an Iterable to override Reducer.reduce()
            public void reduce(Text key, Iterable<LongWritable> values,
                    Context context) throws IOException, InterruptedException {
                long sum = 0L;
                for (LongWritable value : values) {
                    sum += value.get();
                }
                context.write(key, new LongWritable(sum));
            }
        }

        public static void main(String[] rawArgs) throws Exception {
            GenericOptionsParser parser = new GenericOptionsParser(rawArgs);
            Configuration conf = parser.getConfiguration();
            String[] args = parser.getRemainingArgs();
            Job job = new Job(conf, "wordcount");
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            String myWord = args[2];
        }
    }

    I need to get the value of "myWord" from the main() function into the map() function.
    Thanks in advance.


  • #33386


    Read the word from the command-line arguments in your run() method, as in the following:

    public int run(String[] args) throws Exception {
        Configuration conf = getConf();
        args = new GenericOptionsParser(conf, args).getRemainingArgs();

        // Get the word to search for from the arguments
        String WordCount = args[0];
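
To get the word from the driver into map(), a static field set in main() is not a reliable channel, because map tasks generally run in separate JVMs from the driver. The usual route is the job Configuration: call conf.set(key, word) in the driver before the Job is created (the Job takes a copy of the Configuration), then read it back in the mapper's setup(Context) with context.getConfiguration().get(key). The sketch below is plain Java so that it runs without a cluster. java.util.Properties stands in for Hadoop's Configuration, and the class names, the key "wordcount.search.word", and the whitespace tokenization are all assumptions made for illustration; none of them come from the posts above.

```java
import java.util.Properties;

// Plain-Java sketch of the pattern; Properties is a stand-in for
// Hadoop's Configuration (an assumption made so this runs anywhere).
public class WordSearchSketch {

    // Hypothetical configuration key, chosen for this example.
    static final String WORD_KEY = "wordcount.search.word";

    // Stand-in for the Mapper subclass.
    static class SearchMapper {
        private String wordToSearch;

        // In Hadoop this would be setup(Context), reading
        // context.getConfiguration().get(WORD_KEY).
        void setup(Properties conf) {
            wordToSearch = conf.getProperty(WORD_KEY);
        }

        // Counts whole-token, case-sensitive matches in one input line.
        // A real map() would emit (word, 1) per match instead of returning a count.
        long map(String line) {
            long matches = 0;
            for (String token : line.trim().split("\\s+")) {
                if (token.equals(wordToSearch)) {
                    matches++;
                }
            }
            return matches;
        }
    }

    // Driver side: store the word in the configuration, then let the
    // "task" read it back and process each line.
    public static long countInLines(String word, String... lines) {
        Properties conf = new Properties();
        conf.setProperty(WORD_KEY, word);   // Hadoop: conf.set(WORD_KEY, word)

        SearchMapper mapper = new SearchMapper();
        mapper.setup(conf);

        long total = 0;                     // the reducer's role: sum the counts
        for (String line : lines) {
            total += mapper.map(line);
        }
        return total;
    }

    public static void main(String[] args) {
        long n = countInLines("hadoop",
                "hadoop is fun", "i like hadoop hadoop", "nothing here");
        System.out.println(n);              // prints 3
    }
}
```

In the job posted in the question, the equivalent step would presumably be a conf.set(...) call with args[2] placed right after getRemainingArgs() and before the Job is constructed, with the mapper reading the value once in setup() rather than on every map() call.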

