How to run mapreduce job in sandbox?

to create new topics or reply. | New User Registration

Tagged: 

This topic contains 2 replies, has 3 voices, and was last updated by  Jeet Chatterjee 1 year, 5 months ago.

  • Creator
    Topic
  • #43705

    Krish N
    Member

    Thanks

Viewing 2 replies - 1 through 2 (of 2 total)

You must be to reply to this topic. | Create Account

  • Author
    Replies
  • #46307

    i am trying to run a wordcount MR job in sanbox.but i am getting invalid jar error.This is my program

    public class WordCount {

    public static class Map extends MapReduceBase implements
    Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value,
    OutputCollector<Text, IntWritable> output, Reporter reporter)
    throws IOException {
    String line = value.toString();
    StringTokenizer tokenizer = new StringTokenizer(line);
    while (tokenizer.hasMoreTokens()) {
    word.set(tokenizer.nextToken());
    output.collect(word, one);
    }
    }
    }

    public static class Reduce extends MapReduceBase implements
    Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values,
    OutputCollector<Text, IntWritable> output, Reporter reporter)
    throws IOException {
    int sum = 0;
    while (values.hasNext()) {
    sum += values.next().get();
    }
    output.collect(key, new IntWritable(sum));
    }
    }

    public static void main(String[] args) throws IOException {
    JobConf conf = new JobConf(WordCount.class);
    conf.setJobName(“wordcount”);

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    conf.setMapperClass(Map.class);
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);

    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    JobClient.runJob(conf);

    }

    }

    i am trying to run in sandbox using hadoop jar C:\Users\Jeet\VirtualBox VMs\Hortonworks Sandbox 1.3\WordCount.jar

    but getting an invalid jar exception.C:\Users\Jeet\VirtualBox VMs\Hortonworks Sandbox 1.3 this is the path i have installed the sandbox.please help me..

    Collapse
    #43843

    Dave
    Moderator

    Hi Krish,

    You can follow the tutorials, or you can run a hive job which does a select from a table with a limit of 1000.

    This will start a MR job

    Thanks

    Dave

    Collapse
Viewing 2 replies - 1 through 2 (of 2 total)
Hortonworks Data Platform
The Hortonworks Data Platform is a 100% open source distribution of Apache Hadoop that is truly enterprise grade having been built, tested and hardened with enterprise rigor.
Get started with Sandbox
Hortonworks Sandbox is a self-contained virtual machine with Apache Hadoop pre-configured alongside a set of hands-on, step-by-step Hadoop tutorials.
Modern Data Architecture
Tackle the challenges of big data. Hadoop integrates with existing EDW, RDBMS and MPP systems to deliver lower cost, higher capacity infrastructure.