Hortonworks Sandbox Forum

How to run mapreduce job in sandbox?

  • #43705
    Krish N



  • Author
  • #43843

    Hi Krish,

    You can follow the tutorials, or you can run a Hive job that does a SELECT from a table with a LIMIT of 1000.

    This will start a MapReduce job.
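    For example, from the sandbox shell, something like the following should kick off a MapReduce job (sample_07 is one of the sample tables the sandbox tutorials load; substitute any table you actually have):

    ```shell
    # Run a bounded SELECT through the Hive CLI; Hive plans it as a MapReduce job
    hive -e "SELECT * FROM sample_07 LIMIT 1000;"
    ```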




    I am trying to run a WordCount MR job in the sandbox, but I am getting an invalid jar error. This is my program:

    import java.io.IOException;
    import java.util.Iterator;
    import java.util.StringTokenizer;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapred.*;

    public class WordCount {

        public static class Map extends MapReduceBase implements
                Mapper<LongWritable, Text, Text, IntWritable> {
            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();

            public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output, Reporter reporter)
                    throws IOException {
                String line = value.toString();
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    // Set the current token before emitting it; without this
                    // every record is emitted with an empty key
                    word.set(tokenizer.nextToken());
                    output.collect(word, one);
                }
            }
        }

        public static class Reduce extends MapReduceBase implements
                Reducer<Text, IntWritable, Text, IntWritable> {
            public void reduce(Text key, Iterator<IntWritable> values,
                    OutputCollector<Text, IntWritable> output, Reporter reporter)
                    throws IOException {
                int sum = 0;
                while (values.hasNext()) {
                    sum += values.next().get();
                }
                output.collect(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws IOException {
            JobConf conf = new JobConf(WordCount.class);
            conf.setJobName("wordcount");

            conf.setOutputKeyClass(Text.class);
            conf.setOutputValueClass(IntWritable.class);

            conf.setMapperClass(Map.class);
            conf.setCombinerClass(Reduce.class);
            conf.setReducerClass(Reduce.class);

            conf.setInputFormat(TextInputFormat.class);
            conf.setOutputFormat(TextOutputFormat.class);

            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));

            JobClient.runJob(conf);
        }
    }
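    Before the jar can be run, the class has to be compiled against the Hadoop libraries and packaged. One way to do this (a sketch, assuming a working `javac` and that `hadoop classpath` is available on the machine doing the build):

    ```shell
    # Compile against the Hadoop jars and package the classes into WordCount.jar
    mkdir -p classes
    javac -classpath "$(hadoop classpath)" -d classes WordCount.java
    jar cf WordCount.jar -C classes .
    ```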




    I am trying to run it in the sandbox using hadoop jar C:\Users\Jeet\VirtualBox VMs\Hortonworks Sandbox 1.3\WordCount.jar

    but I am getting an invalid jar exception. C:\Users\Jeet\VirtualBox VMs\Hortonworks Sandbox 1.3 is the path where I have installed the sandbox. Please help me.
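    One likely cause: that path points at the jar on the Windows host, but `hadoop jar` has to run inside the sandbox VM, where no such path exists; the unquoted spaces in the path also make hadoop treat `C:\Users\Jeet\VirtualBox` as the jar name. Copying the jar into the VM and running it there sidesteps both problems. A sketch, assuming the sandbox's default SSH port-forward (root on localhost port 2222) and hypothetical input/output paths:

    ```shell
    # From the Windows host: copy the jar into the sandbox VM
    scp -P 2222 WordCount.jar root@127.0.0.1:/root/

    # Inside the VM (ssh root@127.0.0.1 -p 2222):
    hadoop fs -put input.txt /user/root/wc-input
    # The class name is passed explicitly because the jar has no Main-Class manifest
    hadoop jar /root/WordCount.jar WordCount /user/root/wc-input /user/root/wc-output
    hadoop fs -cat /user/root/wc-output/part-00000
    ```

    Note the output directory must not already exist, or the job will fail at submission time.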

