Hortonworks Sandbox Forum

Running simple Java job

  • #28538

    How do I run a simple Hadoop Java job in the Sandbox?

    I’ve tried using the Java type in Job Design, but I get an error saying ClassNotFound HelloWorld. The myjob.jar file has the HelloWorld class in it.

    My settings are:
    Jar path /user/sample.myjob.jar
    Main class HelloWorld
    And my class is a trivial one that just sets the input file and output directory and uses the default mapper/reducer:

    import java.io.IOException;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.io.Text;

    public class HelloWorld {

        public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
            // No mapper or reducer is set, so the job uses the defaults.
            Job job = new Job();
            // Input file and output directory in HDFS.
            FileInputFormat.addInputPath(job, new Path("/user/sample/trivialdata.txt"));
            FileOutputFormat.setOutputPath(job, new Path("/user/sample/otest1"));
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            // Block until the job finishes; exit 0 on success, 1 on failure.
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }


  • #44978
    Xiandong Su

    I do not know whether you have resolved this issue. If not, try putting the class in a package. I copied your code without any changes except for adding a package declaration, ran it on the Hortonworks platform, and it ran through without any problems. When specifying the main class, you need to include the package in the name. For example: org.something.HelloWorld
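
To illustrate the point about package names, here is a minimal sketch of how a class lookup by name behaves in plain Java. It assumes the job launcher resolves the "Main class" setting by name in roughly the way Class.forName does; that is an assumption about the launcher, not something confirmed in this thread. The class name MainClassLookup and the helper canLoad are made up for this example, and java.util.ArrayList stands in for a packaged class such as org.something.HelloWorld.

```java
// Hypothetical demo class; not part of the original job.
class MainClassLookup {

    // Returns true if a class with exactly this name can be loaded.
    // A class that lives in a package is only found under its
    // fully qualified name (package + "." + simple name).
    static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Simple name only: lookup fails for a packaged class.
        System.out.println("ArrayList -> " + canLoad("ArrayList"));
        // Fully qualified name: lookup succeeds.
        System.out.println("java.util.ArrayList -> " + canLoad("java.util.ArrayList"));
    }
}
```

By the same reasoning, if HelloWorld is declared with "package org.something;" then the Main class field would need org.something.HelloWorld rather than HelloWorld.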
