Running simple Java job


This topic contains 1 reply, has 2 voices, and was last updated by  Xiandong Su 1 year, 8 months ago.

  • Creator
  • #28538

    How do I run a simple Hadoop Java job in the Sandbox?

    I’ve tried using the Java type Job Design, but I get an error saying ClassNotFound HelloWorld. The myjob.jar file has the HelloWorld class in it.

    My settings are
    Jar path /user/sample.myjob.jar
    Main class HelloWorld
    And my class is a trivial one that just sets the input file and output directory and uses the default mapper/reducer:

    import java.io.IOException;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class HelloWorld {

        public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
            // No mapper or reducer is set, so the defaults are used;
            // only the input file and output directory are configured.
            Job job = new Job();
            FileInputFormat.addInputPath(job, new Path("/user/sample/trivialdata.txt"));
            FileOutputFormat.setOutputPath(job, new Path("/user/sample/otest1"));
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(Text.class);
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }



  • Author
  • #44978

    Xiandong Su

    I do not know if you have resolved this issue. If not, try putting the class in a package. I copied your code without any changes except for adding a package declaration, ran it on the Hortonworks platform, and it ran through without any problems. When specifying the main class, you need to include the package in the name. For example: org.something.HelloWorld
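
To illustrate why the package has to be part of the main class name: a class that lives in a package can only be looked up by its fully qualified name. Below is a minimal sketch using plain Java reflection, with a JDK class standing in for the job class so it runs without Hadoop. `MainClassLookup` and `resolves` are hypothetical names made up for this example, and it is only an assumption that the Job Design resolves the main class by name in a similar way.

```java
public class MainClassLookup {

    // Hypothetical helper: returns true if the class name can be loaded
    // from the current classpath, false if lookup fails.
    public static boolean resolves(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Simple name only: there is no class called ArrayList in the default package.
        System.out.println("ArrayList           -> " + resolves("ArrayList"));
        // Fully qualified name: package plus class name.
        System.out.println("java.util.ArrayList -> " + resolves("java.util.ArrayList"));
    }
}
```

The first lookup fails and the second succeeds. In the same way, once HelloWorld is declared in a package such as org.something, the Main class setting would need to be org.something.HelloWorld rather than HelloWorld.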
