Unable to submit MapReduce job to YARN via Java client


This topic contains 1 reply, has 2 voices, and was last updated by Koelli Mungee 1 year, 3 months ago.

  • Creator
  • #51837


    Hi All,
    I’m trying to submit a MapReduce job from a Java client running on Windows 7 to the Hortonworks Sandbox.

    Driver code:
    public static void main(String[] args) throws Exception {
        // Run the job as the remote user rather than the local Windows user.
        UserGroupInformation ugi = UserGroupInformation.createRemoteUser("jackie_leslie");
        ugi.doAs(new PrivilegedExceptionAction<Object>() {
            String[] jobArgs;

            public Object run() throws Exception {
                JobWrapper mr = new JobWrapper();
                int exitCode = ToolRunner.run(mr, jobArgs);
                return mr;
            }

            // Passes the command-line arguments into the anonymous class.
            private PrivilegedExceptionAction<Object> init(String[] myArgs) {
                this.jobArgs = myArgs;
                return this;
            }
        }.init(args));
    }

    Mapper class

    public int run(String[] args) throws Exception {
        Configuration config = getConf();
        config.set("fs.defaultFS", "hdfs://sandbox.hortonworks.com:8020");
        config.set("mapreduce.framework.name", "yarn");
        config.set("yarn.resourcemanager.address", "localhost:8050"); // 8025 or 8032?
        config.set("hadoop.job.ugi", "jackie_leslie");

        Job job = new Job(config);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        boolean success = job.waitForCompletion(true);
        return success ? 0 : 1;
    }

    Client stack trace:
    Exception in thread "main" org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hadoop-yarn/staging/jackie_leslie/.staging/job_1397842310782_0004/job.split could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2503)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582) etc.

    Also from the logs:
    2014-04-18 13:59:25,603 WARN security.UserGroupInformation (UserGroupInformation.java:getGroupNames(1355)) - No groups available for user jackie_leslie

    Why is this job not being submitted to my data node?



  • Author
  • #52609

    Koelli Mungee

    Hi Jackie

    Can you verify that the datanodes are up, and check the output of "hadoop dfsadmin -report" to ensure that HDFS is healthy? Are you able to put files into HDFS from the command line?
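
    The same kind of check can also be made from the Windows client side, by testing whether the HDFS ports on the sandbox are reachable at all. The sketch below is an illustration and not part of the original reply. It uses only java.net.Socket; the class name HdfsPortCheck and the method isReachable are hypothetical, port 8020 is the NameNode port from the question's fs.defaultFS, and port 50010 is assumed to be the DataNode data-transfer port (the Hadoop 2 default, which the sandbox may or may not use). If the NameNode port answers but the DataNode port does not, that would be consistent with the "could only be replicated to 0 nodes" error, although it is not the only possible cause.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for illustration; not from the original thread.
final class HdfsPortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host name could not be resolved.
            return false;
        }
    }

    public static void main(String[] args) {
        // Host name taken from the question; override it with the first argument.
        String host = args.length > 0 ? args[0] : "sandbox.hortonworks.com";
        // 8020: NameNode RPC (from fs.defaultFS). 50010: assumed DataNode port.
        int[] ports = {8020, 50010};
        for (int port : ports) {
            boolean ok = isReachable(host, port, 3000);
            System.out.println(host + ":" + port + " reachable = " + ok);
        }
    }
}
```

    This only shows whether a TCP connection can be opened from the client machine; it says nothing about HDFS health itself, so it complements the dfsadmin report rather than replacing it.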

