Oozie Forum

Oozie: Unknown hadoop job

  • #43222
    Jeremy Dyer

    I’m getting this exception while running an Oozie job that had been running fine for the past 2 days: JA017: Unknown hadoop job [job_1383844159459_0005] associated with action [0000001-131107120912274-oozie-oozi-W@***]. Failing this action!

    I assume the YARN container is being shut down before Oozie’s JavaActionExecutor.check method expects it to be. I know which machine the container ran on and checked its logs, but didn’t see anything interesting. They do show the container that ran the Java action being shut down, but there are no errors around those entries.

    I’m not seeing anything other than the above exception in the Oozie logs either.

  • #43639
    Jeremy Dyer

    Has anyone seen this before, or does anyone have suggestions about what could be causing it? Should I post this on the YARN forums instead?

    D Blair Elzinga

    Jeremy, evidently these Hortonworks community forums don’t get much attention from the experts. Did you ever solve your problem?

    I’m having a similar issue and it appears that the underlying issue is actually with the HistoryServer.

    Evidently Oozie is designed to ask the history server (rather than the ResourceManager) whether a job has completed. If the history server can’t load the job information, the result is various errors.

    Here is a link to a similar issue, in fact exactly the issue I’m having: http://mail-archives.apache.org/mod_mbox/oozie-user/201402.mbox/%3cCAAu13zFQWQuVo-ShJYVT-o+D=SC6-4Zhwn8UKPsB31wS62=0Jw@mail.gmail.com%3e
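One way to test the history server theory is to ask the JobHistory Server directly whether it knows the job that Oozie reports as unknown. The sketch below assumes the MapReduce History Server REST API is reachable on port 19888 (the port that appears in the log URL later in this thread). The host name in the usage note is a placeholder, and the function names are made up for illustration.

```python
import json
import urllib.error
import urllib.request


def history_job_url(history_host, job_id, port=19888):
    """Build the JobHistory Server REST URL for a single MapReduce job."""
    return "http://{}:{}/ws/v1/history/mapreduce/jobs/{}".format(
        history_host, port, job_id
    )


def job_state_from_response(body):
    """Extract the job state from the JSON body returned by the REST call."""
    return json.loads(body)["job"]["state"]


def fetch_job_state(history_host, job_id, port=19888, timeout=10):
    """Return the job state, or None if the history server answers 404."""
    url = history_job_url(history_host, job_id, port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return job_state_from_response(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # The history server has no record of this job.
            return None
        raise
```

For example, `fetch_job_state("historyserver.example.com", "job_1383844159459_0005")` returning `None` while the ResourceManager lists the job as finished would be consistent with the explanation above, though it would not prove it.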

    Mehant Baid

    I had the same issue. For some reason the Oozie launcher job was not showing up in my history server; only the actual MapReduce job submitted by the launcher was. Because Oozie uses the history server to keep track of jobs, it failed them even though they had completed successfully, since it couldn’t track the launcher.

    The way I got around the problem was setting the following property in conf/hadoop-conf/core-site.xml

    Once I set this, I started seeing my launcher job in the history server and everything was running smoothly.

    D Blair Elzinga

    Interesting. I already had the yarn.app.mapreduce.am.staging-dir property set, but in mapred-site.xml rather than core-site.xml. Presumably it works from either place?

    The Oozie launcher jobs were showing up in the history server, but evidently the job end notifications are not reaching Oozie, so Oozie doesn’t think the jobs have finished even though the ResourceManager shows them as completed successfully.
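On the question of where yarn.app.mapreduce.am.staging-dir is set, a small script can report which site files define a given property. This is only a sketch: the XML snippets, the `/user` value, and the `fs.defaultFS` entry are placeholders rather than values from this thread, and on a real cluster the text would be read from the actual files.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def files_defining(prop_name, site_files):
    """Map each file label to the value it sets for prop_name, skipping files that omit it."""
    found = {}
    for label, xml_text in site_files.items():
        props = read_properties(xml_text)
        if prop_name in props:
            found[label] = props[prop_name]
    return found


# Hypothetical file contents; "/user" is a placeholder, not a value from the thread.
MAPRED_SITE = """<configuration>
  <property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/user</value>
  </property>
</configuration>"""

CORE_SITE = """<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>"""

print(files_defining(
    "yarn.app.mapreduce.am.staging-dir",
    {"mapred-site.xml": MAPRED_SITE, "core-site.xml": CORE_SITE},
))
```

This only shows where a cluster sets the property. It does not settle whether every daemon involved reads it from both files.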

    D Blair Elzinga

    Here is what I see in the history server log for the job at <historyserver>:19888/jobhistory/logs/octopus.svs.usa.hp.com:45454/container_1394028045311_0030_01_000001/job_1394028045311_0030/mapred:
    2014-03-07 14:55:11,375 INFO [Thread-62] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Job end notification started for jobID : job_1394028045311_0030
    2014-03-07 14:55:11,377 INFO [Thread-62] org.mortbay.log: Job end notification attempts left 0
    2014-03-07 14:55:11,377 INFO [Thread-62] org.mortbay.log: Job end notification trying http://:/oozie/callback?id=0000019-140305061228920-oozie-oozi-W@EvaluateMessage2&status=SUCCEEDED&
    2014-03-07 14:55:11,386 WARN [Thread-62] org.mortbay.log: Job end notification to http://:/oozie/callback?id=0000019-140305061228920-oozie-oozi-W@EvaluateMessage2&status=SUCCEEDED& failed
    java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:529)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:158)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:411)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:525)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:208)
    at sun.net.www.http.HttpClient.New(HttpClient.java:291)
    at sun.net.www.http.HttpClient.New(HttpClient.java:310)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:987)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:966)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:841)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1195)
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
    at org.apache.hadoop.mapreduce.v2.app.JobEndNotifier.notifyURLOnce(JobEndNotifier.java:131)
    at org.apache.hadoop.mapreduce.v2.app.JobEndNotifier.notify(JobEndNotifier.java:180)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:566)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:599)
    2014-03-07 14:55:12,389 WARN [Thread-62] org.mortbay.log: Job end notification failed to notify : http://:/oozie/callback?id=0000019-140305061228920-oozie-oozi-W@EvaluateMessage2&status=SUCCEEDED&
    2014-03-07 14:55:17,392 INFO [Thread-62] org.apache.hadoop.ipc.Server: Stopping server on 65444

    D Blair Elzinga

    Notice that the job end notification address is http://:/ with both the host and the port empty.

    I’m wondering if this is a configuration issue in my cluster…
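The empty host and port can be spotted mechanically by pulling the callback URL out of the "Job end notification trying" log line and checking its parts. The sketch below uses only the Python standard library. The regular expression is an assumption based on the log format shown in this thread, and the function names are made up for illustration.

```python
import re
from urllib.parse import urlsplit

# Pattern assumed from the MRAppMaster log lines quoted in this thread.
NOTIFY_RE = re.compile(r"Job end notification trying (\S+)")


def callback_url_from_log(line):
    """Return the callback URL found in a log line, or None if the line has none."""
    match = NOTIFY_RE.search(line)
    return match.group(1) if match else None


def missing_parts(url):
    """List which of host and port are absent from the URL's network location."""
    parts = urlsplit(url)
    missing = []
    if not parts.hostname:
        missing.append("host")
    if parts.port is None:
        missing.append("port")
    return missing


LOG_LINE = (
    "2014-03-07 14:55:11,377 INFO [Thread-62] org.mortbay.log: "
    "Job end notification trying http://:/oozie/callback"
    "?id=0000019-140305061228920-oozie-oozi-W@EvaluateMessage2&status=SUCCEEDED&"
)

print(missing_parts(callback_url_from_log(LOG_LINE)))
```

A URL with neither host nor port suggests that whatever setting supplies the Oozie server address resolved to empty values when the callback URL was built. Which setting that is on a given cluster is not established in this thread.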
