Oozie: Unknown hadoop job

This topic contains 6 replies, has 3 voices, and was last updated by D Blair Elzinga 8 months, 2 weeks ago.

  • Creator
    Topic
  • #43222

    Jeremy Dyer
    Member

    I’m getting this exception while trying to run an Oozie job that has been running fine for the past 2 days: JA017: Unknown hadoop job [job_1383844159459_0005] associated with action [0000001-131107120912274-oozie-oozi-W@***]. Failing this action!

    I am assuming this has something to do with the YARN container being shut down before Oozie’s JavaActionExecutor.check method expects it to be. I know which machine the container was running on and checked its logs, but didn’t see anything interesting. They did show the container that the Java action was running in being shut down, but there were no errors around those entries.

    I’m not seeing anything aside from the above exception in the Oozie logs either.

Viewing 6 replies - 1 through 6 (of 6 total)


  • Author
    Replies
  • #49825

    D Blair Elzinga
    Participant

    Notice that the job end notification address in the job’s log is http://:/ (the host and port are both empty).

    I’m wondering if this is a configuration issue in my cluster…
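
    One possible explanation, offered as an assumption rather than a confirmed diagnosis: Oozie derives the callback address from oozie.base.url (oozie.service.CallbackService.base.url defaults to ${oozie.base.url}/callback), and stock configurations commonly define oozie.base.url as http://${oozie.http.hostname}:${oozie.http.port}/oozie. If those two variables resolve to empty strings when the server starts, the result would be http://:/oozie, which matches the address in the log. A minimal sketch of pinning the value in oozie-site.xml follows; the host name oozie-host.example.com is a placeholder, and 11000 is Oozie’s default HTTP port:

    ```xml
    <!-- oozie-site.xml: hypothetical host name, replace with the real Oozie server -->
    <property>
        <name>oozie.base.url</name>
        <value>http://oozie-host.example.com:11000/oozie</value>
    </property>
    ```

    The Oozie server has to be restarted for a change in oozie-site.xml to take effect.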

    #49824

    D Blair Elzinga
    Participant

    Here is what I see in the history server log for the job at <historyserver>:19888/jobhistory/logs/octopus.svs.usa.hp.com:45454/container_1394028045311_0030_01_000001/job_1394028045311_0030/mapred:
    2014-03-07 14:55:11,375 INFO [Thread-62] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Job end notification started for jobID : job_1394028045311_0030
    2014-03-07 14:55:11,377 INFO [Thread-62] org.mortbay.log: Job end notification attempts left 0
    2014-03-07 14:55:11,377 INFO [Thread-62] org.mortbay.log: Job end notification trying http://:/oozie/callback?id=0000019-140305061228920-oozie-oozi-W@EvaluateMessage2&status=SUCCEEDED&
    2014-03-07 14:55:11,386 WARN [Thread-62] org.mortbay.log: Job end notification to http://:/oozie/callback?id=0000019-140305061228920-oozie-oozi-W@EvaluateMessage2&status=SUCCEEDED& failed
    java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:529)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:158)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:411)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:525)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:208)
    at sun.net.www.http.HttpClient.New(HttpClient.java:291)
    at sun.net.www.http.HttpClient.New(HttpClient.java:310)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:987)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:966)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:841)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1195)
    at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:379)
    at org.apache.hadoop.mapreduce.v2.app.JobEndNotifier.notifyURLOnce(JobEndNotifier.java:131)
    at org.apache.hadoop.mapreduce.v2.app.JobEndNotifier.notify(JobEndNotifier.java:180)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:566)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:599)
    2014-03-07 14:55:12,389 WARN [Thread-62] org.mortbay.log: Job end notification failed to notify : http://:/oozie/callback?id=0000019-140305061228920-oozie-oozi-W@EvaluateMessage2&status=SUCCEEDED&
    2014-03-07 14:55:17,392 INFO [Thread-62] org.apache.hadoop.ipc.Server: Stopping server on 65444

    #49796

    D Blair Elzinga
    Participant

    Interesting. I already had the yarn.app.mapreduce.am.staging-dir property set, but in mapred-site.xml rather than core-site.xml. Evidently it works from either place?

    Oozie launcher jobs were showing up in the history server, but evidently the job end notifications are not getting to Oozie. As a result, Oozie doesn’t think that the jobs have finished, while the resource manager shows the jobs as completed successfully.

    #49601

    Mehant Baid
    Participant

    I had the same issue. For some reason the Oozie launcher job was not showing up in my history server; only the actual map-reduce job submitted by the launcher was showing up. Since Oozie uses the history server to keep track of jobs, it failed them because it couldn’t track them, even though they had actually completed successfully.

    The way I got around the problem was setting the following property in conf/hadoop-conf/core-site.xml:
    <property>
        <name>yarn.app.mapreduce.am.staging-dir</name>
        <value>${fs.defaultFS}/tmp/staging</value>
    </property>

    Once I set this, I started seeing my launcher job in the history server and everything was running smoothly.
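
    A possible reason this helps, stated as an assumption: in mapred-default.xml the directories the history server reads from are defined relative to the staging directory, so if the MapReduce application master and the history server resolve yarn.app.mapreduce.am.staging-dir differently, the history server looks in a place where no history files were written. The sketch below simply writes out those stock defaults to make the dependency visible; it is not a required setting:

    ```xml
    <!-- mapred-site.xml: stock default values, shown only to illustrate the dependency -->
    <property>
        <name>mapreduce.jobhistory.intermediate-done-dir</name>
        <value>${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.done-dir</name>
        <value>${yarn.app.mapreduce.am.staging-dir}/history/done</value>
    </property>
    ```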

    #49575

    D Blair Elzinga
    Participant

    Jeremy, evidently these Hortonworks community forums don’t get much reading from the experts. Did you ever solve your problem?

    I’m having a similar issue and it appears that the underlying issue is actually with the HistoryServer.

    Evidently the Oozie design has it looking to the history server (rather than the ResourceManager) to see if a job has been completed. If the history server can’t load the job information, then Oozie gives various errors.

    Here is a link to a similar issue, which is exactly the issue I’m having: http://mail-archives.apache.org/mod_mbox/oozie-user/201402.mbox/%3cCAAu13zFQWQuVo-ShJYVT-o+D=SC6-4Zhwn8UKPsB31wS62=0Jw@mail.gmail.com%3e

    #43639

    Jeremy Dyer
    Member

    Has anyone ever seen this before or have any suggestions about what could possibly be causing it? Do you think it is something I should post on the Yarn forums?
