

Oozie Forum

oozie configuration

  • #21764
    Evan Cutler

    greetings, I am using the sandbox.
    I am trying to start OOZIE jobs… but they are failing…
    I’m noticing the HDP configuration for OOZIE’s connection to the jobtracker as sandbox:50300…
    I am not thinking that’s correct.

    where do I go and reset it to 50030? Thanks.

  • Author
  • #21807
    Yi Zhang

    Hi Evan,

    Oozie communicates with jobtracker through ipc calls. The proper port is what is configured for mapred.job.tracker. If that is set to 50300, then it is the right port.

    50030 is the http port by default and is not used by oozie.

    Anything in oozie logs?
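
    A quick way to confirm the RPC port Yi describes is to check mapred-site.xml on the sandbox. A sketch of the relevant property (the file path and hostname here are typical for the sandbox, not confirmed in this thread):

    ```xml
    <!-- /etc/hadoop/conf/mapred-site.xml (typical location; may differ per install) -->
    <property>
      <name>mapred.job.tracker</name>
      <!-- Oozie's jobTracker value must match this host:port
           (the RPC port, not the 50030 web UI port) -->
      <value>sandbox:50300</value>
    </property>
    ```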


    Evan Cutler

    hi yi,
    thanks for responding….
    I was going by what oozie shows in its reports.

    The reports show:
    oozie.use.system.libpath true
    nameNode hdfs://sandbox:8020
    hue-id-w 23
    jobTracker sandbox:50300
    hdfs://sandbox:8020/user/hue/oozie/workspaces/_sandbox_-oozie-23

    I didn’t think that was correct because the jobtracker has been sandbox:50030. I don’t know where to confirm the 50300 entry. I’m good with it if I can get oozie to start.

    oozie won’t execute anything. pig, sqoop, nothing.

    for example, these sqoop commands work in a putty screen:
    sqoop eval --connect jdbc:oracle:thin:Earthquake/rogue2@// --query "truncate table quake_by_city"
    sqoop export --connect jdbc:oracle:thin:Earthquake/rogue2@// --table quake_by_city --export-dir /apps/hive/warehouse/quake_by_city --input-fields-terminated-by '|' --columns city,state,mag_avg,mag_cnt

    but for some reason they are not exporting in oozie. Am I doing this wrong or something?

    Larry Liu

    Hi, Evan

    What did you do to start oozie? Can you please provide more details?

    By the way, port 50300 is the job tracker RPC port. HDP uses this port for JobTracker.


    Evan Cutler

    I wish I could….
    please forgive…I am a complete noob here..

    ok, I downloaded the VM for VMWARE.
    downloaded the player.
    opened the web front end
    started hue.
    opened a putty screen.
    loaded a file into hdfs
    created a pig job to do something.
    took pig output and registered it in HCATALOG.
    performed select against it in HIVE.

    All good.
    now comes oozie.

    I created a table in a local oracle database. downloaded the Oracle JDBC thin driver jars and placed them in the sqoop lib.
    created sqoop string to load pig output into database. perfect.
    truncated table.
    moved the command string into oozie, into the block that says to paste the complete command.
    tried to execute. failed.
    looked at example sqoop command. removed word “sqoop” from command. left with:
    eval --connect jdbc:oracle:thin:Earthquake/rogue2@// --query "truncate table quake_by_city"

    if I add “sqoop” back and paste the line into putty, it works perfectly.
    pasted the line without “sqoop” into oozie and hit run. It fails with the error: Action failed, error message[Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]]

    looked at logs…map, setup and cleanup says succeeded. I don’t know where to go from there.
    I also tried pig jobs. same issue. Am I missing something?

    Where do I go to troubleshoot from there.

    Larry Liu

    Hi, Evan

    How did you run the sqoop command in oozie?

    Can you please provide all the information including, workflow.xml and script?


    Evan Cutler

    I clicked on oozie on the top menu….got Workflow Editor
    I looked at the Sqoop example.

    I clicked on CREATE.
    I clicked on +Sqoop.
    gave it a name: QBC_Truncate
    put in command section:
    eval --connect jdbc:oracle:thin:Earthquake/rogue2@// --query "truncate table quake_by_city"
    no parameters or any other values.
    Tried to run.
    Got error.
    What else do I do?

    Thanks very much.

    Evan Cutler

    oh, and I forgot to mention, I did download the Oracle JDBC thin client jar and put it into the sqoop libraries…it works that way in the command line.

    Thanks much.

    Evan Cutler

    and please forgive my spelling…new keyboard.

    Larry Liu

    Hi, Evan

    If you put sqoop in the command, it works. I am not sure if this is how it should work. Let me get back to you once I find the answer.


    Evan Cutler

    Thanks Larry, most appreciated.

    Yi Zhang

    Hi Evan,

    I see the same problem here. It looks like an oozie config issue. Will get back to you once I figure this out.


    Evan Cutler

    Thank you so much guys.
    I appreciate the time.
    I really want to get this to work.
    This is really cutting edge stuff.

    Thanks again,

    Yi Zhang

    Hi Evan,

    You need to put your JDBC jar file in oozie’s share lib /user/oozie/share/lib/sqoop/. You probably want it in the other sharelib directories for pig/hive as well.
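
    Assuming the stock sandbox layout, copying the driver into the Oozie sharelib on HDFS might look like this (the jar name ojdbc6.jar and the exact paths are illustrative, not from this thread):

    ```shell
    # Copy the Oracle JDBC driver into Oozie's sqoop sharelib on HDFS
    # (run as a user with write access to the sharelib, e.g. via sudo -u oozie)
    hadoop fs -put ojdbc6.jar /user/oozie/share/lib/sqoop/

    # Repeat for the pig/hive sharelibs if those action types need the driver too
    hadoop fs -put ojdbc6.jar /user/oozie/share/lib/pig/
    ```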


    Evan Cutler

    Hi Yi…
    I can’t find that directory….I have it right now in:

    I found these directories:

    I am placing it in those two spots…more to follow.

    Evan Cutler

    Sorry Yi…
    same result.
    Am I missing something else?

    Evan Cutler

    ok Yi, figured out you meant to go on the folder in HDFS.

    Got them in there….jobs are still failing….
    Still looking.

    Yi Zhang

    Hi Evan,

    Yes, I meant user oozie’s home directory in HDFS. Make sure the jar there has the same permissions as the other jars.
    The job tracker’s task logs would be helpful.

    http://sandbox-ip-addr:50030/ is the jobtracker web UI.
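
    Checking and aligning the jar’s permissions, as Yi suggests, might look like this (jar name and owner/group values are illustrative; compare against what the neighboring jars actually show):

    ```shell
    # List the sharelib and compare owner/group/permissions of the new jar
    # against the jars that were already there
    hadoop fs -ls /user/oozie/share/lib/sqoop/

    # If the new jar differs, align it, e.g.:
    hadoop fs -chmod 644 /user/oozie/share/lib/sqoop/ojdbc6.jar
    hadoop fs -chown oozie:hadoop /user/oozie/share/lib/sqoop/ojdbc6.jar
    ```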


    Evan Cutler

    I got the job to work.
    I broke the command into the args section as parts, and the first sqoop ran perfect.
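
    For reference, splitting the command into args as Evan describes corresponds to a Sqoop action roughly like the following in the generated workflow.xml — one <arg> element per token instead of a single <command> string, which avoids the shell-style tokenization that breaks on the quoted query (a sketch; the exact XML Hue emits may differ):

    ```xml
    <action name="QBC_Truncate">
      <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- One <arg> per token; the query stays a single argument -->
        <arg>eval</arg>
        <arg>--connect</arg>
        <arg>jdbc:oracle:thin:Earthquake/rogue2@//</arg>
        <arg>--query</arg>
        <arg>truncate table quake_by_city</arg>
      </sqoop>
      <ok to="end"/>
      <error to="kill"/>
    </action>
    ```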

    Now I have another problem.
    As soon as I add a second sqoop, I get a
    “IndexError at /oozie/edit_workflow/” list index out of range error.
    I don’t know why…
    I’ve added, deleted, copied, erased…tried everything to get more than one sqoop on this going.
    Sometimes, the first sqoop gives me this error…
    as soon as that happens the whole workflow disappears and I have to start over.

    Almost there.
    thanks so much.

    Seth Lyubich

    Hi Evan,

    Can you please provide more details on what happened with the second issue? Can you please share configuration details?


    Evan Cutler

    When you work with the sandbox, you have about 8 example oozie workflows.
    The configuration is as the sandbox comes when you download it…I have not changed it.

    Here are the steps I take:
    1. I hit create
    2. I hit add step->sqoop
    3. I give sqoop a name, and enter the commands in the args positions
    4. I hit save. The screen comes back.
    5. I repeat step 2 and 3.
    6. I hit save, and I get the error. Then I notice the whole workflow disappears, along with the parent folder.
    7. I have to delete workflow and start again.

    I have also tried manually creating the XML in full, but there is no import option for creating workflows. Or at least none that I’ve seen.


    Hi Evan,

    Thanks for continuing to work with the Sandbox.

    Have you tried to make sure this is an oozie problem and not a sqoop problem? Do workflows created for other types of jobs work?


