Oozie Forum

Oozie noob question for pig jobs

  • #19309
    Evan Cutler

    I am doing the trainings and have successfully executed some Pig jobs.
    I wrote the code and clicked “Save”, which put the jobs in the left column.

    Now I want those jobs in Oozie.
    I kinda understand what Oozie does, but I don’t know how to add those saved jobs into the workflow.

    What do I put in the “Script Name” field to refer to my saved Pig jobs?
    Where are they saved?

    When I run the jobs (or at least do what I think runs them), I get: E0701: XML schema error, cvc-elt.1.a: Cannot find the declaration of element ‘workflow-app’.

    OK, I’d like to RTFM, but I need to know where to find the manual for this step in the training.
    Any help would be greatly appreciated.



  • #19418

    Hi Evan,

    Are you “doing the trainings” in the Hortonworks Sandbox?


    Evan Cutler

    I’m doing it in the sandbox. I have Hue running in the sandbox and am developing Pig jobs there. I hit Save on the Pig developer screen, and they show up as saved jobs on the left.

    How do I take those jobs and send them to Oozie? Does that even work?

    Or do I have to export all of them to Pig files, set classpaths and all that jazz, and then put them into Oozie?



    Hi Evan,

    I’m trying to track down where these scripts are stored and will get back to you once I find out. I’m also looking into whether this is even possible.
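
Regarding the E0701 error quoted in the first post: one commonly reported cause of "Cannot find the declaration of element ‘workflow-app’" is a workflow.xml whose root element has no Oozie `xmlns` declaration, or declares a schema version the server does not accept. The sketch below is only an illustration under that assumption. The namespace version `uri:oozie:workflow:0.2`, the workflow and node names, and the script name `example.pig` are all hypothetical, and the thread does not establish where Hue stores saved Pig scripts. The helper `root_namespace` simply reports which namespace, if any, the root element carries.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal workflow with a single Pig action.
# Assumptions: schema version 0.2 is accepted by the Oozie server in use,
# and example.pig sits next to workflow.xml in the application directory.
WORKFLOW_XML = """<workflow-app xmlns="uri:oozie:workflow:0.2" name="pig-example-wf">
    <start to="pig-node"/>
    <action name="pig-node">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>example.pig</script>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Pig action failed</message>
    </kill>
    <end name="end"/>
</workflow-app>
"""


def root_namespace(xml_text):
    """Return (namespace, local_name) of the root element.

    namespace is None when the root element declares no default namespace,
    which is the situation assumed here to trigger the schema error.
    """
    root = ET.fromstring(xml_text)
    if root.tag.startswith("{"):
        namespace, _, local_name = root.tag[1:].partition("}")
        return namespace, local_name
    return None, root.tag


if __name__ == "__main__":
    print(root_namespace(WORKFLOW_XML))
    # A root element without xmlns, for comparison.
    print(root_namespace('<workflow-app name="no-namespace"/>'))
```

If the first value printed for your own workflow.xml is `None`, adding the `xmlns` attribute to `<workflow-app>` would be the first thing to try. Which schema versions are accepted depends on the Oozie release in the sandbox.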

