

Oozie Forum

Oozie noob question for pig jobs

  • #19309
    Evan Cutler

    I am doing the trainings, and have successfully executed some Pig jobs.
    I wrote the code and clicked “Save”, which put the jobs in the left column.

    Now I want those jobs in Oozie.
    I kinda understand what Oozie does, but I don’t know how to add those saved jobs into the workflow.

    What do I put in the “Script Name” field to refer to my saved Pig jobs?
    Where are they saved?

    When I run the jobs (or what I think is running them), I get: E0701: XML schema error, cvc-elt.1.a: Cannot find the declaration of element ‘workflow-app’.

    OK, I’d like to RTFM, but I need to know where I can find it for this step in the training.
    Any help would be greatly appreciated.
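[Editor's note: the E0701 / cvc-elt.1.a error typically means the root element of the workflow XML is missing the Oozie namespace declaration, so the schema validator cannot match it. A minimal workflow definition with a single Pig action might look like the sketch below; the workflow name, script name, and node names are illustrative placeholders, not values from the training.]

```xml
<!-- workflow.xml: minimal sketch of an Oozie workflow with one Pig action.
     The xmlns attribute on <workflow-app> is exactly what E0701 complains
     about when it is missing. All names and the script path are examples. -->
<workflow-app name="pig-demo-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="pig-node"/>
    <action name="pig-node">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- Path is resolved relative to the workflow application dir -->
            <script>myscript.pig</script>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Pig action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```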


  • Author
  • #19418

    Hi Evan,

    Are you “doing the trainings” in the Hortonworks Sandbox?


    Evan Cutler

    I’m doing it in the sandbox. I have Hue running from the sandbox, and I’m developing Pig jobs there. I hit “Save” on the Pig editor screen, and they show up as saved jobs on the left.

    How do I take those jobs and send them to Oozie? Does that even work?

    Or do I have to export all of them out to .pig files, set class-paths and all that jazz, and then put them into Oozie?



    Hi Evan,

    I’m trying to track down where these scripts are stored and will get back to you once I find out. I’m also looking into whether this is even possible.
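[Editor's note: in case it helps while the script location is being tracked down: once a Pig script is exported to a .pig file and uploaded to an HDFS directory alongside a workflow.xml, a job.properties file like the sketch below is the usual way to point Oozie at it. Every host name and path here is an illustrative assumption; adjust them to match your own sandbox.]

```properties
# job.properties: illustrative values only; adjust the host names and
# paths to match your environment.
nameNode=hdfs://sandbox.hortonworks.com:8020
jobTracker=sandbox.hortonworks.com:8050
# Pick up the shared Pig libraries from the Oozie system libpath
oozie.use.system.libpath=true
# HDFS directory containing workflow.xml and the exported .pig script
oozie.wf.application.path=${nameNode}/user/evan/pig-demo
```

The job is then typically submitted from the command line with something like `oozie job -oozie http://sandbox.hortonworks.com:11000/oozie -config job.properties -run` (the Oozie server URL is again an assumption for the sandbox).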


