Oozie Forum

Oozie sqoop workflow action supported?

  • #7962
    Ben Flint

    Am I correct in saying that Oozie’s sqoop workflow action is not supported with the Hadoop, Oozie, and Sqoop versions that ship with HDP 1.0?

  • Author
  • #7975

    Yes, you are correct: the sqoop action is not part of Oozie 3.1.3, which ships with HDP 1.0. We plan to provide Oozie 3.2, which includes the sqoop action, as part of HDP 1.1.

    If you need it earlier, I can help you patch sqoop action on top of oozie 3.1.3.

    Sorry for the delay in my response.


    Ben Flint

    Thanks, Venkatesh. What would be involved in patching oozie to include the sqoop action?


    Hi, Ben,

    The Sqoop action was added to Oozie in OOZIE-156.
    We could take the patch from OOZIE-156, apply it on top of Oozie 3.1.3, and build it.

    A few questions:

    * What is your use case?
    * What is your timeline?

    Background: We did not include it because it had not been tested at scale. We also provide an ETL tool, powered by Talend, that lets you design this in a visual environment and schedule it to run on a Hadoop cluster using Oozie.


    Ben Flint


    My use case is, I think, not different from anyone else who uses sqoop or oozie. I would like an easy way to be able to extract data, perhaps incrementally, from some legacy dbs and load it into hive tables. This is, of course, just the first step in a periodic job, so I would like to be able to manage the job’s workflow using oozie.

    I took Talend for a test-drive a few weeks ago, and while it looks pretty good, I don’t really want to introduce a tool that provides a bunch of features I do not currently need. Also, I got some errors trying to start Talend yesterday and I don’t really have the time to fight through them. I am working on a pretty short timeline.



    Hi, Ben,

    I understand. How do you want to proceed? Do you want to patch Oozie yourself, or do you need a patched Oozie tarball? Let me know if you need any help.

    Clarification on Talend: You use the tool once to design the job. It generates the code and deploys it to Oozie behind the scenes, and Oozie then runs the job on the cluster.


    Ben Flint

    While I could take the time to patch it myself, I would appreciate it if you could save me some time and effort by supplying a tarball.

    Ben Flint

    Is there any reason I can’t just upgrade oozie to version 3.2? It claims backward-compatibility.


    You should be able to use 3.2. It is backward compatible.

    Also, another quick hack would be to write a java action and invoke sqoop using the Java API.

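    // configuration is the job's Hadoop Configuration.
    SqoopTool sqoopTool = SqoopTool.getTool("import");
    SqoopOptions sqoopOptions = new SqoopOptions(configuration);
    Sqoop sqoop = new Sqoop(sqoopTool, configuration, sqoopOptions);
    // Assumption: run the tool via Sqoop.runSqoop, passing the sqoop command-line
    // options in args (a String[]); it returns the exit code.
    int result = Sqoop.runSqoop(sqoop, args);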
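
    For the incremental-import use case described earlier in the thread, the options handed to Sqoop from such a java action might be assembled roughly as follows. This is only a sketch: the class name SqoopImportArgs, the build method, and the example connection string, table, column, and directory are all hypothetical, and it assumes the options are passed to a call such as Sqoop.runSqoop(sqoop, args) with the "import" tool already selected, so the tool name itself is not part of the array. Loading the result into Hive is left out here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Sqoop or Oozie): builds the option array
// for an incremental "append" import into an HDFS directory.
final class SqoopImportArgs {

    private SqoopImportArgs() {
    }

    static String[] build(String jdbcUrl, String table, String checkColumn,
                          String lastValue, String targetDir) {
        List<String> args = new ArrayList<>();
        // Source database and table.
        args.add("--connect");
        args.add(jdbcUrl);
        args.add("--table");
        args.add(table);
        // Where the imported rows land in HDFS.
        args.add("--target-dir");
        args.add(targetDir);
        // Only fetch rows whose check column is greater than the last value seen.
        args.add("--incremental");
        args.add("append");
        args.add("--check-column");
        args.add(checkColumn);
        args.add("--last-value");
        args.add(lastValue);
        return args.toArray(new String[0]);
    }
}
```

    A java action's main class could then call, for example, SqoopImportArgs.build("jdbc:mysql://db.example.com/legacy", "orders", "id", "1000", "/data/orders") and pass the result on to Sqoop. The last value would have to be tracked between runs by the workflow itself.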
