Oozie sqoop workflow action supported?

This topic contains 8 replies, has 2 voices, and was last updated by  Venkatesh Seetharam 1 year, 11 months ago.

  • Creator
    Topic
  • #7962

    Ben Flint
    Member

    Am I correct in saying that oozie’s sqoop workflow action is not supported with the hadoop, oozie, and sqoop versions that ship with HDP 1.0?

Viewing 8 replies - 1 through 8 (of 8 total)

  • Author
    Replies
  • #8075

    You should be able to use Oozie 3.2. It is backward compatible.

    Another quick hack would be to write a Java action and invoke Sqoop through its Java API:

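    // Fragment only: configuration, connectionUrl, driver, userName, password,
    // tableName, targetDir, numMappers and args must be defined by the caller.
    // The classes come from the Sqoop client jar.

    // Look up the built-in "import" tool by name.
    SqoopTool sqoopTool = SqoopTool.getTool("import");

    // Describe the import: source database, table, and HDFS target directory.
    SqoopOptions sqoopOptions = new SqoopOptions(configuration);
    sqoopOptions.setActiveSqoopTool(sqoopTool);
    sqoopOptions.setConnectString(connectionUrl);
    sqoopOptions.setDriverClassName(driver);
    sqoopOptions.setUsername(userName);
    sqoopOptions.setPassword(password);
    sqoopOptions.setTableName(tableName);
    sqoopOptions.setTargetDir(targetDir);
    sqoopOptions.setNumMappers(numMappers);
    sqoopOptions.setFileLayout(SqoopOptions.FileLayout.TextFile);

    // Run the import; a non-zero result indicates failure.
    Sqoop sqoop = new Sqoop(sqoopTool, configuration, sqoopOptions);
    int result = sqoop.run(args);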
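
A Java action's main class could also describe the same import as command-line style arguments instead of setter calls. The sketch below is only an illustration: `SqoopImportArgs` and `build` are hypothetical names, not part of Sqoop or Oozie, and the connection values in `main` are placeholders. The flags are the standard `sqoop import` options that correspond to the setters above. The password is deliberately left out, so supply it however your setup requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: assembles the argument list for a "sqoop import" run. */
final class SqoopImportArgs {

    private SqoopImportArgs() { }

    /** Builds the arguments; the tool name ("import") comes first. */
    static String[] build(String connectionUrl, String driver, String userName,
                          String tableName, String targetDir, int numMappers) {
        if (numMappers < 1) {
            throw new IllegalArgumentException("numMappers must be at least 1");
        }
        List<String> args = new ArrayList<>();
        args.add("import");
        args.addAll(Arrays.asList("--connect", connectionUrl));
        args.addAll(Arrays.asList("--driver", driver));
        args.addAll(Arrays.asList("--username", userName));
        args.addAll(Arrays.asList("--table", tableName));
        args.addAll(Arrays.asList("--target-dir", targetDir));
        args.addAll(Arrays.asList("--num-mappers", Integer.toString(numMappers)));
        args.add("--as-textfile"); // plain-text output, like FileLayout.TextFile
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        // Placeholder values, for illustration only.
        String[] args = build("jdbc:mysql://db.example.com/sales",
                "com.mysql.jdbc.Driver", "etl", "orders", "/user/etl/orders", 4);
        System.out.println(String.join(" ", args));
    }
}
```

An array like this could then be handed to Sqoop's command-line entry point from the action's `main` method. I believe `Sqoop.runTool` accepts arguments in this form, but check that against the Sqoop version on your cluster.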

    #8004

    Ben Flint
    Member

    Is there any reason I can’t just upgrade oozie to version 3.2? It claims backward-compatibility.

    #7996

    Ben Flint
    Member

    While I could take the time to patch it myself, I would appreciate it if you could save me some time and effort by supplying a tarball.

    #7995

    Hi, Ben,

    I understand. How do you want to proceed? Do you want to patch Oozie yourself, or do you need a patched Oozie tarball? Let me know if you need any help.

    Clarification on Talend: you use the tool once to design the job. It generates code and deploys it to Oozie behind the scenes, and Oozie runs the job on the cluster.

    Thanks,
    Venkatesh

    #7991

    Ben Flint
    Member

    Venkatesh,

    My use case is, I think, no different from that of anyone else who uses sqoop or oozie. I would like an easy way to extract data, perhaps incrementally, from some legacy databases and load it into hive tables. This is, of course, just the first step in a periodic job, so I would like to be able to manage the job’s workflow using oozie.

    I took Talend for a test-drive a few weeks ago, and while it looks pretty good, I don’t really want to introduce a tool that provides a bunch of features I do not currently need. Also, I got some errors trying to start Talend yesterday and I don’t really have the time to fight through them. I am working on a pretty short timeline.

    -Ben

    #7986

    Hi, Ben,

    The Sqoop action was added to Oozie in: https://issues.apache.org/jira/browse/OOZIE-156
    We could take the patch from OOZIE-156, apply it to Oozie 3.1.3, and build it.

    A few questions:

    * What is your use case?
    * What is your timeline?

    Background: We did not include it because it had not been tested at scale, and we also provide an ETL tool, powered by Talend, that lets you design this in a visual environment and schedule it to run on a Hadoop cluster using Oozie.

    Thanks,
    Venkatesh

    #7985

    Ben Flint
    Member

    Thanks, Venkatesh. What would be involved in patching oozie to include the sqoop action?

    #7975

    Yes, you are correct: the Sqoop action is not part of Oozie 3.1.3, which is the version shipped with HDP 1.0. We plan to provide Oozie 3.2, which includes the Sqoop action, as part of HDP 1.1.

    If you need it earlier, I can help you patch the Sqoop action on top of Oozie 3.1.3.

    Sorry for the delay in my response.

    Thanks,
    Venkatesh
