
Pig Forum

Sqoop using shell is failing

  • #28103

    Hi There,

    I have a pig script which includes an “sh” command which executes sqoop. The pig script works fine when executed using “pig -f script.pig” but
    fails when executed as part of an Oozie workflow.
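
    For reference, the sh step amounts to a Sqoop invocation along these lines (connection details below are placeholders; the real values are obfuscated in the logs further down):

    sqoop import \
        --connect jdbc:mysql://dbhost/dbname \
        --driver com.mysql.jdbc.Driver \
        --username dbuser --password '***' \
        --table "events" --columns "id" \
        --as-textfile --fields-terminated-by , --enclosed-by "\"" \
        --compress \
        --target-dir /user/tmp/sqoop-test.gz \
        --input-null-string "null" --input-null-non-string "" \
        --verbose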

    Versions of each component:

    Hadoop 1.2.0.1.3.0.0-107 (as part of HDP 1.2)
    Apache Pig version 0.11.1.1.3.0.0-107 (rexported)
    Sqoop 1.4.3.1.3.0.0-107
    Oozie client build version: 3.3.2.1.3.0.0-107

    We’ve been struggling with this issue and have tried a few things (mostly regarding the classpaths of sqoop/pig/oozie), but none of them really helped.

    We also tried to play with log4j properties with no luck.

    The following exceptions are logged in the Oozie Pig launcher MapReduce job.

    The following exception is thrown by Sqoop (sensitive data is obfuscated, of course):
    Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class org.apache.log4j.LogManager
    at org.apache.log4j.Logger.getLogger(Logger.java:105)
    at org.apache.sqoop.util.LoggingUtils.setDebugLevel(LoggingUtils.java:50)
    at org.apache.sqoop.tool.BaseSqoopTool.applyCommonOptions(BaseSqoopTool.java:739)
    at org.apache.sqoop.tool.ImportTool.applyOptions(ImportTool.java:722)
    at org.apache.sqoop.tool.SqoopTool.parseArguments(SqoopTool.java:433)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:129)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

    Thanks in advance.

  • #28104

    And the following exception is thrown by Pig:

    java.io.IOException: java.io.IOException: sh command 'sqoop import --connect jdbc:mysql://* --driver com.mysql.jdbc.Driver --username * --password * --as-textfile --fields-terminated-by , --enclosed-by "\"" --compress --table "events" --target-dir /user/tmp/sqoop-test.gz --input-null-string "null" --input-null-non-string "" --verbose --columns "id"' failed. Please check output logs for details
    at org.apache.pig.tools.grunt.GruntParser.processShCommand(GruntParser.java:1092)
    at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:175)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
    at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
    at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
    at org.apache.pig.Main.run(Main.java:475)
    at org.apache.pig.PigRunner.run(PigRunner.java:49)
    at org.apache.oozie.action.hadoop.PigMain.runPigJob(PigMain.java:283)
    at org.apache.oozie.action.hadoop.PigMain.run(PigMain.java:223)
    at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:37)
    at org.apache.oozie.action.hadoop.PigMain.main(PigMain.java:76)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:491)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:365)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
    Caused by: java.io.IOException: sh command 'sqoop import --connect jdbc:mysql://* --driver com.mysql.jdbc.Driver --username * --password * --as-textfile --fields-terminated-by , --enclosed-by "\"" --compress --table "events" --target-dir /user/tmp/sqoop-test.gz --input-null-string "null" --input-null-non-string "" --verbose --columns "id"' failed. Please check output logs for details
    at org.apache.pig.tools.grunt.GruntParser.processShCommand(GruntParser.java:1088)
    … 23 more
    ================================================================================
    Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.PigMain], exit code [2]

    #28206
    Sasha J
    Moderator

    Check your classpath settings, or pass the correct classpath through the command line.
    This error:
    Exception in thread "main" java.lang.NoClassDefFoundError: Could not initialize class org.apache.log4j.LogManager
    means that the process cannot find the log4j classes, most likely because log4j.jar is not on the classpath.
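
    A quick way to check what actually ends up on the classpath on the node where the action runs (the Sqoop path below is the usual HDP 1.x location; adjust if yours differs):

    # list the jars the hadoop CLI would see, then check Sqoop's own lib dir
    hadoop classpath | tr ':' '\n' | grep -i log4j
    ls /usr/lib/sqoop/lib | grep -i log4j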

    Thank you!
    Sasha

    #28445

    Hi Sasha,

    The classpath settings include the log4j classes, but the issue still occurs, even when passing the classpath information by hand.

    Any other idea?

    Thanks!

    #28553

    Hi Seth,

    Thanks.

    Yes, we set the property you mentioned as part of the Oozie workflow.

    Any more ideas?

    #28556
    Seth Lyubich
    Moderator

    Hi Moty,

    Can you please check whether your Pig and MapReduce example jobs work correctly? Some information can be found here:

    http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.3.0/bk_dataintegration/content/ch_using-oozie.html
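
    The bundled Oozie examples are submitted with the Oozie CLI, something like this (the Oozie URL, example path, and job id below are placeholders for your cluster):

    # submit the Pig example, then check its status
    oozie job -oozie http://oozie-host:11000/oozie -config examples/apps/pig/job.properties -run
    oozie job -oozie http://oozie-host:11000/oozie -info <job-id>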

    Please let me know if this is helpful.

    Thanks,
    Seth

    #28597

    Hi Seth,

    Thanks!

    Yes, Pig and MapReduce work seamlessly when no Sqoop calls are included in the Pig script. That said, there are a few other shell commands in the Pig script that do work, for example: "hadoop fs -rmr /path/to/delete"

    Any more ideas?

    #28701
    tedr
    Moderator

    Hi Moty,

    Try this: copy the log4j jar file from /usr/lib/hadoop/lib to /usr/lib/sqoop/lib and then rerun the workflow.
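
    Something along these lines on the nodes that run the launcher task (the exact log4j jar name under /usr/lib/hadoop/lib may differ on your install):

    # make the log4j jar visible to the Sqoop CLI as well
    cp /usr/lib/hadoop/lib/log4j-*.jar /usr/lib/sqoop/lib/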

    Thanks,
    Ted.

    #28739

    Guys,

    We found the issue!

    It seems to be a bug in the Hadoop MapReduce version we use in HDP:
    https://issues.apache.org/jira/browse/MAPREDUCE-3112

    Since HDP 1.2 doesn’t include this fix, we had to change hadoop-env.sh ourselves, along the lines of the patch.
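
    For anyone hitting the same thing: the change boils down to editing hadoop-env.sh on the cluster nodes so it stops appending to option variables inherited from the environment of the task that shells out to the CLI. This is an illustrative sketch only, not the exact patch (the variable shown is just an example; see the JIRA for the real lines):

    # Before (stock hadoop-env.sh appends to whatever the task environment passed in):
    #   export HADOOP_CLIENT_OPTS="-Xmx128m $HADOOP_CLIENT_OPTS"
    # After (set it explicitly so a CLI spawned from a task starts clean):
    export HADOOP_CLIENT_OPTS="-Xmx128m"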

    Thanks all for your help!

    #28874
    tedr
    Moderator

    Hi Moty,

    Thanks for letting us know that you found the solution.

    Thanks,
    Ted.

