Pig Forum

jython "processing new jar" each time

  • #24680
    Vladislav Pernin

    When using a Jython script, the Jython interpreter repeats the “processing new jar” step on every run.
    The result of this step should be cached.


  • #24681
    Larry Liu

    Hi, Vladislav,

    Can you please provide more detail?



    Vladislav Pernin


    Imagine a simple job using a simple Jython script, myfuncs.py:
    def concat(str):
        return str+str

    It is registered in the Pig script with:
    Register 'myfuncs.py' using jython as myfuncs;

    Here is an extract of the trace:
    2013-05-13 11:01:11,755 [main] INFO org.apache.pig.scripting.jython.JythonScriptEngine - created tmp python.cachedir=/tmp/pig_jython_391280432392436532
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/jersey-json-1.8.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/jsp-api-2.1.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/jersey-core-1.8.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/jaxb-api-2.2.2.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/avro-1.5.3.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/jackson-mapper-asl-1.8.8.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/log4j-1.2.17.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/guava-11.0.2.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/jettison-1.1.jar'

    *sys-package-mgr*: processing new jar, '/usr/java/jdk1.7.0_17/jre/lib/ext/sunec.jar'
    *sys-package-mgr*: processing new jar, '/usr/java/jdk1.7.0_17/jre/lib/ext/dnsns.jar'
    2013-05-13 11:03:48,291 [main] WARN org.apache.pig.scripting.jython.JythonScriptEngine - pig.cmd.args.remainders is empty. This is not expected unless on testing.
    2013-05-13 11:03:48,664 [main] INFO org.apache.pig.scripting.jython.JythonScriptEngine - Register scripting UDF: myfuncs.concat
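
To put a number on the cost, the first timestamp in the trace above (cache directory created) can be subtracted from the next timestamped line after the jar processing. This is only a small sketch in plain Python; the helper name elapsed_seconds is illustrative and not part of Pig or Jython.

```python
from datetime import datetime

# Timestamp layout used by the log lines in the trace: "YYYY-MM-DD HH:MM:SS,mmm"
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start, end, fmt=LOG_TIME_FORMAT):
    """Return the number of seconds between two log timestamps."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


# Timestamps copied from the trace: cache dir created, then the first
# log line emitted once the jars have been processed.
start = "2013-05-13 11:01:11,755"
end = "2013-05-13 11:03:48,291"

print(elapsed_seconds(start, end))
```

For this trace the gap is a little over two and a half minutes, spent before the UDF is even registered.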


    Hi Vladislav,

    I can see no errors in what you have posted. Are you saying that the script doesn’t work, or something else?


    Vladislav Pernin


    I never said there were errors in the trace, just that the jars are processed each time Pig is started.
    Jython has a caching mechanism that could be used here.
    It could make Pig scripts that use functions requiring jar processing much faster.
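
The trace shows a freshly created python.cachedir under /tmp on startup, which would explain why nothing is reused between runs. A minimal sketch of the idea, in plain Python: pick a stable directory instead of a new temporary one. The function names and the directory name pig_jython_cache are hypothetical, and this is not how Pig itself is implemented.

```python
import os
import tempfile


def fresh_cachedir():
    """A new, empty directory on every call: nothing cached earlier is found."""
    return tempfile.mkdtemp(prefix="pig_jython_")


def stable_cachedir(base=None, name="pig_jython_cache"):
    """The same directory on every call, created on first use.

    'name' is a hypothetical directory name chosen for this sketch.
    """
    base = base or tempfile.gettempdir()
    path = os.path.join(base, name)
    if not os.path.isdir(path):
        os.makedirs(path)
    return path


if __name__ == "__main__":
    print(fresh_cachedir())   # differs between runs
    print(stable_cachedir())  # identical between runs
```

The stable path would then be handed to Jython as its python.cachedir setting. Whether Pig honors an externally supplied value rather than creating its own temporary directory is an assumption that would need checking against the Pig version in use.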


    Hi Vladislav,

    OK, so you are not asking for help with using it, but pointing out something we could implement to make Pig scripts run more efficiently, i.e. a feature request/improvement. After making sure that it has not already been requested, I’ll add this to the database.

