Pig Forum

jython "processing new jar" each time

  • #24680
    Vladislav Pernin
    Participant

    When a Pig script registers a Jython script, the Jython interpreter goes through “processing new jar” for the jars on the classpath each time Pig starts.
    The result should be cached.

  • #24681
    Larry Liu
    Moderator

    Hi, Vladislav,

    Can you please provide more detail?

    Thanks

    Larry

    #25333
    Vladislav Pernin
    Participant

    Hi,

    Imagine a simple job using a simple Jython script.
    myfuncs.py:
    @outputSchema("word:chararray")
    def concat(s):
        # Return the input string concatenated with itself.
        return s + s

    In the Pig script:
    Register 'myfuncs.py' using jython as myfuncs;

    Here is an extract of the trace:
    2013-05-13 11:01:11,755 [main] INFO org.apache.pig.scripting.jython.JythonScriptEngine - created tmp python.cachedir=/tmp/pig_jython_391280432392436532
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/jersey-json-1.8.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/jsp-api-2.1.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/jersey-core-1.8.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/jaxb-api-2.2.2.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/avro-1.5.3.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/jackson-mapper-asl-1.8.8.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/log4j-1.2.17.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/guava-11.0.2.jar'
    *sys-package-mgr*: processing new jar, '/usr/lib/hadoop/lib/jettison-1.1.jar'

    *sys-package-mgr*: processing new jar, '/usr/java/jdk1.7.0_17/jre/lib/ext/sunec.jar'
    *sys-package-mgr*: processing new jar, '/usr/java/jdk1.7.0_17/jre/lib/ext/dnsns.jar'
    2013-05-13 11:03:48,291 [main] WARN org.apache.pig.scripting.jython.JythonScriptEngine - pig.cmd.args.remainders is empty. This is not expected unless on testing.
    2013-05-13 11:03:48,664 [main] INFO org.apache.pig.scripting.jython.JythonScriptEngine - Register scripting UDF: myfuncs.concat
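
To put a number on the cost, the two timestamps that bracket the jar scan in the trace can be subtracted. The following is a minimal Python sketch; the helper name `elapsed_seconds` is made up for illustration, and it assumes that most of the gap between the two log lines is spent in `*sys-package-mgr*`, which the trace suggests but does not prove.

```python
from datetime import datetime

# Timestamp layout of the Pig log lines in the trace (comma before milliseconds).
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start, end):
    """Return the number of seconds between two Pig log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# First timestamp: tmp cachedir created. Second: first log line after the jar scan.
print(elapsed_seconds("2013-05-13 11:01:11,755", "2013-05-13 11:03:48,291"))
```

For this trace the gap is about 156.5 seconds, i.e. roughly two and a half minutes before the UDF is registered.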

    #25420
    tedr
    Member

    Hi Vladislav,

    I can see no errors in what you have posted. Are you saying that the script doesn’t work?

    Thanks,
    Ted.

    #25421
    Vladislav Pernin
    Participant

    Hi,

    I never said there were errors in the trace, only that the jars are processed each time Pig is started.
    Jython has a caching mechanism that could be used.
    It could make Pig scripts that use functions requiring jar processing start much faster.
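
The idea can be sketched in plain Python. This is only a conceptual model, not Jython's or Pig's actual code: the names `JarIndexCache`, `scan_jar` and `demo`, and the JSON file layout, are invented here. It shows why a cache directory that survives between runs avoids rescanning, while a fresh temporary directory per run (compare the `created tmp python.cachedir=` line in the trace) can never produce a hit.

```python
import json
import os
import tempfile
import zipfile


def scan_jar(jar_path):
    """Expensive step: list the packages that contain .class files in a jar."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted({name.rpartition("/")[0].replace("/", ".")
                       for name in jar.namelist() if name.endswith(".class")})


class JarIndexCache(object):
    """Hypothetical on-disk cache keyed by jar path and modification time."""

    def __init__(self, cache_dir):
        self.index_file = os.path.join(cache_dir, "jar_index.json")
        self.entries = {}
        self.scans = 0  # how many times the expensive scan actually ran
        if os.path.exists(self.index_file):
            with open(self.index_file) as f:
                self.entries = json.load(f)

    def packages(self, jar_path):
        mtime = os.path.getmtime(jar_path)
        entry = self.entries.get(jar_path)
        if entry is None or entry["mtime"] != mtime:
            # Cache miss: this corresponds to the "processing new jar" case.
            self.scans += 1
            entry = {"mtime": mtime, "packages": scan_jar(jar_path)}
            self.entries[jar_path] = entry
            with open(self.index_file, "w") as f:
                json.dump(self.entries, f)
        return entry["packages"]


def demo():
    """Build a tiny jar, then compare a reused cache dir with a fresh one."""
    work = tempfile.mkdtemp()
    jar_path = os.path.join(work, "demo.jar")
    with zipfile.ZipFile(jar_path, "w") as jar:
        jar.writestr("com/example/Foo.class", b"")

    stable_dir = tempfile.mkdtemp()            # kept between "runs"
    first = JarIndexCache(stable_dir)
    first.packages(jar_path)                   # miss: the jar is scanned
    second = JarIndexCache(stable_dir)
    second.packages(jar_path)                  # hit: index is read from disk
    fresh = JarIndexCache(tempfile.mkdtemp())  # new dir per run
    fresh.packages(jar_path)                   # always a miss
    return first.scans, second.scans, fresh.scans


print(demo())
```

The three counters come out as 1, 0 and 1: only the second run against the same directory skips the scan.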

    #25506
    tedr
    Member

    Hi Vladislav,

    OK, so you are not asking for help with using it, but pointing out something we could implement to make Pig scripts run more efficiently, i.e. a feature request/improvement. After making sure that it has not already been requested, I’ll add this to the database.

    Thanks,
    Ted.
