Hortonworks Sandbox Forum

How to run pig script through command line

  • #40401
    Cheng Chen

    I am trying to run Pig scripts in
    Hortonworks Sandbox 1.3 (Red Hat, 64-bit) under VirtualBox 4.2.16,
    which I installed on my Windows 7 (64-bit) machine.
    I have finished the tutorials that come with the HDP.
    Now I want to run scripts from the command line (via PuTTY), but I cannot do so. I have the following questions:
    1. Where are the Pig scripts saved on the guest machine?
    2. I created a Pig script under “/root/” and typed “pig -x local wordcount.pig” on the command line (again, in PuTTY), but the script does not run.

    Below is the log output (truncated):

    2013-10-14 14:22:35,382 [main] INFO org.apache.pig.Main – Apache Pig version (rexported) compiled May 20 2013, 18:14:30
    2013-10-14 14:22:35,384 [main] INFO org.apache.pig.Main – Logging error messages to: /root/pig-wordcount/pig_1381785755380.log
    2013-10-14 14:22:35,731 [main] INFO org.apache.pig.impl.util.Utils – Default bootup file /root/.pigbootup not found
    2013-10-14 14:22:35,837 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: file:///
    2013-10-14 14:22:36,422 [main] INFO org.apache.pig.tools.pigstats.ScriptState – Pig features used in the script: GROUP_BY
    2013-10-14 14:22:36,559 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler – File concatenation threshold: 100 optimistic? false
    2013-10-14 14:22:36,580 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer – Choosing to move algebraic foreach to combiner
    2013-10-14 14:22:36,606 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer – MR plan size before optimization: 1
    2013-10-14 14:22:36,606 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer – MR plan size after optimization: 1
    2013-10-14 14:22:36,651 [main] INFO org.apache.pig.tools.pigstats.ScriptState – Pig script settings are added to the job
    2013-10-14 14:22:36,672 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
    2013-10-14 14:22:36,675 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
    2013-10-14 14:22:36,677 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator – BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=93
    2013-10-14 14:22:36,677 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Setting Parallelism to 1
    2013-10-14 14:22:36,709 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler – Setting up single store job
    2013-10-14 14:22:36,722 [main] INFO org.apache.pig.data.SchemaTupleFrontend – Key [pig.schematuple] is false, will not g
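    The original wordcount.pig is not shown in the thread, but a minimal word-count script of the kind being run might look like the following sketch (the input file name `words.txt` and the output directory are assumptions):

    ```pig
    -- Load a local text file; each line becomes one chararray field
    lines = LOAD 'words.txt' AS (line:chararray);

    -- Split each line into words, emitting one word per record
    words = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;

    -- Group identical words and count each group
    grouped = GROUP words BY word;
    counts  = FOREACH grouped GENERATE group AS word, COUNT(words) AS cnt;

    -- Write the results to a local output directory
    STORE counts INTO 'wordcount_out';
    ```

    Note that in local mode (`pig -x local`), `LOAD` and `STORE` paths refer to the sandbox's local Linux filesystem rather than HDFS — which matches the “Connecting to hadoop file system at: file:///” line in the log above.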


  • Author
  • #40887


    You can just run:

    pig /location/of/script/scriptname.pig
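    For the scenario in the original question, the execution mode matters; a sketch of the two invocations (the script path is a placeholder):

    ```shell
    # MapReduce mode (the default): paths in the script refer to HDFS
    pig /location/of/script/scriptname.pig

    # Local mode: paths refer to the local filesystem of the sandbox
    pig -x local /location/of/script/scriptname.pig
    ```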



    Sasi Kumar

    I configured Hive on CentOS 6.4. How do I resolve the following?

    13/11/08 12:34:37 INFO pig.Main: Logging error messages to: /home/sasikumar/pig_1383894277625.log
    2013-11-08 12:34:37,806 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine – Connecting to hadoop file system at: file:///

