Elephant-bird to analyse Tweets

This topic contains 2 replies, has 2 voices, and was last updated by  Rodulfo 3 months, 2 weeks ago.

  • Creator
  • #45901


    Hello, I wanted to use Twitter's Elephant-bird to analyze tweets in their original JSON format, without having to convert them to another format such as CSV.

    I have built Elephant-bird and I wrote the following simple code to load tweets from a file, following some examples I saw:

    REGISTER /user/rmrodriguez/jar/json-simple-1.1.jar;
    REGISTER /user/rmrodriguez/jar/elephant-bird-pig-4.4.jar;
    REGISTER /user/rmrodriguez/jar/elephant-bird-core-4.4.jar;
    REGISTER /user/rmrodriguez/jar/google-collections-1.0.jar;

    A = LOAD 'tweets.20131201-215958.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad');

    tweets = FOREACH A GENERATE (CHARARRAY)$0#'id' AS id;

    DUMP tweets;

    and I get the following error:

    2013-12-19 07:55:55,364 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
    2013-12-19 07:55:55,367 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
    2013-12-19 07:55:55,370 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. com/twitter/elephantbird/util/HadoopCompat
    Details at logfile: /hadoop/yarn/local/usercache/rmrodriguez/appcache/application_1387366430472_0012/container_1387366430472_0012_01_000002/pig_1387457753323.log

    Does anyone have experience with Elephant-bird and know the cause of this error, or can anyone suggest another way to load tweets in JSON format?


Viewing 2 replies - 1 through 2 (of 2 total)


  • Author
  • #46254


    Thank you Ramanan,

    That was it. I just needed to register the HadoopCompat.jar. Now it works!


    Check the logfile 'pig_1387457753323.log'. The error names the class com/twitter/elephantbird/util/HadoopCompat, so you may need to register the HadoopCompat jar in your Pig script.
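
For anyone hitting the same ERROR 2998, here is a rough sketch of what the fixed script might look like. It is the script from the original post with one extra REGISTER line. The jar name `elephant-bird-hadoop-compat-4.4.jar` and its location are assumptions (the thread only says "HadoopCompat.jar"), so adjust both to match whatever your Elephant-bird build actually produced.

```pig
-- Jars from the original post
REGISTER /user/rmrodriguez/jar/json-simple-1.1.jar;
REGISTER /user/rmrodriguez/jar/elephant-bird-pig-4.4.jar;
REGISTER /user/rmrodriguez/jar/elephant-bird-core-4.4.jar;
REGISTER /user/rmrodriguez/jar/google-collections-1.0.jar;

-- Added line: the jar that provides com.twitter.elephantbird.util.HadoopCompat.
-- ASSUMPTION: file name and path are illustrative, not taken from the thread.
REGISTER /user/rmrodriguez/jar/elephant-bird-hadoop-compat-4.4.jar;

-- Load each tweet as a map, keeping nested JSON objects as nested maps
A = LOAD 'tweets.20131201-215958.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad');

-- Pull the 'id' field out of the map in the first column
tweets = FOREACH A GENERATE (CHARARRAY)$0#'id' AS id;

DUMP tweets;
```

The version of the compat jar should presumably match the other Elephant-bird jars (4.4 here), since they are built together.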
