HiveColumnarLoader not loading int columns

This topic contains 1 reply, has 2 voices, and was last updated by  Thejas Nair 4 months, 2 weeks ago.

  • Creator
    Topic
  • #52648

    Greg Smorag
    Participant

    Hi,
    I have a problem with some odd behaviour of HiveColumnarLoader, and I wonder if anybody can help me with it.

    The issue is that when I load data from my Hive table via Pig, using a script that looks more or less like the code below, Pig never loads the int columns. Everything completes successfully, but the output of dumping Z has null values in place of the ints. When I change ‘dump Z’ to something else, e.g. a MongoDB store, it just saves nulls in place of the int values.


    register /opt/hadoop/pig/contrib/piggybank/java/piggybank.jar;
    register /opt/hadoop/hive/lib/hive-common-0.12.0.jar;
    register /opt/hadoop/hive/lib/hive-exec-0.12.0.jar;
    A = load '<VALID_HIVE_PATH>/<HIVE_TABLE>' USING org.apache.pig.piggybank.storage.HiveColumnarLoader('column_a int,column_b int,column_c string,column_d string,column_e int');
    L = foreach A generate *;
    Z = filter L by fk_client == '<PARTITION VALUE>';
    dump Z;

    HIVE_TABLE is a partitioned Hive table stored as RCFile. The partition column is fk_client.

    I am using Hadoop 2.3, Hive 0.12.0 and Pig 0.12.1.

    Can you please help me and point out what the reason might be, or what I am possibly doing wrong? I am running out of options.

    Thank you very much.

Viewing 1 replies (of 1 total)

  • Author
    Replies
  • #52742

    Thejas Nair
    Participant

    I doubt that HiveColumnarLoader is widely used. You might want to try HCatalog’s HCatLoader with Pig instead.
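
    As a rough sketch of what that could look like for the script in the question (not tested against this setup): HCatLoader reads the column names and types from the Hive metastore, so no schema string is passed, and partition columns such as fk_client show up as ordinary fields. The 'default' database name is an assumption, and the class name shown is the org.apache.hive.hcatalog one; depending on the Hive release, the older org.apache.hcatalog.pig.HCatLoader name may be the one that applies.

    -- Assumption: Pig is started with "pig -useHCatalog" so the HCatalog jars are on the classpath.
    -- Assumption: the table lives in the 'default' database.
    A = LOAD 'default.<HIVE_TABLE>' USING org.apache.hive.hcatalog.pig.HCatLoader();
    -- Filter on the partition column directly after the load.
    Z = FILTER A BY fk_client == '<PARTITION VALUE>';
    DUMP Z;

    Keeping the partition filter as the first statement after the LOAD gives HCatLoader the chance to read only the matching partition instead of scanning the whole table.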
