DBStorage Incompatibility with other storage in script

This topic contains 5 replies, has 3 voices, and was last updated by  tedr 9 months, 3 weeks ago.

  • Creator
    Topic
  • #26333

    Hardik Shah
    Member

    DBStorage does not work when the Pig script contains other storage statements. In other words, DBStorage fails when the script has multiple STORE statements.

    What I tried:
    1) Storing one output using DBStorage and storing the same or a different output to the file system with a plain STORE.
    2) Storing with DBStorage and with my custom store function.

    In both cases the data is not stored in the database. If I comment out the other STORE statement, DBStorage works properly.

    It does not even throw an exception or error on the reducer's machine.

    Can anyone point out the problem?

    DBStorage does not work alongside a plain STORE to the file system. It only works if DBStorage is the only STORE statement in the script.

    pv_by_industry = GROUP profile_view BY viewee_industry_id;

    pv_avg_by_industry = FOREACH pv_by_industry GENERATE
    group AS viewee_industry_id, AVG(profile_view) AS average_pv;

    STORE pv_avg_by_industry INTO '/tmp/hardik';

    STORE pv_avg_by_industry INTO '/tmp/hardik/db' INTO
    DBStorage('com.mysql.jdbc.Driver',
    'jdbc:mysql://hostname/dbname', 'user',
    'pass',
    'INSERT INTO table (viewee_industry_id,average_pv) VALUES(?,?)');

Viewing 5 replies - 1 through 5 (of 5 total)

  • Author
    Replies
  • #28705

    tedr
    Moderator

    Hi

    Are there any Pig devs watching who can help this guy out?

    Thanks,
    Ted.

    #26438

    Hardik Shah
    Member

    Actually, my production code is too large, so I posted just a sample to explain the situation.

    #26437

    Hardik Shah
    Member

    Sorry, it should be like this:
    STORE pv_avg_by_industry INTO '/tmp/hardik/db' USING
    org.apache.pig.piggybank.storage.DBStorage('com.mysql.jdbc.Driver',
    'jdbc:mysql://hostname/dbname', 'user',
    'pass',
    'INSERT INTO table (viewee_industry_id,average_pv) VALUES(?,?)');

    #26434

    Larry Liu
    Moderator

    Hi, Hardik

    Does this command work for you?

    STORE pv_avg_by_industry INTO
    DBStorage('com.mysql.jdbc.Driver',
    'jdbc:mysql://hostname/dbname', 'user',
    'pass',
    'INSERT INTO table (viewee_industry_id,average_pv) VALUES(?,?)');

    I am not sure whether STORE supports multiple INTO clauses, so I removed the "into '/tmp/hardik/db'" part from your command.

    Larry

    #26334

    Hardik Shah
    Member

    Hi again,

    A few things came to light while I was debugging this.

    DBStorage sets auto-commit to false, so when the batch is executed it is not committed automatically.

    After the batch is executed, the commitTask method of the OutputCommitter in DBStorage (a method of an inner class) is called, and that is where the commit is written:

    if (ps != null) {
        try {
            System.out.println("Executing Batch in commitTask");
            ps.executeBatch();
            con.commit();
            ps.close();
            con.close();
            ps = null;
            con = null;
        } catch (SQLException e) {
            System.out.println("Exception in commitTask");
            log.error("ps.close", e);
            throw new IOException("JDBC Error", e);
        }
    }

    This method is called by PigOutputCommitter:

    public void commitTask(TaskAttemptContext context) throws IOException {
        if (HadoopShims.isMap(context.getTaskAttemptID())) {
            for (Pair<OutputCommitter, POStore> mapCommitter :
                    mapOutputCommitters) {
                if (mapCommitter.first != null) {
                    TaskAttemptContext updatedContext = setUpContext(context,
                            mapCommitter.second);
                    mapCommitter.first.commitTask(updatedContext);
                }
            }
        } else {
            for (Pair<OutputCommitter, POStore> reduceCommitter :
                    reduceOutputCommitters) {
                if (reduceCommitter.first != null) {
                    TaskAttemptContext updatedContext = setUpContext(context,
                            reduceCommitter.second);
                    reduceCommitter.first.commitTask(updatedContext);
                }
            }
        }
    }

    But when this commitTask is called, the connection and PreparedStatement objects are already null, so nothing is committed and the data never reaches the database.
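    To show why this fails silently, here is a minimal sketch in plain Java. It is not the DBStorage code: FakeBatch and GuardedWriter are made-up stand-ins for the JDBC connection/statement pair and for the store function. It also assumes, without confirmation, that commitTask ends up running on an instance that never opened the connection. The only point is that a null guard like the one above skips the commit without any exception or log line.

```java
import java.util.ArrayList;
import java.util.List;

public class CommitSketch {

    /** Hypothetical stand-in for a JDBC connection plus prepared statement. */
    static class FakeBatch {
        final List<String> pending = new ArrayList<>();
        final List<String> committed = new ArrayList<>();

        void add(String row) {
            pending.add(row);
        }

        void commit() {
            committed.addAll(pending);
            pending.clear();
        }
    }

    /** Hypothetical writer that mimics the "if (ps != null)" guard in commitTask. */
    static class GuardedWriter {
        FakeBatch batch; // stays null if this instance never opened a connection

        void open(FakeBatch b) {
            batch = b;
        }

        /** Returns true if a commit happened, false if it was silently skipped. */
        boolean commitTask() {
            if (batch == null) {
                return false; // nothing thrown, nothing logged
            }
            batch.commit();
            batch = null;
            return true;
        }
    }

    public static void main(String[] args) {
        FakeBatch db = new FakeBatch();

        // The instance that wrote the rows.
        GuardedWriter writer = new GuardedWriter();
        writer.open(db);
        db.add("1,42.0");

        // A different instance whose fields were never set (assumed scenario).
        GuardedWriter other = new GuardedWriter();

        System.out.println("other committed: " + other.commitTask());
        System.out.println("rows in db: " + db.committed.size());
        System.out.println("writer committed: " + writer.commitTask());
        System.out.println("rows in db: " + db.committed.size());
    }
}
```

    If that assumption holds, logging or throwing in the null branch would at least make the skipped commit visible on the task side.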

    But if the script contains only the DBStorage store, with no other STORE statement, it works properly.

    Any clues?
