Home Forums Sqoop sqoop import error

Tagged: 

This topic contains 7 replies, has 2 voices, and was last updated by  tedr 1 year, 7 months ago.

  • Creator
    Topic
  • #13872

    Trang Nguyen
    Member

    When I try to run sqoop with a query clause, I always get the error:
    At minimum, you must specify –connect and –table

    sqoop import –connect jdbc:mysql://trangnlx.sea.openmarket.com:3306/wmpfe –verbose –table tMessageAchive -e “SELECT MessageId, ParentId, CarrierId, OPTIONS, TypeId, TicketId FROM tMessageArchive where TsUpdated > ’2012-11-01′ and TsUpdated < '2012-11-02'" –username trangn -P –outdir sms_message_mysql_20121101 –hive-import –hive-table sms_message_20121102

    I've tried ommited and adding the "–table" option but it doesn't seem to make a difference. Anything that I'm doing wrong?

    Thanks,
    Trang

Viewing 7 replies - 1 through 7 (of 7 total)

The topic ‘sqoop import error’ is closed to new replies.

  • Author
    Replies
  • #14500

    tedr
    Member

    Hi Trang,

    Here’s a couple of things I noticed when reading the Sqoop documentation about the freeform queries (using the –query):
    1) When importing a free-form query, you must specify a destination directory with –target-dir
    2) You must also select a splitting column with –split-by
    Also these docs mention that if you want to run this type of query in parallel (as in using more than one mapper) you must include the token $CONDITIONS (these conditions are inferred by sqoop and if you double quoter the query, as you did, you must escape the token like this \$CONDITIONS.

    So from this it looks like when using the –query format these specific parameters are required by the sqoop language definition. It’s not an issue, it is just the way sqoop was written.

    Thanks,
    Ted.

    Collapse
    #13933

    Trang Nguyen
    Member

    On the failing import, sqoop is rejecting the syntax and asking for additional paratemeters (target-dir, split-by,…) which was not required in the successful import.

    Thanks,
    Trang

    Collapse
    #13932

    Trang Nguyen
    Member

    Failing import:
    sqoop import –connect “jdbc:mysql://tdc-prd-rdb-044:3306/wmp44″ –verbose –query “select MessageId, ParentId, CarrierId, Options, TypeId, TicketId, SubscriberId, InterfaceId, AppTypeId, StateId, ReasonId, SourceTon, SourceAddr, DestTon, DestAddr, UdhId, NotifyId, NotifyStateId, NotifyReasonId, DcsId, Data, BillingId, BillingPrice, ChargeType, ChargePrice, UserAgent, UserIpAddr, TsSubmitted, TsScheduled, TsExpire, TsTransferred, TsDelivered, TsUpdated, ReferenceId, ProgramId, TsDbUpdatedUtc, message_purpose_id, ad_requirement_id, advice_of_charge_message, sms_state_id, sms_error_code, initiate_billing, tmobile_content_delivery_type_id, note, retry_of_ticket_id, carrier_response_code, currency_code, destination_address_country_id from tMessageArchive where TsDbUpdatedUtc >= ’2012-12-01 00:00:00′ and TsDbUpdatedUtc < '2012-12-02 00:00:00'" –username trangn -P –outdir outdir/sms_message_test –hive-import –hive-table sms_message_mysql_test

    Collapse
    #13931

    Trang Nguyen
    Member

    Somehow the posting gets clobbered when it’s stored. I’ll try to repost the successful import statement as two separate postings.

    Sucessfull:
    sqoop import –connect “jdbc:mysql://$DB_HOST:3306/$SERVER_ID” –verbose –table tMessageArchive –columns “MessageId, ParentId, CarrierId, Options, TypeId, TicketId, SubscriberId, InterfaceId, AppTypeId, StateId, ReasonId, SourceTon, SourceAddr, DestTon, DestAddr, UdhId, NotifyId, NotifyStateId, NotifyReasonId, DcsId, Data, BillingId, BillingPrice, ChargeType, ChargePrice, UserAgent, UserIpAddr, TsSubmitted, TsScheduled, TsExpire, TsTransferred, TsDelivered, TsUpdated, ReferenceId, ProgramId, TsDbUpdatedUtc, message_purpose_id, ad_requirement_id, advice_of_charge_message, sms_state_id, sms_error_code, initiate_billing, tmobile_content_delivery_type_id, note, retry_of_ticket_id, carrier_response_code, currency_code, destination_address_country_id” –where “TsDbUpdatedUtc >= ’2012-12-${from} 00:00:00′ and TsDbUpdatedUtc < '2012-12-${to} 00:00:00'" –username trangn -P –outdir outdir/sms_message_${SERVER_ID}_2012_12_${from}_${to} –hive-import –hive-table sms_message_mysql_2012_12_${from}_${to}

    Collapse
    #13930

    Trang Nguyen
    Member

    Somehow the blog posting got clobbered.
    This query works:
    sqoop import –connect “jdbc:mysql://$DB_HOST:3306/$SERVER_ID” –verbose –table tMessageArchive –columns “MessageId, ParentId, CarrierId, Options, TypeId, TicketId, SubscriberId, InterfaceId, AppTypeId, StateId, ReasonId, SourceTon, SourceAddr, DestTon, DestAddr, UdhId, NotifyId, NotifyStateId, NotifyReasonId, DcsId, Data, BillingId, BillingPrice, ChargeType, ChargePrice, UserAgent, UserIpAddr, TsSubmitted, TsScheduled, TsExpire, TsTransferred, TsDelivered, TsUpdated, ReferenceId, ProgramId, TsDbUpdatedUtc, message_purpose_id, ad_requirement_id, advice_of_charge_message, sms_state_id, sms_error_code, initiate_billing, tmobile_content_delivery_type_id, note, retry_of_ticket_id, carrier_response_code, currency_code, destination_address_country_id” –where “TsDbUpdatedUtc >= ’2012-12-${from} 00:00:00′ and TsDbUpdatedUtc = ’2012-12-01 00:00:00′ and TsDbUpdatedUtc < '2012-12-02 00:00:00'" –username trangn -P –outdir outdir/sms_message_test –hive-import –hive-table sms_message_mysql_test –target-dir test

    Not sure why sqoop is not accepting the –query syntax and asking for additional parameters…

    Thanks,
    Trang

    Collapse
    #13929

    Trang Nguyen
    Member

    Hi Ted,

    I was able to run the sqoop import if I structure the command as:
    sqoop import –connect “jdbc:mysql://$DB_HOST:3306/$SERVER_ID” –verbose –table tMessageArchive –columns “MessageId, ParentId, CarrierId, Options, TypeId, TicketId, SubscriberId, InterfaceId, AppTypeId, StateId, ReasonId, SourceTon, SourceAddr, DestTon, DestAddr, UdhId, NotifyId, NotifyStateId, NotifyReasonId, DcsId, Data, BillingId, BillingPrice, ChargeType, ChargePrice, UserAgent, UserIpAddr, TsSubmitted, TsScheduled, TsExpire, TsTransferred, TsDelivered, TsUpdated, ReferenceId, ProgramId, TsDbUpdatedUtc, message_purpose_id, ad_requirement_id, advice_of_charge_message, sms_state_id, sms_error_code, initiate_billing, tmobile_content_delivery_type_id, note, retry_of_ticket_id, carrier_response_code, currency_code, destination_address_country_id” –where “TsDbUpdatedUtc >= ’2012-12-${from} 00:00:00′ and TsDbUpdatedUtc = ’2012-12-01 00:00:00′ and TsDbUpdatedUtc < '2012-12-02 00:00:00'" –username trangn -P –outdir outdir/sms_message_test –hive-import –hive-table sms_message_mysql_test
    13/01/21 12:34:48 DEBUG tool.BaseSqoopTool: Enabled debug logging.
    Enter password:
    13/01/21 12:34:51 DEBUG sqoop.Sqoop: Must specify destination with –target-dir.
    Try –help for usage instructions.
    Must specify destination with –target-dir.

    If I specify the –target-dir (why is it needed?), I get a different error:
    When importing query results in parallel, you must specify –split-by.

    The original query did not require either one of these parameters so I'm not sure why it sqoop is asking for them.

    Thanks,
    Trang

    Collapse
    #13916

    tedr
    Member

    Hi Trang,

    Thanks for using HDP. We are looking into this issue, to help us better understand what is going on could you post the relevant bits of the hive and sqoop logs?

    Thanks,
    Ted.

    Collapse
Viewing 7 replies - 1 through 7 (of 7 total)