Hive string cast exception with Avro

This topic contains 4 replies, has 4 voices, and was last updated by  Larry Liu 12 months ago.

  • Creator
    Topic
  • #18432

    Our engineering team is currently hitting an issue using Avro in our Hive installation and is seeing an exception similar to the following when running a fairly simple query:

    java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.avro.util.Utf8

    The table definition in Hive is the following:

    hive> describe daily_12;
    OK
    uid string from deserializer
    userdatabydataprovider map<string,struct<dataproviderid:int,userid:string,info:map,events:map<string,array<struct<timestamp:bigint,attributes:map>>>,lastaccesstime:bigint>> from deserializer
    audiencedata array<struct> from deserializer

    And the query having the issue is similar to the following:

    select t.uid,t.userdatabydataprovider['1'].userid,t.userdatabydataprovider['1'].info,t.userdatabydataprovider['1'].events,audiencedata from daily_12 t limit 5;

    Supposedly, the same code/query works in a different HDP cluster (I need to confirm this), and I’m trying to determine whether it is a cluster/Hive issue. We are currently running Hive 0.10.0.21 in our cluster while the other cluster is running Hive 0.10.0.22… I’m looking for other significant differences.

    I’m not a true Java/Avro guy (know enough to be dangerous) but what I’ve found thus far is other people have had issues with strings and Avro expecting Utf8. I plan to sit with the engineers to look at the Avro schema(s) being used. I haven’t found anything specific to Hive but have seen some discussions related to Pig where people have hit a similar error using “avro.java.string” instead of “avro.util.Utf8” for the String property in schema fields.
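When looking through the Avro schema(s), one quick check is whether any node carries the "avro.java.string" property. The following is a minimal sketch, assuming the schema is available as .avsc JSON; the function name `find_java_string_props` and the example schema are hypothetical and only loosely modelled on the table above, not taken from the real schema.

```python
import json

JAVA_STRING_PROP = "avro.java.string"


def find_java_string_props(schema, path="$"):
    """Return (path, value) pairs for every schema node carrying avro.java.string."""
    hits = []
    if isinstance(schema, list):
        # A union is written as a JSON array of branch schemas.
        for i, branch in enumerate(schema):
            hits += find_java_string_props(branch, f"{path}[{i}]")
    elif isinstance(schema, dict):
        if JAVA_STRING_PROP in schema:
            hits.append((path, schema[JAVA_STRING_PROP]))
        # Nested schemas can sit under "type", "items" (arrays) or "values" (maps).
        for key in ("type", "items", "values"):
            if key in schema and not isinstance(schema[key], str):
                hits += find_java_string_props(schema[key], f"{path}.{key}")
        # Record fields are objects with their own "name" and "type".
        for field in schema.get("fields", []):
            hits += find_java_string_props(field, f"{path}.{field['name']}")
    return hits


# Hypothetical schema, for illustration only.
example = json.loads("""
{
  "type": "record", "name": "Daily",
  "fields": [
    {"name": "uid", "type": {"type": "string", "avro.java.string": "String"}},
    {"name": "lastaccesstime", "type": "long"},
    {"name": "info", "type": {"type": "map", "values": "string", "avro.java.string": "String"}}
  ]
}
""")

hits = find_java_string_props(example)
for path, value in hits:
    print(path, value)
```

An empty result would suggest the property is not the cause and that the difference between the two clusters lies elsewhere.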

    Any other tips/advice for troubleshooting this issue would be greatly appreciated.

Viewing 4 replies - 1 through 4 (of 4 total)


  • Author
    Replies
  • #23172

    Larry Liu
    Moderator

    Another quick question, Bobby, what version of HDP are you using?

    #23058

    Larry Liu
    Moderator

    Hi, Bobby,

    Can you please give more background on how you are using Avro? How did you install Avro, and what steps did you follow?

    I am reading the book Hadoop: The Definitive Guide. There is a section that covers Avro. I hope it is helpful.

    Thanks
    Larry

    #23045

    Niels Basjes
    Member

    I’m running into similar problems with Pig.
    What I’ve found so far is that when writing an Avro file using Java you can specify the class that is to be used for the string type as an argument for the Avro compiler:
    String

    As far as I can tell this option causes the actual Avro file to be different, and a file written with “String” (instead of the default Utf8) is not fully supported by Pig. I would not be surprised if Hive has similar problems.
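To illustrate the difference: with the "String" option, Java-generated schemas annotate string and map types with an "avro.java.string" property, and that annotated schema is what ends up in the file header. The sketch below produces a copy of a schema with the annotation removed. The function name `strip_java_string` and the sample schema are hypothetical, and whether handing such a cleaned schema to Pig or Hive avoids the exception is an assumption this thread does not confirm.

```python
import json

JAVA_STRING_PROP = "avro.java.string"


def strip_java_string(schema):
    """Return a copy of a parsed Avro schema without avro.java.string annotations."""
    if isinstance(schema, list):
        # Unions and field lists: clean each element.
        return [strip_java_string(item) for item in schema]
    if isinstance(schema, dict):
        cleaned = {}
        for key, value in schema.items():
            if key == JAVA_STRING_PROP:
                continue  # drop the annotation itself
            # Default values are data, not schemas, so they are left untouched.
            cleaned[key] = value if key == "default" else strip_java_string(value)
        # {"type": "string"} with nothing else is just the primitive "string".
        if cleaned == {"type": "string"}:
            return "string"
        return cleaned
    return schema


# Hypothetical schema of the kind generated with the "String" option.
before = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "uid", "type": {"type": "string", "avro.java.string": "String"}},
        {"name": "info", "type": {
            "type": "map",
            "values": {"type": "string", "avro.java.string": "String"},
            "avro.java.string": "String",
        }},
    ],
}

after = strip_java_string(before)
print(json.dumps(after, indent=2))
```

The input is not modified, so the annotated and cleaned versions can be compared side by side.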

    #18476

    Robert
    Participant

    Hi Bobby,
    Maybe you can try fetching just one row to see whether the problem can be isolated to the data. If one row consistently works fine, then a specific row or rows may be causing the issue.

    Regards,
    Robert
