Hive / HCatalog Forum

Hive JSON Data

  • #44405
    Anupam Gupta

    Hi all,
    I loaded JSON data into a Hive table using json-serde-1.1.4 from rcongiu/Hive-JSON-Serde.
    The data is in the following format:
    {
      "DocId": "ABC",
      "User": {
        "Id": 1234,
        "Username": "sam1234",
        "Name": "Sam",
        "ShippingAddress": {
          "Address1": "123 Main St.",
          "Address2": null,
          "City": "Durham",
          "State": "NC"
        },
        "Orders": [
          {
            "ItemId": 6789,
            "OrderDate": "11/11/2012"
          },
          {
            "ItemId": 4352,
            "OrderDate": "12/12/2012"
          }
        ]
      }
    }

    When I run the following query:
    SELECT DocId, User.Id, User.ShippingAddress.City as city,
    > User.Orders[0].ItemId as order0id,
    > User.Orders[1].ItemId as order1id
    > FROM complex_json;

    I get this exception:
    java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {
    at org.apache.hadoop.mapred.MapTask.runOldMapper(
    at org.apache.hadoop.mapred.Child$
    at Method)
    at org.apache.hadoop.mapred.Child.main(
    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable {
    at org.apache.hadoop.hive.ql.exec.MapOperator.process( 43)
    … 8 more
    Caused by: org.apache.hadoop.hive.serde2.SerDeException: Row is not a valid JSON Object - JSONException: A JSONObject text must end with '}' at 2 [character 3 line 1]
    at )
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(
    … 9 more

    But when I put the data on a single line, like
    {"DocId":"ABC","User":{"Id":1234,"Username":"sam1234","Name":"Sam","ShippingAddress":{"Address1":"123 Main St.","Address2":"","City":"Durham","State":"NC"},"Orders":[{"ItemId":6789,"OrderDate":"11/11/2012"},{"ItemId":4352,"OrderDate":"12/12/2012"}]}}

    and then run the SELECT query above, it gives me the following result…

    Total MapReduce CPU Time Spent: 1 seconds 30 msec
    Time taken: 32.255 seconds, Fetched: 1 row(s)
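
    The difference between the two cases suggests that each line of the file is read as a separate row, so a pretty-printed document is seen as several broken rows (the first being just "{"), which would match the "character 3 line 1" in the exception. As a possible workaround, here is a minimal Python sketch that rewrites pretty-printed JSON documents as one compact object per line before loading. The function name flatten_json_documents and the sample text are hypothetical, and it assumes the whole input fits in memory.

```python
import json


def flatten_json_documents(text):
    """Return one compact JSON string per top-level document found in text."""
    decoder = json.JSONDecoder()
    lines = []
    pos = 0
    while pos < len(text):
        # Skip whitespace (including newlines) between documents.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode parses one complete JSON value and reports where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        # Compact separators keep the whole document on a single line.
        lines.append(json.dumps(obj, separators=(",", ":")))
    return lines


if __name__ == "__main__":
    # Hypothetical sample: a shortened version of the multi-line document.
    sample = """
    {
      "DocId": "ABC",
      "User": {
        "Id": 1234,
        "Orders": [
          {"ItemId": 6789, "OrderDate": "11/11/2012"},
          {"ItemId": 4352, "OrderDate": "12/12/2012"}
        ]
      }
    }
    """
    for line in flatten_json_documents(sample):
        print(line)
```

    Each printed line is a complete JSON object, which is the same shape as the single-line data that worked.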

    I followed the following link for this example..

    Kindly Help…

