WebHCatalog table creation and JSON


This topic contains 2 replies, has 2 voices, and was last updated by  John Brinnand 1 year, 10 months ago.

  • Creator
    Topic
  • #30635

    John Brinnand
    Participant

    Hi Folks,

    I have WebHCat 0.11.0 and have written a Java class that uses HCatalog's DDL API. Basically I took Alan's example code from GitHub and built on it. It works great, but I noticed that when I create a text table I cannot specify (or haven't found how to specify) a JsonSerDe row format. I want to do the equivalent of what can be done on the command line:
    hive> use myalphadb; create table test_jsontable (id string, name string) ROW FORMAT SERDE 'org.apache.hcatalog.data.JsonSerDe';

    However, I don't see how to do this using HCatCreateTableDesc. And if the table is created as a regular text table, Hive will not be able to read the JSON data and will return nulls for all keys and values.

    So, is there a way to tell HCatalog programmatically to create a table that uses a JSON row format SerDe?

    Thanks,

    John

Viewing 2 replies - 1 through 2 (of 2 total)


  • Author
    Replies
  • #30976

    John Brinnand
    Participant

    Hi Akki,

    Thanks for the response. In the meantime, I found that Hive's JDBC client can be used for our purposes.

    Thanks again for your help,

    John
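
For readers who want to try the JDBC route mentioned above, here is a minimal sketch. It is not code from the thread. The class and method names (JsonTableViaJdbc, buildCreateTable, createTable) are hypothetical, the JDBC URL is a placeholder you would supply yourself, and it assumes the Hive JDBC driver jar is on the classpath. The database, table, columns, and SerDe class name are taken from the original question.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: issues the CREATE TABLE statement from the question over JDBC.
class JsonTableViaJdbc {

    // SerDe class name exactly as given in the original post.
    static final String JSON_SERDE = "org.apache.hcatalog.data.JsonSerDe";

    // Builds the DDL string; column order follows the map's iteration order.
    static String buildCreateTable(String table, Map<String, String> columns) {
        StringJoiner cols = new StringJoiner(", ", "(", ")");
        for (Map.Entry<String, String> c : columns.entrySet()) {
            cols.add(c.getKey() + " " + c.getValue());
        }
        return "CREATE TABLE " + table + " " + cols
                + " ROW FORMAT SERDE '" + JSON_SERDE + "'";
    }

    // Assumption: jdbcUrl points at a running Hive server and the matching
    // driver is on the classpath. Older drivers may need an explicit
    // Class.forName(...) call before getConnection.
    static void createTable(String jdbcUrl, String database, String table,
                            Map<String, String> columns) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl, "", "");
             Statement stmt = conn.createStatement()) {
            stmt.execute("USE " + database);
            stmt.execute(buildCreateTable(table, columns));
        }
    }

    public static void main(String[] args) throws SQLException {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("id", "string");
        columns.put("name", "string");

        // Always print the statement; only connect if a JDBC URL is passed in.
        System.out.println(buildCreateTable("test_jsontable", columns));
        if (args.length > 0) {
            createTable(args[0], "myalphadb", "test_jsontable", columns);
        }
    }
}
```

Run without arguments, the program only prints the statement it would send. Passing a JDBC URL as the first argument makes it connect and create the table. Table and column names are concatenated into the statement without escaping, so this is only suitable for trusted input.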

    #30838

    Akki Sharma
    Moderator

    Hi John,

    Looking at the HCatCreateTableDesc code, the API does not let an end user create a table with a custom SerDe (unless there is a storage handler that uses a SerDe; we support that).

    Best Regards,
    Akki
