Hive / HCatalog Forum

Load xls file into Hive Table

  • #31248
    Anupam Gupta
    Participant

    Hi All,
    I want to know how to load an XLS file into a Hive table. My XLS file also contains commas and double quotes.

    Thanks in Advance,
    Sandy


  • #31264
    Sasha J
    Moderator

    Loading XLS files directly is not supported yet.
    You have to export it to a delimited CSV file and then load that.

    Hope this helps.
    Sasha
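
The export step matters most when fields contain commas or double quotes, as the original question describes. Below is a minimal Python sketch of what a correctly quoted CSV looks like and how it round-trips. The function names and sample rows are hypothetical and do not come from the thread, and reading the .xls file itself (for example with Excel's "Save As" or a spreadsheet library) is left out.

```python
import csv
import io

def rows_to_csv_text(rows):
    """Serialize rows as CSV, quoting every field and doubling embedded quotes."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()

def csv_text_to_rows(text):
    """Parse CSV text back into a list of rows (lists of strings)."""
    return list(csv.reader(io.StringIO(text)))

# Hypothetical sample data containing commas and double quotes.
sample_rows = [
    ["1", "plain text", "NYC"],
    ["2", 'She said "hello", then left', "Boston, MA"],
]
csv_text = rows_to_csv_text(sample_rows)
print(csv_text)
```

Quoting every field keeps the commas inside the data from being read as column separators. Whether a given Hive SerDe accepts this exact quoting style is something to verify against that SerDe's documentation.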

    #31812
    Anupam Gupta
    Participant

    Hi Sasha,
    Thanks for your reply. I have a CSV file that contains Twitter data, and I am loading it into a Hive table using csv-serde.jar. I am able to load the data, but when I fetch it with a SELECT query, some columns and rows show NULL values.
    Why are they showing NULL? I am new to Hive, so please help.

    Thanks In Advance,
    Sandy

    #32574
    Carter Shanklin
    Participant

    Sandy, there is an Excel import feature in Hue in the Hortonworks Sandbox and in the latest version of HDP under the HCatalog app. You can also load CSV there. Can you try that out and see if it works for you?

    #42668
    Anupam Gupta
    Participant

    Hi All,

    My CSV file has special characters, double quotes, single quotes, numbers, etc., so I am using csv-serde to serialize and deserialize it. But I am getting NULL for some columns in a row in the Hive table. What is the problem?
    Kindly help.

    Thanks,
    Sandy

    #42742
    Yi Zhang
    Moderator

    Hi Sandy,

    It is possible that some values cannot be converted to the data type their column is declared with; Hive returns NULL for those values.

    Thanks,
    Yi
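
One way to look for this kind of mismatch before loading is to scan the file for values that would not parse as the declared type. The sketch below is only a rough approximation: Python's `int` and `float` do not follow exactly the same rules as Hive's conversions, and the function name, type names, and sample rows are hypothetical.

```python
def find_unparseable(rows, column_types):
    """Return (row_index, column_name, value) for values that fail conversion.

    rows: list of dicts, e.g. as produced by csv.DictReader.
    column_types: dict mapping column name to "int", "double", or "string".
    """
    converters = {"int": int, "double": float}
    problems = []
    for index, row in enumerate(rows):
        for name, type_name in column_types.items():
            convert = converters.get(type_name)
            if convert is None:
                continue  # string columns accept any value
            value = row.get(name, "")
            try:
                convert(value)
            except (ValueError, TypeError):
                problems.append((index, name, value))
    return problems

# Hypothetical sample: the second row has values that do not parse.
sample_types = {"id": "int", "lat": "double"}
sample_rows = [
    {"id": "1", "lat": "40.7"},
    {"id": "3.60E+17", "lat": "n/a"},
]
print(find_unparseable(sample_rows, sample_types))
```

If every column is declared as string, this check finds nothing, and the cause of the NULLs has to be looked for elsewhere.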

    #42753
    Anupam Gupta
    Participant

    Hi Zhang,

    The data shows NULL for some columns in a row (not for all rows). I have a Twitter data file (.csv), and I am using csv-serde from Bizo, available on GitHub.

    Thanks in advance,
    Sandy

    #42870
    Yi Zhang
    Moderator

    Hi Sandy,

    Can you give us some rows that were ingested correctly into Hive and some rows that weren't?

    Thanks,
    Yi

    #43678
    Anupam Gupta
    Participant

    Hi Zhang,
    Thanks for your reply. I created the Hive table using the following query:

    create table csvdata(
      InteractionId string,
      FromUserWatch string,
      FromTermWatch string,
      Username string,
      UserUrl string,
      UserSummary string,
      AvatarImageUrl string,
      Content string,
      CreatedAtUTC string,
      CreatedAtFormatted string,
      Generator string,
      TweetUrl string,
      Latitude string,
      Longitude string)
    row format serde 'com.bizo.hive.serde.csv.CSVSerde' stored as textfile;

    Then I loaded the data into the table using:

    LOAD DATA LOCAL INPATH '/usr/local/data_file.csv'
    OVERWRITE INTO TABLE csvdata;

    I am getting NULL in the Content column on the third row, and every column after Content is also NULL in that row.
    For example:
    InteractionId FromUserWatch Username Content
    (1) 3.60E+17 FALSE @RebeccaShabad RT @TheFix: Amazing. RT @SeinfeldToday: George is briefly implicated in the latest

    (2) 3.60E+17 FALSE @ohbambiemma @santwix “I’d like mine to mean something,” Emma stroked her skin, her eyes closing. Hearing a car pull in the drive way, she frowned. “Is

    (3) 3.60E+17 FALSE @Bernie_Bernz #lunch #nyc #sfilatino adorable ?? sandwich shop @

    Thanks In Advance,
    Sandy
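
One possible cause of the pattern described above (a Content value that is cut short, followed by NULLs in the remaining columns) is a line break inside a quoted field. Hive's text file input splits records on newlines before the SerDe parses them, so a tweet containing a line break ends up as two incomplete rows. The thread does not confirm that this is what happened here. The following minimal Python sketch, with hypothetical function names and sample data, compares physical lines with logical CSV records and rewrites the file so that embedded line breaks become spaces.

```python
import csv
import io
import re

def count_physical_and_logical(text):
    """Return (physical line count, logical CSV record count)."""
    physical = len(io.StringIO(text).readlines())
    logical = sum(1 for _ in csv.reader(io.StringIO(text)))
    return physical, logical

def flatten_embedded_newlines(text):
    """Rewrite CSV text so that each record occupies exactly one line."""
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for record in csv.reader(io.StringIO(text)):
        # Replace each run of CR/LF characters inside a field with one space.
        writer.writerow([re.sub(r"[\r\n]+", " ", field) for field in record])
    return out.getvalue()

# Hypothetical sample: the first data record spans two physical lines.
raw = 'id,content,lat\n1,"first line\nsecond line",40.7\n2,"plain",41.0\n'
flattened = flatten_embedded_newlines(raw)
print(count_physical_and_logical(raw))
print(count_physical_and_logical(flattened))
```

If the two counts differ for the original file, embedded line breaks are a likely suspect. Loading the flattened file instead is one way to test that.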

    #60961
    Bhushan Bhange
    Participant

    Hi all, I have the same issue. I uploaded an XLS file using HCatalog -> Create table using Excel file -> XLS file.
    The table preview does not show NULL values, but once the table has been created and I check it, it shows NULL values.
    The Excel file has only 500 rows, but the table shows 6000 rows with many NULL values, even though there are no NULLs in the Excel file. Does anybody know what I should do?
