Hive / HCatalog Forum

Parallel ODBC queries to Hive

  • #25332

    We’re implementing a proof-of-concept project on Hive and ODBC connectivity.
    We’re running Hive with HiveServer2. We’ve established a connection to Hive from MS SQL Server 2012 SP1 via a linked server, which uses a System DSN (ODBC data source) based on the Hortonworks ODBC driver v1.2.0.1005 (64-bit).
    We’ve created a database view that uses OPENQUERY to access Hive and run a command.

    Everything appears to be OK until we launch parallel selects from the view. The first select returns data, while the others fail with errors:

    OLE DB provider “MSDASQL” for linked server “hortonhive” returned message “[Hortonworks][Hardy] (35) Error from Hive: error code: ‘0’ error message: ‘ java.lang.ArrayIndexOutOfBoundsException’.”.
    Msg 7330, Level 16, State 2, Line 1
    Cannot fetch a row from OLE DB provider “MSDASQL” for linked server “hortonhive”.

  • #25357

    In addition: it appears that the crash occurs after the map/reduce job finishes and the second query is about to return its result set.

    Yi Zhang

    Hi Ramunas,

    Can you give us a sample schema of the view?

    Could you help us see the problem by posting the hiveserver2 log (/var/log/hive/hiveserver2.log on the Hive server node) and the log4j logs for the user (/tmp/$user/hive.log on the Hive client node) from when the problem happens? Any other stderr/stdout messages would be helpful too.



    I’ve located the source of the problem. It’s not an ODBC driver issue. My Hive table has a TIMESTAMP column, and its toString is not thread-safe. Certain conditions are triggered while fetching, which causes a crash in Hive.
    The problem is described and a patch is provided here:


    Furthermore (thanks to Brock Noland):
    The patch in is much simpler and fixes the same issue. HIVE-4516 will be included in the 4.3.0 release.
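
For readers unfamiliar with this class of bug, here is a minimal Java sketch of the general pattern. It is not Hive's code and not the patch mentioned above. It assumes, purely for illustration, that the timestamp-to-string step relies on a shared formatter such as java.text.SimpleDateFormat, which is documented as not thread-safe and is known to fail in odd ways (including ArrayIndexOutOfBoundsException) when used from several threads at once. The class and method names are hypothetical. The sketch shows one common remedy: give each thread its own formatter.

```java
import java.sql.Timestamp;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not taken from Hive.
public class SafeTimestampFormat {

    // One formatter per thread. SimpleDateFormat keeps mutable internal
    // state, so a single shared instance must not be used concurrently.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    // Formats a timestamp using the calling thread's own formatter.
    static String format(Timestamp ts) {
        return FORMAT.get().format(ts);
    }

    // Formats the same timestamp from many threads and collects the results.
    static List<String> formatInParallel(Timestamp ts, int threads, int callsPerThread)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Callable<String> task = () -> format(ts);
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < threads * callsPerThread; i++) {
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // rethrows any failure from the worker
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Timestamp ts = Timestamp.valueOf("2013-05-20 10:15:30");
        List<String> results = formatInParallel(ts, 8, 1000);
        long distinct = results.stream().distinct().count();
        System.out.println(distinct + " distinct value(s): " + results.get(0));
    }
}
```

With a per-thread formatter, every call should produce the same string. Replacing the ThreadLocal with a single static SimpleDateFormat is the quickest way to see intermittent wrong values or exceptions under load, though the failure is timing-dependent and may not show up on every run.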

    Tian An Koh


    Can I ask: if you use OPENQUERY, what would a Hive insert/load query look like?

    Sorry, I’m new to Hive too and am also doing some research on this technology.

    I did something similar using bare-bones Hive on Ubuntu. I’ve posted the same question here

    Hoping someone can shed some light on it.

    Thank you
