Hive / HCatalog Forum

Problem with SQL Server OPENQUERY to Hive

  • #33356
    Guitao Ding

    I'm using SQL Server and have created a linked server to run queries against Hive.
    However, I found that when I use openquery(hive, 'query sql'), the query runs twice on the Hive side.
    I'm using the Hortonworks Hive ODBC Driver 1.2 (64-bit).
    I'm not sure whether the first run is only there to fetch the metadata of the query result. If so, that seems wasteful.
    What should I do to avoid this? It is really annoying, because it takes much longer to get the result.


  • #33702

    Hi Guitao,

    Can you provide more information about the query you ran?


    Guitao Ding

    Hi Abdelrahman,

    Sorry for my late reply.

    Actually, it does not depend on the query I run. For example:

    The following query runs twice, as I can see in the Hive server log. The second, identical job starts immediately after the first job finishes. It takes about 40 seconds.
    `select * from openquery(hive, 'select column1, column2 from table_name limit 200')`

    Strangely, when I run the same query in SQL Server immediately after the first one has finished, it runs only once on the Hive side.

    But when I run the same query in SQL Server a long time after the first one has finished, it runs twice again on the Hive side.

