The legacy Hortonworks Forum is now closed. The site will be taken offline on January 31, 2016.

Hive / HCatalog Forum

Problem with SQL Server OPENQUERY to Hive

  • #33356
    Guitao Ding

    I’m using SQL Server and have created a linked server to run queries against Hive.
    I have found that when I use `openquery(hive, 'query sql')`, the query runs twice on the Hive side.
    I’m using the Hortonworks Hive ODBC Driver 1.2 (64-bit).
    I’m not sure whether the first run is only there to fetch the metadata of the query result. If so, that seems wasteful.
    What should I do to avoid this? It is really annoying, because it takes much longer to get the result.

  • #33702

    Hi Guitao,

    Can you provide more information about the query that you ran?


    Guitao Ding

    Hi Abdelrahman,

    Sorry for my late reply.

    Actually, it does not depend on the query I run. For example:

    The following query runs twice, as I can see in the Hive server log. A second, identical job is started immediately after the first job finishes. It takes about 40 seconds.
    `select * from openquery(hive, 'select column1, column2 from table_name limit 200')`

    Strangely, when I run the same query in SQL Server immediately after the first one has finished, it runs only once on the Hive side.

    But when I run the same query a long time after the first one has finished in SQL Server, it runs twice again on the Hive side.
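
The pattern described above (a slow first run, a fast immediate rerun, a slow run again after a long idle period) could be reproduced with a small timing harness. What follows is only a minimal sketch in Python, not a fix: `time_runs`, `run_query`, and the pause values are hypothetical names chosen for illustration, and the query text is the one from the post. Elapsed time is only a hint; the Hive server log remains the place to confirm how many jobs were actually started.

```python
import time
from typing import Callable, List

# Query text taken from the post; table and column names are placeholders.
QUERY = ("select * from openquery(hive, "
         "'select column1, column2 from table_name limit 200')")


def time_runs(run_query: Callable[[str], object],
              query: str,
              pauses: List[float],
              sleep: Callable[[float], None] = time.sleep,
              clock: Callable[[], float] = time.perf_counter) -> List[float]:
    """Run `query` once per entry in `pauses`.

    Before each run, wait that entry's number of seconds (0 means
    "run immediately"). Returns the elapsed seconds of every run.
    `run_query` is supplied by the caller and is assumed to execute
    the statement on SQL Server and fetch the full result.
    """
    durations = []
    for pause in pauses:
        if pause > 0:
            sleep(pause)
        start = clock()
        run_query(query)
        durations.append(clock() - start)
    return durations
```

With a real connection, `run_query` might wrap something like `cursor.execute(query).fetchall()` from an ODBC client library, and `pauses` might be `[0, 0, 1800]` to compare a first run, an immediate rerun, and a rerun after a long idle period. A run that takes roughly twice as long as the immediate rerun would be consistent with the double execution seen in the log.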


The forum ‘Hive / HCatalog’ is closed to new topics and replies.
