Problem with SQL Server OPENQUERY to Hive


This topic contains 2 replies, has 2 voices, and was last updated by  Guitao Ding 1 year, 11 months ago.

  • Creator
  • #33356

    Guitao Ding

    I’m using SQL Server and have created a linked server to run queries against Hive.
    I have now found that when I use openquery(hive, 'query sql'), the query runs twice on the Hive side.
    I’m using the Hortonworks Hive ODBC Driver 1.2 (64-bit).
    I’m not sure whether the first run is only there to fetch the metadata of the query result. If so, that is rather silly.
    What should I do to avoid this? It is really annoying, because it takes much longer to get the result.

Viewing 2 replies - 1 through 2 (of 2 total)


  • Author
  • #33973

    Guitao Ding

    Hi Abdelrahman,

    Sorry for my late reply.

    Actually, it does not depend on the query I ran. For example:

    The following query runs twice, as I can see in the Hive server log. A second, identical job is started immediately after the first job finishes. It takes about 40 seconds.
    `select * from openquery(hive, 'select column1, column2 from table_name limit 200')`

    Strangely, when I run the same query again in SQL Server immediately after the first one has finished, it runs only once on the Hive side.

    But when I run the same query again a long time after the first one has finished, it runs twice again on the Hive side.
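
One way to probe the metadata hypothesis from the original post is to compare OPENQUERY with pass-through execution via `EXEC ... AT`. The following is only a sketch under assumptions: the linked server is named `hive` as in the query above, the caller may change linked server options, and RPC Out is not yet enabled. Nothing in this thread confirms that it avoids the second run with this ODBC driver.

```sql
-- Assumption: the linked server is named "hive", as in the query above.
-- Assumption: RPC Out is not yet enabled; EXEC ... AT requires it.
EXEC sp_serveroption @server = N'hive', @optname = N'rpc out', @optvalue = N'true';

-- Pass-through execution: the statement text is sent to the linked server as is.
EXEC (N'select column1, column2 from table_name limit 200') AT hive;

-- For comparison, the original OPENQUERY form from the post.
SELECT * FROM OPENQUERY(hive, 'select column1, column2 from table_name limit 200');
```

Running each statement on its own, a long time after any previous run, and counting the jobs in the Hive server log would show whether the two forms behave differently.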




    Hi Guitao,

    Can you provide more information about the query that you ran?

