Hive Join Not Working


This topic contains 3 replies, has 2 voices, and was last updated by  Christopher Crosbie 1 year, 4 months ago.

  • Creator
  • #43896

    I’ve created two hive tables and I’d just like to do a basic join using the query:

    select a.gene, a.mrn, a.note_category, b.word, b.word_count from
    full_note_gene_combo a
    inner join full_word_counts b
    on a.docnumber = b.docnumber

    However, I only get back an error message stating “unknown exception”. I’ve checked the job logs but was not able to identify the cause of the error from there either.

    Any guidance on how to track down this error would be appreciated.

Viewing 3 replies - 1 through 3 (of 3 total)

  • Author
  • #43964

    Sure. I’m seeing the problem with any correlated query, including joins, IN, and EXISTS. Here is the tail of the log from the query; I’m happy to provide more details. I’m a newbie and just trying to understand how to decipher the messages to pinpoint the issue.

    WARNING hue – “GET /logs HTTP/1.0″
    [18/Nov/2013 06:40:39 +0000] middleware INFO Processing exception: Error occurred executing hive query: Unknown exception.: Traceback (most recent call last):
    File “/usr/lib/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/handlers/”, line 100, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
    File “/usr/lib/hue/apps/beeswax/src/beeswax/”, line 554, in execute_query
    return execute_directly(request, query, query_server, design, on_success_url=on_success_url, download=download)
    File “/usr/lib/hue/apps/beeswax/src/beeswax/”, line 1242, in execute_directly
    raise PopupException(_(‘Error occurred executing hive query: ‘ + error_message))
    PopupException: Error occurred executing hive query: Unknown exception.
    DEBUG Thrift call .get_log returned in 0ms: “13/11/18 06:40:39 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive\n13/11/18 06:40:39 INFO ql.Driver: \n13/11/18 06:40:39 INFO ql.Driver: \n13/11/18 06:40:39 INFO ql.Driver: \n13/11/18 06:40:39 INFO ql.Driver: \n13/11/18 06:40:39 INFO parse.ParseDriver: Parsing command: use default\n13/11/18 06:40:39 INFO parse.ParseDriver: Parse Completed\n13/11/18 06:40:39 INFO ql.Driver: \n13/11/18 06:40:39 INFO ql.Driver: \n13/11/18 06:40:39 INFO ql.Driver: Semantic Analysis Completed\n13/11/18 06:40:39 INFO ql.Driver: \n13/11/18 06:40:39 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)\n13/11/…



    “Container killed on request.” is a harmless message that appears after a container finishes, and you can ignore it. I think your query is failing for a different reason. Is the job itself failing, and if so, why? For example, is it because one container failed 4 times? We will need more details.


    Quick update: although the stderr page does not load, the log page suggests that the error is actually “Container killed on request. Exit code is 143”.
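
    For anyone reading exit codes like this one: by common Unix shell convention, an exit code above 128 usually means the process was terminated by signal number (code - 128), so 143 corresponds to signal 15, SIGTERM. That fits the “killed on request” wording. The snippet below is only a minimal sketch of that convention; the helper name describe_exit_code is hypothetical and is not part of Hive, Hue, or YARN.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a process exit code using the usual 128 + signal convention.

    Hypothetical helper for illustration only; not a Hadoop or Hive API.
    """
    if code > 128:
        signum = code - 128
        try:
            # signal.Signals maps a signal number to its symbolic name.
            return f"terminated by {signal.Signals(signum).name}"
        except ValueError:
            return f"terminated by unknown signal {signum}"
    if code == 0:
        return "exited normally"
    return f"exited with error status {code}"


# The exit code reported in the container log.
print(describe_exit_code(143))
```

    A SIGTERM on its own does not explain why the query failed, so the actual cause still has to come from the task or application logs.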
