The legacy Hortonworks Forum is now closed. You can view a read-only version of the former site by clicking here. The site will be taken offline on January 31, 2016.

Hive / HCatalog Forum

Need help for Hive Streaming (HDP 2.1)

  • #57765
    Alex K
    Participant

    Hi,

    I’m playing around with Hive streaming preview using the example code from https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest. I get the following error:
    14/07/23 16:58:31 INFO hive.metastore: Trying to connect to metastore with URI thrift://localhost:9083
    14/07/23 16:58:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
    14/07/23 16:58:32 INFO hive.metastore: Connected to metastore.
    14/07/23 16:58:33 INFO hive.metastore: Trying to connect to metastore with URI thrift://localhost:9083
    14/07/23 16:58:33 INFO hive.metastore: Connected to metastore.
    14/07/23 16:58:34 WARN hdfs.BlockReaderLocal: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
    14/07/23 16:58:34 INFO session.SessionState: No Tez session required at this point. hive.execution.engine=mr.
    14/07/23 16:58:35 INFO log.PerfLogger: <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
    14/07/23 16:58:35 INFO log.PerfLogger: <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
    14/07/23 16:58:35 INFO hive.metastore: Trying to connect to metastore with URI thrift://localhost:9083
    14/07/23 16:58:35 INFO hive.metastore: Connected to metastore.
    FAILED: Error in determing valid transactions: Error communicating with the metastore
    14/07/23 16:58:35 ERROR ql.Driver: FAILED: Error in determing valid transactions: Error communicating with the metastore
    org.apache.hadoop.hive.ql.lockmgr.LockException: Error communicating with the metastore
    at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidTxns(DbTxnManager.java:281)
    at org.apache.hadoop.hive.ql.Driver.recordValidTxns(Driver.java:843)
    […]
    at HSMain.main(HSMain.java:32)
    Caused by: org.apache.thrift.TApplicationException: Internal error processing get_open_txns
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
    […]

    I’m working on Sandbox 2.1. I used the CREATE TABLE statement listed in the example. Are there any other configuration changes I need to make to the Hadoop and Hive setup on the sandbox?

    Thanks,
    Alex

  • #57766
    Eugene Koifman
    Moderator

    It may help if you provide a more detailed stack trace. Perhaps hive.log has it.
    One thing to check is whether your metastore DB has the tables needed for transaction support, for example ‘TXN’, ‘TXN_COMPONENTS’, etc.
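
The check suggested above can be sketched in a few lines of Python. This is only an illustration: the function name `missing_txn_tables` is hypothetical, and the list of required tables is an assumption based on the Hive 0.13.0 transaction schema script (where the transaction table appears to be named `TXNS`), so confirm the names against the script for your metastore database.

```python
# Table names assumed from hive-txn-schema-0.13.0.*.sql; confirm them
# against the script for your metastore database before relying on this.
REQUIRED_TXN_TABLES = frozenset({
    "TXNS",
    "TXN_COMPONENTS",
    "COMPLETED_TXN_COMPONENTS",
    "NEXT_TXN_ID",
    "HIVE_LOCKS",
    "NEXT_LOCK_ID",
    "COMPACTION_QUEUE",
    "NEXT_COMPACTION_QUEUE_ID",
})


def missing_txn_tables(existing_tables):
    """Return the assumed-required transaction tables that are absent.

    existing_tables is any iterable of table names, for example the rows
    returned by SHOW TABLES in the metastore schema. Names are compared
    case-insensitively.
    """
    present = {name.strip().upper() for name in existing_tables}
    return sorted(REQUIRED_TXN_TABLES - present)
```

Passing in the table names listed by the metastore database (for MySQL, the output of `SHOW TABLES` in the metastore schema) returns the tables that still need to be created. An empty list means that every table in the assumed set is present.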

    #57767
    Alex K
    Participant

    Hi Eugene,

    The metastore DB is unchanged from the Sandbox 2.1 environment. I checked, and the MySQL schema hive doesn’t have the tables TXN and TXN_COMPONENTS. Are there instructions for enabling transaction support in a Sandbox environment?

    Thanks,
    Alex

    Full stack trace of the causing exception:
    Caused by: org.apache.thrift.TApplicationException: Internal error processing get_open_txns
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_open_txns(ThriftHiveMetastore.java:3367)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_open_txns(ThriftHiveMetastore.java:3355)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getValidTxns(HiveMetaStoreClient.java:1545)
    at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.getValidTxns(DbTxnManager.java:279)
    … 12 more

    #57808
    Eugene Koifman
    Moderator

    The tables that ACID support requires are defined in https://github.com/apache/hive/blob/trunk/metastore/scripts/upgrade/derby/hive-txn-schema-0.13.0.derby.sql (and in the equivalent scripts for other DBs).
    I don’t use Sandbox myself, so I don’t know if there is a user-friendly way to create them, but they must be in the DB.

    #57812
    Alex K
    Participant

    Thank you, this helped!

    I created the tables from https://github.com/apache/hive/blob/trunk/metastore/scripts/upgrade/mysql/hive-txn-schema-0.13.0.mysql.sql and changed the hive-site.xml as described in http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1.1/bk_dataintegration/content/ch_using-hive-transactions.html. I’m able to insert rows into Hive using the streaming API now.
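
The post does not list the hive-site.xml changes, so the following is a hedged sketch of how one might verify them, not the exact configuration used here. The property names and values in `EXPECTED` are assumptions taken from the Apache Hive transactions documentation for Hive 0.13, and the helper names `read_properties` and `mismatched_settings` are hypothetical. Check the values against the guide for your HDP version.

```python
import xml.etree.ElementTree as ET

# Assumed settings from the Apache Hive transactions documentation for
# Hive 0.13; verify them against the guide for your HDP version.
EXPECTED = {
    "hive.support.concurrency": "true",
    "hive.enforce.bucketing": "true",
    "hive.exec.dynamic.partition.mode": "nonstrict",
    "hive.txn.manager": "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager",
    "hive.compactor.initiator.on": "true",
}


def read_properties(xml_text):
    """Parse Hadoop-style configuration XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def mismatched_settings(xml_text, expected=EXPECTED):
    """Return {name: actual value or None} for settings that differ."""
    actual = read_properties(xml_text)
    return {
        name: actual.get(name)
        for name, wanted in expected.items()
        if actual.get(name) != wanted
    }
```

Calling `mismatched_settings` with the text of hive-site.xml returns the properties that are missing (value `None`) or set to a different value. The same documentation also describes `hive.compactor.worker.threads`, which needs a positive value on the metastore side; it is left out of `EXPECTED` because it has no single required value.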
