Hive / HCatalog Forum

Hive JDBC Client connection to Sandbox

  • #23250
    Tanzir
    Participant

    Hello everyone,

    I have installed HDP sandbox and able to run hive in both environment(through web interface and by putting my local machine and running hive command). But when I tried to connect sandbox’s hive from my java client, I’m facing connection issue. Looks like it takes forever to make a connection (it doesn’t go through).

    I got this url right after sandbox is started: http://192.168.56.101/

    My firewall is disabled as well as my anti-virus.

    For the client, I used exactly the same code (except the url) given in the example:
    https://cwiki.apache.org/confluence/display/Hive/HiveClient

    Here is my code:

    package com.cci.jetstream.nosql.wfe.utils;

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class JdbcHiveSampleClient {
    private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";

    /**
    * @param args
    * @throws SQLException
    */
    public static void main(String[] args) throws SQLException {
    try {
    Class.forName(driverName);
    } catch (ClassNotFoundException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
    System.exit(1);
    }
    Connection con = DriverManager.getConnection("jdbc:hive://192.168.56.101:10000/default", "root", "");
    //Connection con = DriverManager.getConnection("jdbc:hive://10.0.2.15:10000/default", "", "");
    Statement stmt = con.createStatement();
    String tableName = "testHiveDriverTable";
    stmt.executeQuery("drop table " + tableName);
    ResultSet res = stmt.executeQuery("create table " + tableName + " (key int, value string)");
    // show tables
    String sql = "show tables '" + tableName + "'";
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    if (res.next()) {
    System.out.println(res.getString(1));
    }
    // describe table
    sql = "describe " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
    System.out.println(res.getString(1) + "\t" + res.getString(2));
    }

    // load data into table
    // NOTE: filepath has to be local to the hive server
    // NOTE: /tmp/a.txt is a ctrl-A separated file with two fields per line
    String filepath = "/tmp/a.txt";
    sql = "load data local inpath '" + filepath + "' into table " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);

    // select * query
    sql = "select * from " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
    System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
    }

    // regular hive query
    sql = "select count(1) from " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
    System.out.println(res.getString(1));
    }
    }
    }

    What I’m missing here? Any information and help will be highly appreciated. Thanks in advance.

    Tanzir

to create new topics or reply. | New User Registration

  • Author
    Replies
  • #23318
    Seth Lyubich
    Moderator

    Hi Tanzir,

    I think you can start by making sure that Hive Server is reachable. You can try telneting to Hive server port with something like ‘telnet sandbox_IP 10000′. If it is able to connect then the problem is in the code, driver, etc.

    Please let us know if this is helpful. Also, can you provide any errors messages that you are getting?

    Thanks,
    Seth

    #23322
    Tanzir
    Participant

    Hi Sef,
    Thanks a lot for your reply. I did try with the telnet. My sandbox is running on 192.168.56.101 and from my desktop(windows 7) I logged in by putty and ran “telnet 192.168.56.101 10000″ and was able to connect it.

    I have also installed Hortonworks Hive ODBC driver and tested the connection(through ODBC Data Source Administrator) and it was successful.

    Here is it’s output:


    Driver Version: V1.2.0.1005

    Running connectivity tests...

    Attempting connection
    Connection established
    Disconnecting from server

    TESTS COMPLETED SUCCESSFULLY!

    So it looks like the problem is only related to JDBC /JDBC driver?

    If it fails to create a connection does it generate any logs before that? In that case, which file I should be looking into? hive-server2.log under /var/log/hive ?

    Have anyone tried JDBC client with Sandbox yet?

    Thanks again for your help.

    #23360
    Tanzir
    Participant

    I am able to make it work, but exactly not sure what went wrong:

    I ran the following command:

    -bash-4.1$ hive --service hiveserver -p 10001

    And the I changed my url port to 10001 and it worked.

    I did try with 10000 and it didn’t work. It threw exception when I started hiveserver:


    -bash-4.1$ hive --service hiveserver -p 10000
    Starting Hive Thrift Server
    WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
    org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:10000.
    at org.apache.thrift.transport.TServerSocket.(TServerSocket.java:93)
    at org.apache.thrift.transport.TServerSocket.(TServerSocket.java:75)
    at org.apache.hadoop.hive.metastore.TServerSocketKeepAlive.(TServerSocketKeepAlive.java:34)
    at org.apache.hadoop.hive.service.HiveServer.main(HiveServer.java:674)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

    I checked 10000 port with netstat but didn’t get useful information:

    -bash-4.1$ netstat -nl | grep 10000
    tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN

    My assumption was, sandbox by default starts hiveserver on port 10000. So am I wrong here? Then if it didn’t start by default then how the ODBC connection was successful?

    Any information would be highly appreciated.

    Thanks again.

    #23364
    Sasha J
    Moderator

    Tanzir,
    that’s right, Hive server starting on port 10000 by default.
    Please, run this command on Sandbox:
    netstat -anlp | grep 10000
    You will see something like this:
    tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN 9132/java

    So, by looking at the command line of correspondent process, you can see what was actually started.

    Hope this helps.

    thank you!
    Sasha

    #23406
    Tanzir
    Participant

    Hi Sasha,
    Thanks for your reply. When I ran that command, I got this:


    [root@sandbox ~]# netstat -anlp | grep 10000
    tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN 3198/java

    Then I did this:


    [root@sandbox ~]# ls -l /proc/3198/exe
    lrwxrwxrwx 1 hive hadoop 0 Apr 26 12:07 /proc/3198/exe -> /usr/jdk/jdk1.6.0_31/bin/java

    So it looks like hive is using 10000, then why the JDBC connection is not successful when using port 10000 (hive.url=jdbc:hive://192.168.56.101:10000/default)? And why it works when I start another hive server on 10001 and use 10001 for JDBC connection (hive.url=jdbc:hive://192.168.56.101:10001/default)?

    Thanks again for your time and help.

    #23415
    Seth Lyubich
    Moderator

    Hi,

    I think you can try to restart Hive server 2 on port 10000 and see if this happens again. You can also try test Hive serve2 using beeline to see if you can access data in following way:

    start beeline interface
    # /usr/lib/hive/bin/beeline

    Connect to Hive Server2:
    beeline> !connect jdbc:hive2://sandbox:10000 username password org.apache.hive.jdbc.HiveDriver

    Then you cant try command like:
    show tables;

    And see if it works. If it works, you can try to see if you can access Hive Server 2 over ODBC or JDBC.

    Hope this helps,

    Thanks,
    Seth

    #23969
    Tanzir
    Participant

    Thanks a lot Sef, I have figured out the problem. Thanks again for your time.

    #23970
    Seth Lyubich
    Moderator

    Hi Tanzir,

    Thanks for letting me know that issue is resolved. Can you please let me know what was the issue?

    Seth

    #24882
    Tanzir
    Participant

    Hi Sef,
    JDBC connection wasn’t working when I was trying to use it with the sandbox. After restarting the hive server on port 10000 resolved the problem.

    #24887
    Seth Lyubich
    Moderator

    Hi Tanzir,

    Thanks for letting us know.

    Seth

    #48068

You must be to reply to this topic. | Create Account

Support from the Experts

A HDP Support Subscription connects you experts with deep experience running Apache Hadoop in production, at-scale on the most demanding workloads.

Enterprise Support »

Become HDP Certified

Real world training designed by the core architects of Hadoop. Scenario-based training courses are available in-classroom or online from anywhere in the world

Training »

Hortonworks Data Platform
The Hortonworks Data Platform is a 100% open source distribution of Apache Hadoop that is truly enterprise grade having been built, tested and hardened with enterprise rigor.
Get started with Sandbox
Hortonworks Sandbox is a self-contained virtual machine with Apache Hadoop pre-configured alongside a set of hands-on, step-by-step Hadoop tutorials.
Modern Data Architecture
Tackle the challenges of big data. Hadoop integrates with existing EDW, RDBMS and MPP systems to deliver lower cost, higher capacity infrastructure.