Hive / HCatalog Forum

webhcat kerberos

  • #53550
    Gwenael Le Barzic
    Participant

    Hello!

    We are using an HDP 2.0.6 cluster secured by Kerberos.
    This topic concerns a problem with WebHCat constantly answering with error 401 when we try to reach it.

    The two tests are performed from the server where webhcat is installed.

    Here is a first test:

    curl -i http://localhost:50111/templeton/v1?user.name=my_user_name
    HTTP/1.1 200 OK
    Content-Type: application/json
    Transfer-Encoding: chunked
    Server: Jetty(7.6.0.v20120127)
    {"responseTypes":["application/json"]}

    Here is a second test:

    curl -i http://localhost:50111/templeton/v1/status?user.name=my_user_name
    HTTP/1.1 401
    WWW-Authenticate: Negotiate
    Set-Cookie: hadoop.auth=;Path=/;Expires=Thu, 01-Jan-1970 00:00:00 GMT
    Cache-Control: must-revalidate,no-cache,no-store
    Content-Type: text/html;charset=ISO-8859-1
    Content-Length: 1268
    Server: Jetty(7.6.0.v20120127)
    
    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
    <title>Error 401 </title>
    </head>
    <body>
    <h2>HTTP ERROR: 401</h2>
    <p>Problem accessing /templeton/v1/status. Reason:
    <pre>    </pre></p>
    <hr /><i><small>Powered by Jetty://</small></i>
    </body>
    </html>

    The REST server is running:

    cat /var/run/webhcat/webhcat.pid
    ps aux | grep 24224
    hcat     24224  0.4  0.4 1578176 301212 ?      Sl   15:42   0:12 /usr/jdk/jdk1.6.0_31/bin/java -Xmx1024m -Djava.net.preferIPv4Stack=true -Dwebhcat.log.dir=/var/log/webhcat/ -Dlog4j.configuration=webhcat-log4j.properties -Dhadoop.log.dir=/var/log/hadoop/hcat -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/lib/hadoop -Dhadoop.id.str=hcat -Dhadoop.root.logger=INFO,console -Djava.library.path=:/usr/lib/hadoop/lib/native/Linux-amd64-64:/usr/lib/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx1024m -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /usr/lib/hcatalog/sbin/../share/webhcat/svr//webhcat-0.12.0.2.0.6.1-102.jar org.apache.hive.hcatalog.templeton.Main
    275800164 25649 0.0  0.0 103240   840 pts/2    S+   16:33   0:00 grep 24224
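
    Additionally, one can confirm that something is actually listening on the WebHCat port (a quick sketch; netstat options may vary by distribution):

    netstat -tln | grep 50111   # expect a LISTEN entry on port 50111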

    The Kerberos configuration looks OK to me: the right keytabs are available and we have the right principal.
    I will add the webhcat-site.xml content in the post just after this one.

    I checked the documentation, and it says an HTTP 401 error is linked to wrong credentials:
    http://dev.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP2/HDP-2.0.3.0/ds_HCatalog/rest.html
    401 Unauthorized: Credentials were missing or incorrect.

    But how can I add the credentials to the request? I thought that adding user.name at the end of the query was enough, but it does not seem to work here.

    Best regards.

    Gwenael Le Barzic

  • #53551
    Gwenael Le Barzic
    Participant

    Here is the content of the webhcat-site.xml:

    <!--Tue May 13 15:41:47 2014-->
    <configuration>
    <property>
    <name>templeton.kerberos.principal</name>
    <value>HTTP/<FQDN OF THE HOST>@<OUR_KB_REALM></value>
    </property>
    <property>
    <name>templeton.port</name>
    <value>50111</value>
    </property>
    <property>
    <name>templeton.hadoop</name>
    <value>/usr/bin/hadoop</value>
    </property>
    <property>
    <name>webhcat.proxyuser.hue.groups</name>
    <value>*</value>
    </property>
    <property>
    <name>templeton.kerberos.secret</name>
    <value>secret</value>
    </property>
    <property>
    <name>templeton.storage.class</name>
    <value>org.apache.hcatalog.templeton.tool.ZooKeeperStorage</value>
    </property>
    <property>
    <name>templeton.hive.path</name>
    <value>hive.tar.gz/hive/bin/hive</value>
    </property>
    <property>
    <name>templeton.kerberos.keytab</name>
    <value>/etc/security/keytabs/spnego.service.keytab</value>
    </property>
    <property>
    <name>templeton.override.enabled</name>
    <value>false</value>
    </property>
    <property>
    <name>templeton.hive.properties</name>
    <value>hive.metastore.local=false,hive.metastore.uris=thrift://<FQDN OF THE HOST>:9083,hive.metastore.sasl.enabled=true,hive.metastore.execute.setugi=true,hive.metastore.warehouse.dir=/apps/hive/warehouse,hive.exec.mode.local.auto=false,hive.metastore.kerberos.principal=hive/_HOST@<OUR_KB_REALM></value>
    </property>
    <property>
    <name>templeton.streaming.jar</name>
    <value>hdfs:///apps/webhcat/hadoop-streaming.jar</value>
    </property>
    <property>
    <name>templeton.hadoop.conf.dir</name>
    <value>/etc/hadoop/conf</value>
    </property>
    <property>
    <name>templeton.zookeeper.hosts</name>
    <value><FQDN OF THE HOST>:2181,<FQDN OF THE HOST>:2181,<FQDN OF THE HOST>:2181</value>
    </property>
    <property>
    <name>templeton.pig.archive</name>
    <value>hdfs:///apps/webhcat/pig.tar.gz</value>
    </property>
    <property>
    <name>templeton.exec.timeout</name>
    <value>60000</value>
    </property>
    <property>
    <name>templeton.jar</name>
    <value>/usr/lib/hcatalog/share/webhcat/svr/webhcat.jar</value>
    </property>
    <property>
    <name>templeton.hive.archive</name>
    <value>hdfs:///apps/webhcat/hive.tar.gz</value>
    </property>
    <property>
    <name>templeton.pig.path</name>
    <value>pig.tar.gz/pig/bin/pig</value>
    </property>
    <property>
    <name>webhcat.proxyuser.hue.hosts</name>
    <value>*</value>
    </property>
    <property>
    <name>templeton.hcat</name>
    <value>/usr/bin/hcat</value>
    </property>
    <property>
    <name>templeton.libjars</name>
    <value>/usr/lib/zookeeper/zookeeper.jar</value>
    </property>
    </configuration>
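
    As a sanity check for this configuration, the principal declared in templeton.kerberos.principal can be compared with what the SPNEGO keytab actually contains (a sketch; it assumes klist is available on the WebHCat host):

    klist -kt /etc/security/keytabs/spnego.service.keytab   # should list HTTP/<FQDN OF THE HOST>@<OUR_KB_REALM>
    hostname -f                                             # should print the same FQDN as in the principal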

    Best regards.

    Gwenael Le Barzic

    #53557
    Gwenael Le Barzic
    Participant

    Hello again.

    I found the following link, which seems interesting:
    http://www.deplication.net/2014/02/curl-with-kerberos-authentication.html

    By doing:
    1. kinit -f
    2. curl "http://<FQDN_OF_MY_HOST>:50111/templeton/v1/status" -u : --negotiate

    I got the right message:
    {"status":"ok","version":"v1"}

    But now, when I try a SHOW DATABASES, I get another error, this time more related to permissions.

    Here is the test:
    curl "http://<FQDN_OF_MY_HOST>:50111/templeton/v1/ddl/database?like=lab*" -u : --negotiate

    And here is the result:
    {"errorDetail":"org.apache.hadoop.hive.ql.metadata.AuthorizationException: No privilege 'Show_Database' found for inputs { }
    \n\tat org.apache.hadoop.hive.ql.security.authorization.BitSetCheckedAuthorizationProvider.checkAndThrowAuthorizationException(BitSetCheckedAuthorizationProvider.java:476)
    \n\tat org.apache.hadoop.hive.ql.security.authorization.BitSetCheckedAuthorizationProvider.authorize(BitSetCheckedAuthorizationProvider.java:76)
    \n\tat org.apache.hive.hcatalog.cli.SemanticAnalysis.HCatSemanticAnalyzerBase.authorize(HCatSemanticAnalyzerBase.java:130)
    \n\tat org.apache.hive.hcatalog.cli.SemanticAnalysis.HCatSemanticAnalyzer.authorizeDDLWork(HCatSemanticAnalyzer.java:267)
    \n\tat org.apache.hive.hcatalog.cli.SemanticAnalysis.HCatSemanticAnalyzerBase.authorizeDDL(HCatSemanticAnalyzerBase.java:105)
    \n\tat org.apache.hive.hcatalog.cli.SemanticAnalysis.HCatSemanticAnalyzer.postAnalyze(HCatSemanticAnalyzer.java:234)
    \n\tat org.apache.hadoop.hive.ql.Driver.compile(Driver.java:444)
    \n\tat org.apache.hadoop.hive.ql.Driv* Connection #0 to host <FQDN_OF_MY_HOST> left intact

    I was wondering: is there impersonation here? To whom do I have to give permissions? The personal user I used when I ran kinit, or the hcatalog user?

    Any help is appreciated!

    Best regards.

    Gwenael Le Barzic

    #53628
    Thejas Nair
    Moderator

    It looks like you have the Hive default authorization enabled (client side), but the privileges haven't been set up correctly. Set hive.security.authorization.enabled=false in your hive-site.xml.

    I would recommend using storage-based authorization on the Hive metastore for proper security with HDP 2.0.6.
    https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Authorization#LanguageManualAuthorization-MetastoreServerSecurity
    https://cwiki.apache.org/confluence/display/Hive/HCatalog+Authorization

    In HDP 2.1 you can use SQL standard based authorization (https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+based+hive+authorization), which provides proper security.
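
    To see which authorization settings the client is actually picking up, the effective values can be printed from the Hive CLI (a sketch; assumes the hive command is on the PATH):

    hive -e "set hive.security.authorization.enabled; set hive.security.authorization.manager;"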

    #53728
    Gwenael Le Barzic
    Participant

    Hello Thejas.

    Thank you for your answer.

    I investigated a little bit in this direction and, thanks to one of my colleagues, we noticed some parameters in the configuration that were badly set.
    Here they are with the right values:

    hive.security.authorization.enabled=true
    hive.security.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
    hive.security.metastore.authorization.manager=org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider

    With these parameters, authorization follows the HDFS permissions, which means I am able to access the structure of the database and the tables inside it if I set the right permissions on the folder /apps/hive/warehouse/<My_DB>.db.
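
    For example, granting a group access to one database then comes down to plain HDFS permissions on its warehouse directory (a sketch; the group name and mode below are only illustrative):

    hdfs dfs -ls /apps/hive/warehouse                              # check current owner and permissions
    hdfs dfs -chgrp -R lab_users /apps/hive/warehouse/<My_DB>.db   # hypothetical group of users needing access
    hdfs dfs -chmod -R 770 /apps/hive/warehouse/<My_DB>.db         # give that group read/write access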

    Best regards.

    Gwenael Le Barzic
