MapRedTask failed when executing Hive mapreduce query from beeline

This topic contains 14 replies, has 2 voices, and was last updated by  Juan Martin Pampliega 8 months, 1 week ago.

  • Creator
    Topic
  • #43747

    Hi,
    I have a cluster with HDP 2 installed and running correctly for a few days with the default configuration. Queries to Hive work fine using the Hive command line.
    The problem appears when I try to query Hive from beeline or any other application using JDBC.

    Queries that do not launch any MapReduce jobs (like select *) work fine, but the ones that generate MapReduce jobs fail with the following error:

    2013-11-14 10:41:55,502 WARN thrift.ThriftCLIService (ThriftCLIService.java:ExecuteStatement(213)) - Error fetching results:
    org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
    at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:104)
    at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:145)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:208)
    at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:190)
    at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:150)
    at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:207)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1133)
    at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1118)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
    at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:58)
    at org.apache.hive.service.auth.TUGIContainingProcessor$1.run(TUGIContainingProcessor.java:55)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:526)
    at org.apache.hive.service.auth.TUGIContainingProcessor.process(TUGIContainingProcessor.java:55)
    at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

    From what I can see this is a permissions issue, but if I launch the Hive command line as the hdfs or hive user and run the same queries, they work fine.

    Any ideas how to fix this?
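    For reference, I am connecting to HiveServer2 roughly like this (the host, user, and table name are placeholders, and 10000 is just the default HiveServer2 port):

    beeline -u jdbc:hive2://aws-dev-02:10000 -n ec2-user
    -- a query along these lines is what triggers the MapRedTask error
    select count(*) from some_table;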



  • Author
    Replies
  • #43993

    By local mode I mean when the MapReduce job is converted to a local MapReduce job.

    I tried changing the /tmp/hive-hive permissions but it doesn’t seem to improve anything. I think the issue arises because the local MapReduce job uses the local filesystem, which does not have the same permissions as the directory in HDFS.
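    A quick way to compare the two scratch locations (assuming the default paths on my setup) is something like:

    # local filesystem on the HiveServer2 host
    ls -ld /tmp/hive-hive
    # the corresponding directory in HDFS
    hdfs dfs -ls -d /tmp/hive-hive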

    #43971

    Thejas Nair
    Participant

    By local mode, do you mean running the Hive command line in local mode (not through HiveServer2)? If it is not related to HiveServer2, creating a new topic would be better.

    For the HiveServer2 issue, have you checked whether changing the permissions of “/tmp/hive-hive/” fixes it (instead of disabling dfs permissions)?
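    To be concrete, something along these lines (run as the hdfs superuser; 777 is just to rule permissions out, and the path assumes the default scratch dir):

    sudo -u hdfs hdfs dfs -chmod -R 777 /tmp/hive-hive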

    #43965

    So, I have bypassed this issue by setting dfs.permissions.enabled to false.

    The problem now is that the query needs a CSV SerDe jar to be registered. I have added the hive.aux.jars.path property to hive-site.xml and tried different ways of pointing to the jar, but they all fail when the MapReduce job runs in local mode.
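    For reference, these are the kinds of things I have been trying (the jar name and location are just examples of where I have put it):

    # in hive-site.xml on the HiveServer2 host
    hive.aux.jars.path=file:///usr/lib/hive/lib/csv-serde.jar

    # or from the beeline session
    add jar /usr/lib/hive/lib/csv-serde.jar;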

    Any ideas on how I should reference this jar, or where I can put it?

    #43932

    The .log file in the /tmp/hive directory of the machine where HiveServer2 is running shows the following stack trace:

    org.apache.hadoop.security.AccessControlException: Permission denied: user=hive, access=EXECUTE, inode="/tmp/hive-hive/hive_2013-11-18_12-47-20_879_8044219515320128885-2":hdfs:hdfs:drwx------
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:234)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkTraverse(FSPermissionChecker.java:187)
    at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:150)
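    For reference, the owner and mode shown in the trace can be confirmed by listing the parent directory:

    hdfs dfs -ls /tmp/hive-hive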

    #43914

    Hi, I have fixed running queries from the Hive command line on the aws-dev-02 machine as the hdfs and hive users.
    Also, I have realized that queries that launch actual MapReduce jobs work fine from beeline on another machine; the problem appears when Hive converts the MapReduce query into a local job.

    Any ideas how to fix this?
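    In case it helps to isolate this: the automatic conversion to a local job is controlled by hive.exec.mode.local.auto, so as a test it can be turned off for a session from beeline:

    set hive.exec.mode.local.auto=false;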

    #43871

    Thejas Nair
    Participant

    Can you run the Hive command line from the same machine, as the same user that HiveServer2 runs as? If you see an error with the home dir for this user as well, can you try creating the home dir?
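    As a sketch of what I mean, if the missing piece turns out to be the HDFS home dir for that user (assuming HiveServer2 runs as hive):

    sudo -u hdfs hdfs dfs -mkdir -p /user/hive
    sudo -u hdfs hdfs dfs -chown hive:hive /user/hive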

    #43858

    No, still having the same problem.

    What do you mean by a missing home dir? Should the hive user or the hdfs user have home directories on the aws-dev-02 machine?

    #43857

    Thejas Nair
    Participant

    Were you able to fix the issue? Is it a problem with a missing home dir for this user, or with permissions on the bin/hadoop script?

    #43850

    The exception I posted earlier is taken from the hive.log.

    When I run a MapReduce query from the Hive command line on aws-dev-02 as either the hdfs or the hive user, I get the following error:

    java.io.IOException: Cannot run program "/usr/lib/hadoop/bin/hadoop" (in directory "/home/ec2-user"): error=13, Permission denied
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1041)
    at java.lang.Runtime.exec(Runtime.java:617)
    at java.lang.Runtime.exec(Runtime.java:450)

    I usually run the queries from the Hive command line as ec2-user, who belongs to the hdfs and hive groups.
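    To narrow it down, the two permissions involved (the working directory mentioned in the error and the hadoop launcher script) can be checked with something like:

    ls -ld /home/ec2-user
    ls -l /usr/lib/hadoop/bin/hadoop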

    Btw, thanks for all your help Thejas.

    #43781

    Thejas Nair
    Participant

    Yes, assuming aws-dev-02 is the HiveServer2 host, can you try running the Hive command line from there as the hive user?
    Can you also check /tmp/hive/hive.log (or a different file, if the log dir is configured differently for the Hive client) to see if there is additional error information there?

    #43768

    I see no MapReduce jobs being launched in the YARN RM console or the MR2 JobHistory list.

    Do you mean running the Hive command line client on the aws-dev-02 host? I am running it on another host, along with the beeline command line.

    #43767

    Thejas Nair
    Participant

    The “hive” that you are seeing in your core-site.xml *proxyuser* config is the hive user, so that looks right, as long as aws-dev-02 is the hostname of the HiveServer2 machine. You can also try changing the hostname in hadoop.proxyuser.hive.hosts to *, in case there is some issue with the hostname mapping.

    Do you see a MapReduce job being launched in the RM web UI?
    When you tried running the Hive command line, did you try it from the same host, as the hive user?
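    Concretely, the relaxed entry in core-site.xml would look something like this (after changing it, the services need a restart, or the NameNode side can be refreshed with hdfs dfsadmin -refreshSuperUserGroupsConfiguration):

    hadoop.proxyuser.hive.hosts=*
    hadoop.proxyuser.hive.groups=users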

    #43766

    I have all the default values set by the Ambari installer.

    HiveServer2 is running as the hive user. Should I run the query as the same user?

    I have hive.server2.enable.doAs set to true.

    I am not exactly sure what you mean by the proxyuser in core-site.xml. I don’t have any properties set for hiveserver2 like the ones I have for hive:

    hadoop.proxyuser.hive.groups=users
    hadoop.proxyuser.hive.hosts=aws-dev-02

    Should I have the same for hiveserver2? Which values should they be?

    #43765

    Thejas Nair
    Participant

    What user are you running HiveServer2 as? Do you see a MapReduce job being launched in the RM web UI?
    Do you have hive.server2.enable.doAs set to any value in hive-site.xml? Do you have the HiveServer2 user in the core-site.xml proxyuser entries?
