Hive / HCatalog: Hive script error

This topic contains 1 reply, has 2 voices, and was last updated by  Sasha J 2 years ago.

  • Creator
    Topic
  • #8930

    Hi there,
    I wrote a Hive script to calculate the Jaccard index. It works fine on a small data set but throws errors on a medium-sized data set. The error looks to me like a file system error:

    java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"joinkey0":"00a8456"},"value":{"_col0":"0713b0"},"alias":1} at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:268) at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:519) at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420) at org.apache.hadoop.mapred.Child$4.run(Child.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1122) at org.apache.hadoop.mapred.Child.main(Child.java:249)

    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=1) {"key":{"joinkey0":"00a8456"},"value":{"_col0":"0713b0"},"alias":1} at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:256) ... 7 more

    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hive-kurban/hive_2012-08-24_18-03-24_281_7249222202881112947/_task_tmp.-ext-10000/_tmp.000000_0 could only be replicated to 0 nodes, instead of 1 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1566) at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:673) at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1122) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382) at org.apache.hadoop.hive.ql.exec.JoinOperator.processOp(JoinOperator.java:133) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471) at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:247) ... 7 more

    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hive-kurban/hive_2012-08-24_18-03-24_281_7249222202881112947/_task_tmp.-ext-10000/_tmp.000000_0 could only be replicated to 0 nodes, instead of 1 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1566) at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:673) at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1122) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:602) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:742) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:745) at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:852) at org.apache.hadoop.hive.ql.exec.JoinOperator.processOp(JoinOperator.java:108) ... 9 more

    Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hive-kurban/hive_2012-08-24_18-03-24_281_7249222202881112947/_task_tmp.-ext-10000/_tmp.000000_0 could only be replicated to 0 nodes, instead of 1 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1566) at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:673) at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1122) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382) at org.apache.hadoop.ipc.Client.call(Client.java:1092) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229) at $Proxy2.addBlock(Unknown Source) at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62) at $Proxy2.addBlock(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3595) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3456) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2672) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2912)

    I emailed Sheetal, who asked me to file this incident. I looked through the other error messages but didn’t find any additional information. The job number was:

    job_201208231359_0075

    I ran the job on hadoop01 because on hadoop05 I’m getting:

    [kurban@hadoop05 esa_similarities]$ hive -f jaccard.sql
    WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
    Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-0.9.0.jar!/hive-log4j.properties
    Hive history file=/tmp/kurban/hive_job_log_kurban_201208270926_682189032.txt
    OK
    Time taken: 3.937 seconds
    OK
    Time taken: 0.0040 seconds
    OK
    Time taken: 0.0040 seconds
    OK
    Time taken: 0.0050 seconds
    OK
    Time taken: 0.0050 seconds
    FAILED: Error in metadata: MetaException(message:Got exception: org.apache.hadoop.security.AccessControlException org.apache.hadoop.security.AccessControlException: Permission denied: user=kurban, access=EXECUTE, inode="hive":hive:hdfs:rwx------)
    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
    [kurban@hadoop05 esa_similarities]

    Thanks,

    Kryztof

Viewing 1 reply (of 1 total)


  • Author
    Replies
  • #8931

    Sasha J
    Moderator

    Kryztof,
    you are hitting this permission error when trying to write to HDFS:
    org.apache.hadoop.security.AccessControlException: Permission denied: user=kurban, access=EXECUTE, inode="hive":hive:hdfs:rwx------
    You can either run your script as user "hive", or create your output location in HDFS and make it owned by user "kurban".
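
    A sketch of setting up such an output location, run as the HDFS superuser. The path /user/kurban/esa_similarities is only an assumption based on the working directory in your log; substitute wherever your script actually writes:

```shell
# Run as the HDFS superuser (typically the "hdfs" user).
# Create the output directory (hypothetical path; adjust to your layout).
hadoop fs -mkdir /user/kurban/esa_similarities
# Hand ownership to kurban so jobs submitted as that user can write there.
hadoop fs -chown -R kurban:kurban /user/kurban/esa_similarities
# Verify the resulting ownership and permissions.
hadoop fs -ls /user/kurban
```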

    The other error:
    org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hive-kurban/hive_2012-08-24_18-03-24_281_7249222202881112947/_task_tmp.-ext-10000/_tmp.000000_0 could only be replicated to 0 nodes, instead of 1
    could also stem from the same permissions problem.
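
    If it is not permissions, "could only be replicated to 0 nodes" usually means the NameNode cannot find any live DataNode with free space. A quick health check, assuming the Hadoop 1.x command names your logs suggest:

```shell
# List live/dead DataNodes and their remaining capacity; "replicated to
# 0 nodes" typically means no DataNode was reachable or all were full.
hadoop dfsadmin -report
# Check overall filesystem health and block replication status.
hadoop fsck /
```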

    Please make sure user "kurban" exists on all cluster nodes, and that the output location is set up with the correct ownership.

    Thank you!
    Sasha
