Home Forums HDFS Permission issues while running mr-jobs

This topic contains 36 replies, has 2 voices, and was last updated by Sasha J 1 year, 6 months ago.

  • Creator
    Topic
  • #10739

    Hello,
    I have set up a 3-node Hadoop cluster (using HDP 1.1). The NameNode runs as the hdfs user and the JobTracker runs as the mapred user (both belong to supergroup). I am currently trying to run an MR job as root, and it gives me the following error:

    2012-10-09 10:06:31,233 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:root cause:org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=READ_EXECUTE, inode="system":mapred:supergroup:rwx------

    This error arises when the MR job tries to write into the /tmp/hadoop-mapred/mapred/system directory, which has permission 700. To get past the error I tried opening up the permissions on this directory, but the Hadoop framework keeps complaining and telling me the permissions have to be set back to 700.
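
    To be concrete, the attempt presumably looked something like this (illustrative; the exact command is not in the post):

    hadoop fs -chmod 777 /tmp/hadoop-mapred/mapred/system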

    hadoop dfs -ls /tmp/hadoop-mapred/mapred/
    Found 2 items
    drwxrwxrwx - root supergroup 0 2012-10-09 10:03 /tmp/hadoop-mapred/mapred/staging
    drwx------ - mapred supergroup 0 2012-10-09 10:03 /tmp/hadoop-mapred/mapred/system

    Please let me know what I'm doing wrong here. I can't seem to run a MapReduce job as root.

    Thanks,
    Aishwarya

Viewing 6 replies - 31 through 36 (of 36 total)

The topic ‘Permission issues while running mr-jobs’ is closed to new replies.

  • Author
    Replies
  • #10750

    Is there some way to print the effective value of mapred.system.dir? I have a feeling it is currently set to /tmp/hadoop-mapred/mapred/system.
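
    (One quick check, sketched under the assumption that the config lives in the usual HDP 1.x location /etc/hadoop/conf; adjust the path for your install:

    grep -A1 "mapred.system.dir" /etc/hadoop/conf/mapred-site.xml   # in a site file the value element follows the name element

    If nothing turns up, the property is unset and the stock Hadoop 1.x default applies; see the derivation in the next reply.)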

    I am listing the contents of the config files:

    hdfs-site.xml

    dfs.block.size = 134217728
    dfs.name.dir = /state/partition1/apache-hdfs/namedir
    dfs.hosts.exclude = /usr/lib/hadoop/conf/apache-hdfs/hdfs/dfs.exclude
    dfs.secondary.http.address = aishdev.local:50090
    dfs.http.address = aishdev.local:50070
    dfs.name.dir = /state/partition1/apache-hdfs/snamedir
    dfs.block.size = 134217728

    core-site.xml

    fs.default.name = hdfs://aishdev.local:9800/
    topology.script.file.name = /opt/rocks/bin/hadoop-topology
    io.file.buffer.size = 131072
    io.file.buffer.size = 131072

    mapred-site.xml

    mapred.job.tracker = aishdev:9801
    io.sort.mb = 160
    io.sort.spill.percent = 1.0
    io.sort.factor = 100
    mapred.child.java.opts = -Xmx1g
    mapred.jobtracker.taskScheduler = org.apache.hadoop.mapred.FairScheduler
    io.sort.record.percent = 0.138

    #10749

    Sasha J
    Moderator

    Works fine for me…

    [root@node ~]# su - hdfs -c "hadoop fs -mkdir /user/root"
    [root@node ~]# su - hdfs -c "hadoop fs -chown -R root /user/root"
    [root@node ~]# hadoop jar /usr/lib/hadoop/hadoop-examples.jar pi 5 5
    Number of Maps = 5
    Samples per Map = 5
    Wrote input for Map #0
    Wrote input for Map #1
    Wrote input for Map #2
    Wrote input for Map #3
    Wrote input for Map #4
    Starting Job
    12/10/09 12:44:12 INFO mapred.FileInputFormat: Total input paths to process : 5
    12/10/09 12:44:12 INFO mapred.JobClient: Running job: job_201210091143_0004
    12/10/09 12:44:13 INFO mapred.JobClient: map 0% reduce 0%
    12/10/09 12:44:19 INFO mapred.JobClient: map 40% reduce 0%
    12/10/09 12:44:22 INFO mapred.JobClient: map 60% reduce 0%
    12/10/09 12:44:23 INFO mapred.JobClient: map 80% reduce 0%
    12/10/09 12:44:25 INFO mapred.JobClient: map 100% reduce 0%
    12/10/09 12:44:30 INFO mapred.JobClient: map 100% reduce 100%
    12/10/09 12:44:31 INFO mapred.JobClient: Job complete: job_201210091143_0004
    12/10/09 12:44:31 INFO mapred.JobClient: Counters: 30
    12/10/09 12:44:31 INFO mapred.JobClient: Job Counters
    12/10/09 12:44:31 INFO mapred.JobClient: Launched reduce tasks=1
    12/10/09 12:44:31 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=18142
    12/10/09 12:44:31 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
    12/10/09 12:44:31 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
    12/10/09 12:44:31 INFO mapred.JobClient: Launched map tasks=5
    12/10/09 12:44:31 INFO mapred.JobClient: Data-local map tasks=5
    12/10/09 12:44:31 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=11352
    12/10/09 12:44:31 INFO mapred.JobClient: File Input Format Counters
    12/10/09 12:44:31 INFO mapred.JobClient: Bytes Read=590
    12/10/09 12:44:31 INFO mapred.JobClient: File Output Format Counters
    12/10/09 12:44:31 INFO mapred.JobClient: Bytes Written=97
    12/10/09 12:44:31 INFO mapred.JobClient: FileSystemCounters
    12/10/09 12:44:31 INFO mapred.JobClient: FILE_BYTES_READ=65
    12/10/09 12:44:31 INFO mapred.JobClient: HDFS_BYTES_READ=1170
    12/10/09 12:44:31 INFO mapred.JobClient: FILE_BYTES_WRITTEN=161531
    12/10/09 12:44:31 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=215
    12/10/09 12:44:31 INFO mapred.JobClient: Map-Reduce Framework
    12/10/09 12:44:31 INFO mapred.JobClient: Map output materialized bytes=175
    12/10/09 12:44:31 INFO mapred.JobClient: Map input records=5
    12/10/09 12:44:31 INFO mapred.JobClient: Reduce shuffle bytes=175
    12/10/09 12:44:31 INFO mapred.JobClient: Spilled Records=20
    12/10/09 12:44:31 INFO mapred.JobClient: Map output bytes=90
    12/10/09 12:44:31 INFO mapred.JobClient: Total committed heap usage (bytes)=1200553984
    12/10/09 12:44:31 INFO mapred.JobClient: CPU time spent (ms)=3410
    12/10/09 12:44:31 INFO mapred.JobClient: Map input bytes=120
    12/10/09 12:44:31 INFO mapred.JobClient: SPLIT_RAW_BYTES=580
    12/10/09 12:44:31 INFO mapred.JobClient: Combine input records=0
    12/10/09 12:44:31 INFO mapred.JobClient: Reduce input records=10
    12/10/09 12:44:31 INFO mapred.JobClient: Reduce input groups=10
    12/10/09 12:44:31 INFO mapred.JobClient: Combine output records=0
    12/10/09 12:44:31 INFO mapred.JobClient: Physical memory (bytes) snapshot=742219776
    12/10/09 12:44:31 INFO mapred.JobClient: Reduce output records=0
    12/10/09 12:44:31 INFO mapred.JobClient: Virtual memory (bytes) snapshot=6451920896
    12/10/09 12:44:31 INFO mapred.JobClient: Map output records=10
    Job Finished in 19.425 seconds
    Estimated value of Pi is 3.68000000000000000000
    [root@node ~]#

    What is in your hdfs-site.xml, mapred-site.xml, and core-site.xml?
    As you can see, your job is trying to use /tmp/hadoop-mapred/mapred/system, which it should not.
    Did you make any changes in the configuration files?
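
    (Why that path appears: in stock Hadoop 1.x, mapred.system.dir defaults to ${hadoop.tmp.dir}/mapred/system, and hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}, so a JobTracker running as the mapred user resolves it to exactly /tmp/hadoop-mapred/mapred/system. A minimal sketch of setting it explicitly in mapred-site.xml follows; the property name is stock Hadoop 1.x, but the value /mapred/system is only an illustrative choice:

    <property>
      <name>mapred.system.dir</name>
      <!-- illustrative dedicated HDFS path; pick one that suits your layout -->
      <value>/mapred/system</value>
    </property>

    The directory should stay owned by mapred with mode 700; the point is to give it an explicit, dedicated location rather than to loosen its permissions.)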

    #10748

    I created a home directory for user ‘root’

    hadoop dfs -ls /user
    Found 1 items
    drwxr-xr-x - root supergroup 0 2012-10-09 12:04 /user/root

    But when I run the job I still get the same error.

    hadoop jar /usr/lib/hadoop/hadoop-examples.jar pi 5 5
    Number of Maps = 5
    Samples per Map = 5
    Wrote input for Map #0
    Wrote input for Map #1
    Wrote input for Map #2
    Wrote input for Map #3
    Wrote input for Map #4
    Starting Job
    12/10/09 12:06:33 INFO mapred.FileInputFormat: Total input paths to process : 5
    12/10/09 12:06:33 INFO mapred.JobClient: Running job: job_201210091002_0003
    12/10/09 12:06:34 INFO mapred.JobClient: map 0% reduce 0%
    12/10/09 12:06:34 INFO mapred.JobClient: Job complete: job_201210091002_0003
    12/10/09 12:06:34 INFO mapred.JobClient: Counters: 0
    12/10/09 12:06:34 INFO mapred.JobClient: Job Failed: Job initialization failed:
    org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=EXECUTE, inode="system":mapred:supergroup:rwx------

    The namenode logs indicate the following:

    2012-10-09 12:06:33,395 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9800, call create(/tmp/hadoop-mapred/mapred/system/job_201210091002_0003/jobToken, rwxr-xr-x, DFSClient_728313058, true, true, 3, 134217728) from 10.1.1.1:55198: error: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=EXECUTE, inode="system":mapred:supergroup:rwx------
    org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=EXECUTE, inode="system":mapred:supergroup:rwx------

    Do I also have to fix configuration settings elsewhere?

    #10747

    Sasha J
    Moderator

    Any user should be able to execute MapReduce jobs.
    Each user should have its own "home" directory in HDFS.
    For example:

    [root@node ~]# hadoop fs -ls /user
    Found 5 items
    drwxrwx--- - ambari_qa hdfs 0 2012-10-04 14:06 /user/ambari_qa
    drwxr-xr-x - hdfs hdfs 0 2012-10-09 11:47 /user/hdfs
    drwx------ - hive hdfs 0 2012-10-09 11:49 /user/hive
    drwxrwx--- - oozie hdfs 0 2012-09-11 18:28 /user/oozie
    drwxr-xr-x - templeton hdfs 0 2012-09-11 18:35 /user/templeton

    [root@node ~]# su - hbase -c "hadoop jar /usr/lib/hadoop/hadoop-examples.jar pi 5 5"
    Number of Maps = 5
    Samples per Map = 5
    org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode=”user”:hdfs:hdfs:rwxr-xr-x
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

    [root@node ~]# su - hdfs -c "hadoop fs -mkdir /user/hbase"

    [root@node ~]# su - hdfs -c "hadoop fs -chown -R hbase /user/hbase"
    [root@node ~]# hadoop fs -ls /user
    Found 6 items
    drwxrwx--- - ambari_qa hdfs 0 2012-10-04 14:06 /user/ambari_qa
    drwx------ - hbase hdfs 0 2012-10-09 11:51 /user/hbase
    drwxr-xr-x - hdfs hdfs 0 2012-10-09 11:47 /user/hdfs
    drwx------ - hive hdfs 0 2012-10-09 11:49 /user/hive
    drwxrwx--- - oozie hdfs 0 2012-09-11 18:28 /user/oozie
    drwxr-xr-x - templeton hdfs 0 2012-09-11 18:35 /user/templeton
    [root@node ~]# su - hbase -c "hadoop jar /usr/lib/hadoop/hadoop-examples.jar pi 5 5"
    Number of Maps = 5
    Samples per Map = 5
    Wrote input for Map #0
    Wrote input for Map #1
    Wrote input for Map #2
    Wrote input for Map #3
    Wrote input for Map #4
    Starting Job
    12/10/09 11:51:46 INFO mapred.FileInputFormat: Total input paths to process : 5
    12/10/09 11:51:47 INFO mapred.JobClient: Running job: job_201210091143_0003
    12/10/09 11:51:48 INFO mapred.JobClient: map 0% reduce 0%
    12/10/09 11:51:54 INFO mapred.JobClient: map 40% reduce 0%
    12/10/09 11:51:57 INFO mapred.JobClient: map 60% reduce 0%
    12/10/09 11:51:59 INFO mapred.JobClient: map 80% reduce 0%
    12/10/09 11:52:00 INFO mapred.JobClient: map 100% reduce 0%
    12/10/09 11:52:06 INFO mapred.JobClient: map 100% reduce 100%

    #10746

    Yes, running the MapReduce job as the same user who started the JobTracker (mapred in this case) works. But we want any user to be able to run jobs on this cluster. Is that not possible with HDP 1.1? Also, were there any permission-related changes in HDP 1.1?

    #10745

    Sasha J
    Moderator

    In general, you should not run anything as root.
    Run your job as the mapred user and it will execute normally.
    The directory /tmp/hadoop-mapred/mapred/system is used by the JobTracker, and no other user should touch it.
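
    A quick way to verify, reusing the example jar shown elsewhere in this thread:

    # submit as the user that owns the system directory
    su - mapred -c "hadoop jar /usr/lib/hadoop/hadoop-examples.jar pi 5 5"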
