Parallel Reducer execution never goes above two…


  • Topic #50719

    Rupert Bailey
    Participant

    I can’t seem to get my HDP instance to run more than two reducers in parallel, even though I’ve got 4 processors, 8 GB of RAM, and small JVMs allocated to the instance. I am running a terasort over 10M teragen’ed rows, requesting 4 reducers with the following command:

    hadoop jar /usr/lib/hadoop/hadoop-examples.jar terasort -Dmapred.job.queue.name=default -Dmapred.reduce.tasks=4 ./terasort-input ./terasort-output
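
    For reference, the input for a run like this would have been generated beforehand with teragen; a minimal sketch, assuming the same example jar, queue, and input path as in the command above (10,000,000 rows matching the 10M-row data set described):

    hadoop jar /usr/lib/hadoop/hadoop-examples.jar teragen -Dmapred.job.queue.name=default 10000000 ./terasort-input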

    Values that I have set in the mapred-site.xml file are:
    mapred.cluster.map.memory.mb=768
    mapred.tasktracker.map.tasks.maximum=8
    mapred.cluster.max.map.memory.mb=6144
    mapred.job.map.memory.mb=1536
    mapred.cluster.reduce.memory.mb=512
    mapred.tasktracker.reduce.tasks.maximum=8
    mapred.cluster.max.reduce.memory.mb=4096
    mapred.job.reduce.memory.mb=2048
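
    In case the raw XML form is useful, these are ordinary <property> elements in mapred-site.xml; a minimal sketch of just the reduce-side entries, using the values listed above:

    <!-- mapred-site.xml (reduce-side entries only, values as listed above) -->
    <property>
      <name>mapred.cluster.reduce.memory.mb</name>
      <value>512</value>
    </property>
    <property>
      <name>mapred.tasktracker.reduce.tasks.maximum</name>
      <value>8</value>
    </property>
    <property>
      <name>mapred.cluster.max.reduce.memory.mb</name>
      <value>4096</value>
    </property>
    <property>
      <name>mapred.job.reduce.memory.mb</name>
      <value>2048</value>
    </property>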

    capacity-scheduler.xml:
    mapred.capacity-scheduler.queue.default.capacity=50
    (4 reducers appear on localhost:50030/scheduler for this default queue)
    mapred.capacity-scheduler.queue.queue1.capacity=25
    mapred.capacity-scheduler.queue.queue2.capacity=25
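
    The same values as <property> elements in capacity-scheduler.xml; again just a sketch:

    <!-- capacity-scheduler.xml (queue capacities as listed above) -->
    <property>
      <name>mapred.capacity-scheduler.queue.default.capacity</name>
      <value>50</value>
    </property>
    <property>
      <name>mapred.capacity-scheduler.queue.queue1.capacity</name>
      <value>25</value>
    </property>
    <property>
      <name>mapred.capacity-scheduler.queue.queue2.capacity</name>
      <value>25</value>
    </property>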

    I have the following specifications:

    HDP Version: 1.3.2
    Install Method: Ambari
    Nodes: 1
    Operating System: CentOS 6.5 Desktop
    Virtual Machine: VMware (.vmx), 8 GB allocated RAM, 4 allocated virtual CPUs
    Physical Machine: 16 GB RAM, 2-core hyper-threaded i5-3320 (4 threads)

    What do I need to tweak to get this to run 4 reducers at once? The job does run 4 reducers in total, but never more than two at the same time.


  • Reply #50771

    Rupert Bailey
    Participant

    Okay, the solution was to increase mapred.cluster.reduce.memory.mb:

    mapred.cluster.reduce.memory.mb=768  # allowed 2 reducers to run at once
    mapred.cluster.reduce.memory.mb=1536 # allowed all 4 reducers to run at once

    So it seems this is a cluster-wide setting that defines the memory size of a single reduce slot, and each reduce task is charged as many slots as it needs to cover its mapred.job.reduce.memory.mb. But that’s a guess; feel free to comment if you understand why this is the case. :)
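
    If that guess is right, the arithmetic lines up; a rough back-of-the-envelope, assuming each reduce task occupies ceil(mapred.job.reduce.memory.mb / slot size) of the 8 reduce slots per tasktracker:

    slot size 512:  ceil(2048/512)  = 4 slots per task -> floor(8/4) = 2 reducers at once
    slot size 768:  ceil(2048/768)  = 3 slots per task -> floor(8/3) = 2 reducers at once
    slot size 1536: ceil(2048/1536) = 2 slots per task -> floor(8/2) = 4 reducers at once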
