YARN Forum

yarn-utils.py calculations

  • #54688
    Tom Stewart

    I ran yarn-utils.py and got the following output:

    [scripts]# python yarn-utils.py -c 12 -m 96 -d 6 -k True
    Using cores=12 memory=96GB disks=6 hbase=True
    Profile: cores=12 memory=69632MB reserved=28GB usableMem=68GB disks=6
    Num Container=11
    Container Ram=6144MB
    Used Ram=66GB
    Unused Ram=28GB
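
    To see where those numbers come from, here is a rough re-derivation of the container math for my inputs (cores=12, memory=96GB, disks=6, hbase=True). This is my own sketch, not the actual script: the reserve tables and the minimum-container-size rule are taken from the published HDP sizing recommendations, so treat them as assumptions.

    ```python
    import math

    GB = 1024  # MB per GB

    # HDP-recommended reserves (total RAM in GB -> reserve in GB); assumed, not
    # read from yarn-utils.py itself.
    os_reserve    = {4: 1, 8: 2, 16: 2, 24: 4, 48: 6, 64: 8, 72: 8, 96: 12,
                     128: 24, 256: 32, 512: 64}
    hbase_reserve = {4: 1, 8: 1, 16: 2, 24: 4, 48: 8, 64: 8, 72: 8, 96: 16,
                     128: 24, 256: 32, 512: 64}

    def profile(cores, memory_gb, disks, hbase=True):
        # Total reserved = OS reserve plus HBase reserve when HBase is enabled
        reserved = os_reserve[memory_gb] + (hbase_reserve[memory_gb] if hbase else 0)
        usable_mb = (memory_gb - reserved) * GB
        # Minimum container size grows with usable RAM (HDP guideline)
        min_container = 2048 if usable_mb > 24 * GB else 1024
        # Containers: bounded by cores, by spindles, and by memory
        containers = min(2 * cores,
                         int(math.ceil(1.8 * disks)),
                         usable_mb // min_container)
        # Per-container RAM, rounded down to a multiple of 512 MB
        ram_per_container = max(min_container,
                                (usable_mb // containers) // 512 * 512)
        return containers, ram_per_container

    containers, ram = profile(12, 96, 6, hbase=True)
    print(containers, ram)         # 11 containers of 6144 MB
    print(containers * ram // GB)  # Used RAM = 66 GB
    ```

    With these assumptions the numbers reproduce the script's output exactly: reserved = 12 + 16 = 28GB, usable = 68GB, and the disk bound ceil(1.8 * 6) = 11 is what caps the container count.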

    When I compare that to the following page:

    Shouldn’t these values be equal per the calculation table?

    Configuration File      | Configuration Setting                | Value Calculation
    ------------------------|--------------------------------------|---------------------------------
    yarn-site.xml           | yarn.nodemanager.resource.memory-mb  | = Containers * RAM-per-Container
    yarn-site.xml           | yarn.scheduler.minimum-allocation-mb | = RAM-per-Container
    yarn-site.xml           | yarn.scheduler.maximum-allocation-mb | = Containers * RAM-per-Container
    mapred-site.xml         | mapreduce.map.memory.mb              | = RAM-per-Container
    mapred-site.xml         | mapreduce.reduce.memory.mb           | = 2 * RAM-per-Container
    mapred-site.xml         | mapreduce.map.java.opts              | = 0.8 * RAM-per-Container
    mapred-site.xml         | mapreduce.reduce.java.opts           | = 0.8 * 2 * RAM-per-Container
    yarn-site.xml (check)   | yarn.app.mapreduce.am.resource.mb    | = 2 * RAM-per-Container
    yarn-site.xml (check)   | yarn.app.mapreduce.am.command-opts   | = 0.8 * 2 * RAM-per-Container

    The values the table calculates as “2 * RAM-per-Container” don’t appear to be computed that way in the Python script. Which values should I use for my cluster: the ones I calculate from the web page, or the ones from the script?
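
    For concreteness, this is what the table's formulas yield when I plug in the script's RAM-per-Container = 6144MB and 11 containers. Again my own sketch of the table arithmetic, not output from the script:

    ```python
    # Apply the calculation table's formulas to the values yarn-utils.py
    # printed above (containers=11, RAM-per-Container=6144 MB).
    containers, ram = 11, 6144

    settings = {
        "yarn.nodemanager.resource.memory-mb":  containers * ram,    # 67584
        "yarn.scheduler.minimum-allocation-mb": ram,                 # 6144
        "yarn.scheduler.maximum-allocation-mb": containers * ram,    # 67584
        "mapreduce.map.memory.mb":              ram,                 # 6144
        "mapreduce.reduce.memory.mb":           2 * ram,             # 12288
        "mapreduce.map.java.opts":              int(0.8 * ram),      # 4915
        "mapreduce.reduce.java.opts":           int(0.8 * 2 * ram),  # 9830
        "yarn.app.mapreduce.am.resource.mb":    2 * ram,             # 12288
        "yarn.app.mapreduce.am.command-opts":   int(0.8 * 2 * ram),  # 9830
    }
    for key, value in settings.items():
        print(key, "=", value)
    ```

    It is these "2 * RAM-per-Container" lines (reduce memory and the AM settings) that I cannot reconcile with the script's output.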

