Multi-Tenancy in HDP 2.0: Capacity Scheduler and YARN
YARN and the Hortonworks Data Platform 2.0 enable one Hadoop cluster to share data and analytical processing capabilities across the enterprise. Organizations can use the Hortonworks Data Platform 2.0 to:
- Pool all enterprise data into one scalable and reliable storage platform
- Enable all analytical processing IN the data platform
- Provide access to this data and processing across all business units
The Capacity Scheduler (CS) ensures that groups of users and applications get a guaranteed share of the cluster, while maximizing overall utilization of the cluster. Resource allocation is elastic: if the cluster has idle resources, users and applications can take up more of the cluster than their guaranteed minimum share.
In HDP 1.x and MapReduce v1, the cluster’s capacity is measured in MapReduce slots. Each node in the cluster has a pre-defined set of slots, and the Capacity Scheduler ensures that a percentage of those slots is available to a set of users and groups.
In HDP 2.0, with YARN and MapReduce v2, the cluster capacity is measured as the physical resources (RAM now, and CPU as well in the future) that are available across the entire cluster.
In this blog post, we’ll walk through how to configure the YARN Capacity Scheduler to deliver multi-tenancy across different groups and users in your organization:
- Configuring guaranteed capacities across organizational groups
- Configuring resource limits to prevent users and applications from monopolizing the cluster
- Configuring access control and sharing for a set of users in an organizational group
- Managing the Capacity Scheduler through Ambari
Managing the Capacity Scheduler through Ambari
Apache Ambari in HDP 2.0 provides an interface to configure the Capacity Scheduler.
The Capacity Scheduler queues, capacities and user ACLs can be updated dynamically. Update the scheduler configuration, then run the following admin command on the cluster to refresh it without restarting any services:
$HADOOP_YARN_HOME/bin/yarn rmadmin -refreshQueues
Capacity Guarantees across Organizations
With the Capacity Scheduler, you can assign minimum guaranteed capacities to groups of users or applications. For instance, in an Enterprise Organization with three Business Units (BU) that share a cluster, each Business Unit can be assigned to a CS queue and guaranteed a minimum capacity.
For example, the following defines queues for the ‘Marketing’, ‘Finance’ and ‘Product’ Business Units in the cluster:
yarn.scheduler.capacity.root.queues=Marketing,Finance,Product
Cluster capacity is divided among the BUs. Capacity in YARN refers to the resources available across all the NodeManagers for YARN to assign to containers. Each BU is given a minimum guaranteed percentage of the total cluster capacity, and the guaranteed capacities of the queues must add up to 100%.
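As a sketch of how such a split might be written: the 50% share for Marketing matches the figure used in this post, while the 30/20 split between Finance and Product is an assumed value chosen only so that the total comes to 100.

```properties
# Guaranteed minimum share per queue, as a percentage of total cluster capacity.
# Marketing=50 is from this post; Finance=30 and Product=20 are assumed values.
yarn.scheduler.capacity.root.Marketing.capacity=50
yarn.scheduler.capacity.root.Finance.capacity=30
yarn.scheduler.capacity.root.Product.capacity=20
```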
These queues will enforce cluster resource sharing across any applications submitted to them. The Marketing organizational group will get a minimum share of 50% of the cluster across all its applications (MapReduce v2, Tez, graph processing, etc.), regardless of the type of application.
Enforcing Capacity Limits
The CS provides elastic resource scheduling, which means that if some of the resources in the cluster are idle, a queue can take up more of the cluster capacity than the minimum allocated to it in the above configuration. This elasticity can be controlled via maximum capacity allocations. The following ensures that the Marketing organization can spike its cluster utilization only up to 75% of total capacity when resources are available:
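A minimal sketch of that cap, using the queue's `maximum-capacity` property with the 75% figure described here:

```properties
# Marketing may grow past its guaranteed share when resources are idle,
# but never beyond 75% of total cluster capacity.
yarn.scheduler.capacity.root.Marketing.maximum-capacity=75
```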
With YARN, long-running applications (such as streaming apps) and short-lived jobs share resources in the same cluster. To plan for the capacity that long-running applications will request and hold, you can configure a separate queue for them. The following set of properties adds two queues for Marketing and assigns a guaranteed share of 25% to long-running Marketing applications, while capping their cluster utilization at 25% as well.
yarn.scheduler.capacity.root.queues=Marketing-longrunning,Marketing-adhoc,Finance,Product
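The capacity settings that go with this queue list might look like the sketch below. The 25% guarantee and 25% cap for Marketing-longrunning come from the description above. The remaining values are assumptions made for illustration: Marketing-adhoc takes the other half of Marketing's 50% and keeps a 75% ceiling, and Finance and Product are given 30% and 20% so that the guarantees total 100.

```properties
# Long-running Marketing apps: guaranteed 25%, and capped at 25% (no elasticity).
yarn.scheduler.capacity.root.Marketing-longrunning.capacity=25
yarn.scheduler.capacity.root.Marketing-longrunning.maximum-capacity=25
# Ad hoc Marketing jobs: assumed 25% guarantee with an assumed 75% ceiling.
yarn.scheduler.capacity.root.Marketing-adhoc.capacity=25
yarn.scheduler.capacity.root.Marketing-adhoc.maximum-capacity=75
# Assumed shares for the other Business Units (25 + 25 + 30 + 20 = 100).
yarn.scheduler.capacity.root.Finance.capacity=30
yarn.scheduler.capacity.root.Product.capacity=20
```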
Enforcing Access Control and User Limits
The queues set up above define how cluster utilization is split across the Business Units. Each queue is given an Access Control List (ACL) that specifies which users and groups can submit jobs to the queue.
yarn.scheduler.capacity.root.Marketing-adhoc.acl_submit_applications=Samantha,Rahul,Maria,William
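A note on the value format, with a variant of the line above as a sketch: Hadoop ACL strings list comma-separated users, then a single space, then comma-separated groups, so user names should be separated by commas only. The group name `marketing-analysts` and the choice of `Samantha` as queue administrator are hypothetical. Queue ACLs are only enforced when `yarn.acl.enable` is set to true in yarn-site.xml, and because a user permitted on a parent queue is also permitted on its children, the root queue's own ACL (which allows everyone by default) usually needs tightening before a child queue's ACL has any effect.

```properties
# Users (comma-separated), one space, then groups (comma-separated).
# "marketing-analysts" is a hypothetical group name.
yarn.scheduler.capacity.root.Marketing-adhoc.acl_submit_applications=Samantha,Rahul,Maria,William marketing-analysts
# Hypothetical queue administrator for the same queue.
yarn.scheduler.capacity.root.Marketing-adhoc.acl_administer_queue=Samantha
```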
To ensure that one user does not monopolize the Marketing-adhoc queue, a common user limit can be configured for the queue. The following guarantees each user a minimum of 25% of the Marketing-adhoc queue when all users have jobs running in that queue at the same time.
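A minimal sketch of that setting, assuming the standard `minimum-user-limit-percent` queue property is the one intended:

```properties
# Each active user in Marketing-adhoc is guaranteed at least 25% of the queue.
yarn.scheduler.capacity.root.Marketing-adhoc.minimum-user-limit-percent=25
```

With the four users listed in the ACL all active, each gets a quarter of the queue. When fewer users are active, each can receive a larger share (for example, two active users can each get up to half).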
With HDP 2.0 Beta, you can use Apache Ambari to configure YARN for multi-tenant use in your Enterprise organization. Download HDP 2.0 Beta and deploy today!
For further understanding of how the Capacity Scheduler works, check out Arun’s detailed blog post.