Apache Hadoop clusters grow and change with use. Maybe you used Apache Ambari to build your initial cluster with a base set of Hadoop services targeting known use cases and now you want to add other services for new use cases. Or you may just need to expand the storage and processing capacity of the cluster.
Ambari can help in both scenarios. In this blog, we’ll cover a few different ways that Ambari can help you expand your cluster.
You can add more hosts to the cluster and assign these hosts to run as DataNodes and NodeManagers. This allows you to expand both your HDFS storage capacity and your YARN processing power.
Learn more about “Adding Hosts to a Cluster” in the Ambari User’s Guide.
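Everything the Add Hosts wizard does is also exposed through Ambari's REST API. The sketch below is a rough map of the call sequence, assuming a hypothetical Ambari server URL, cluster name, and hostname; it only builds the (method, path, body) tuples rather than sending them, so you can see the shape of the requests.

```python
# Sketch of the REST calls behind "Add Hosts" in Ambari.
# AMBARI, the cluster name, and the hostname below are all hypothetical.
AMBARI = "http://ambari.example.com:8080/api/v1"

def add_host_calls(cluster, host):
    """Return the (method, path, json_body) sequence that registers a new
    host with the cluster and assigns DataNode and NodeManager to it."""
    base = f"{AMBARI}/clusters/{cluster}"
    calls = [("POST", f"{base}/hosts/{host}", None)]  # register the host
    for comp in ("DATANODE", "NODEMANAGER"):
        comp_path = f"{base}/hosts/{host}/host_components/{comp}"
        calls += [
            ("POST", comp_path, None),                               # create
            ("PUT", comp_path, {"HostRoles": {"state": "INSTALLED"}}),  # install
            ("PUT", comp_path, {"HostRoles": {"state": "STARTED"}}),    # start
        ]
    return calls
```

In practice you would send each tuple with an authenticated HTTP client and poll the request resources Ambari returns; the wizard handles that polling for you.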
If your cluster already has enough hosts but some of them are not running every component they could, you can expand capacity by adding components to those existing machines. For example, if you have hosts that are not running DataNode or NodeManager components, you can add those components to the hosts from the Hosts page in Ambari Web.
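The same operation is available via the REST API: create the component on the host, then drive it through the INSTALLED and STARTED states. A minimal sketch, assuming hypothetical server and cluster names, that builds the three calls without sending them:

```python
def add_component_calls(ambari, cluster, host, component):
    """(method, path, json_body) calls that add one component
    (e.g. DATANODE or NODEMANAGER) to a host already in the cluster."""
    path = f"{ambari}/clusters/{cluster}/hosts/{host}/host_components/{component}"
    return [
        ("POST", path, None),                                  # create it
        ("PUT", path, {"HostRoles": {"state": "INSTALLED"}}),  # install it
        ("PUT", path, {"HostRoles": {"state": "STARTED"}}),    # start it
    ]
```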
Suppose you want to expand your cluster capacity by replacing older hosts: retiring (i.e. “deleting”) them from the cluster and adding new hosts with updated hardware for more memory, drives and CPU power. Earlier we covered adding hosts to the cluster; in this example, we will first remove the older hosts.
Note: be sure to decommission the slave components (such as DataNode and NodeManager) on the host before performing the delete. Decommissioning a DataNode gives HDFS time to re-replicate its blocks to other hosts, so removing the host does not cause data loss.
Learn more about “Deleting a Host from a Cluster” in the Ambari User’s Guide.
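For the curious, a decommission in Ambari is modeled as a custom command sent to the relevant *master* component (NameNode for DataNodes, ResourceManager for NodeManagers), followed by the host delete. The sketch below builds that call sequence under assumed server, cluster, and host names; the request body shape follows the Ambari REST API's DECOMMISSION command.

```python
def retire_host_calls(ambari, cluster, host):
    """(method, path, json_body) calls that decommission a host's slave
    components and then delete the host. Names here are hypothetical."""
    base = f"{ambari}/clusters/{cluster}"

    def decommission(slave, master, service):
        # DECOMMISSION is addressed to the master that manages the slave.
        return ("POST", f"{base}/requests", {
            "RequestInfo": {
                "command": "DECOMMISSION",
                "context": f"Decommission {slave}",
                "parameters": {"slave_type": slave, "excluded_hosts": host},
            },
            "Requests/resource_filters": [
                {"service_name": service, "component_name": master},
            ],
        })

    return [
        decommission("DATANODE", "NAMENODE", "HDFS"),
        decommission("NODEMANAGER", "RESOURCEMANAGER", "YARN"),
        # Only after the slave components are decommissioned (and stopped)
        # is it safe to remove the host itself.
        ("DELETE", f"{base}/hosts/{host}", None),
    ]
```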
Ambari supports installing and managing Services that are logically grouped into a “Stack.” When you perform the initial cluster install, you select the services to include in the cluster. For example, you might initially select HDFS, YARN, MapReduce and Tez. Over time, as your Hadoop needs expand, you might want to add other Services to that cluster, such as Hive or HBase.
In Ambari Web, choose Actions > Add Service to launch the Add Service wizard, which walks you through selecting the services to add, assigning their master and slave components to hosts, and customizing their configurations before install.
Learn more about “Adding a Service” in the Ambari User’s Guide.
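Under the covers, adding a service is again a REST API sequence: create the service, create its components, then move the service to INSTALLED and STARTED. This is a rough sketch with hypothetical names; a real install also needs host assignments and configuration, which the Add Service wizard generates for you.

```python
def add_service_calls(ambari, cluster, service, components):
    """Rough (method, path, json_body) sequence to add a new service
    (e.g. HBASE) and its components to a cluster."""
    base = f"{ambari}/clusters/{cluster}/services"
    calls = [("POST", f"{base}/{service}", None)]  # create the service
    for comp in components:
        # declare each component the service will run
        calls.append(("POST", f"{base}/{service}/components/{comp}", None))
    calls += [
        ("PUT", f"{base}/{service}", {"ServiceInfo": {"state": "INSTALLED"}}),
        ("PUT", f"{base}/{service}", {"ServiceInfo": {"state": "STARTED"}}),
    ]
    return calls
```

Example: `add_service_calls(AMBARI, "mycluster", "HBASE", ["HBASE_MASTER", "HBASE_REGIONSERVER", "HBASE_CLIENT"])` yields the create/install/start sequence for HBase.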