Best Practices for Cluster Network Configuration


What should one keep in mind when configuring the network for a Hadoop cluster?


The following practices are recommended when configuring the network for a stable and performant Hadoop cluster.

  • Machines should be on a network isolated from the rest of the data center, so that no other applications or nodes share network I/O with the Hadoop infrastructure. Hadoop is I/O intensive, and removing outside traffic helps keep the cluster performant.
  • Machines should have static IPs, which keeps the network configuration stable. With dynamic IPs, a machine’s IP address could change on reboot or when its DHCP lease expires, and this would cause the Hadoop services to malfunction.
  • Reverse DNS should be set up. Reverse DNS ensures that a node’s hostname can be looked up from its IP address. Certain Hadoop functionality requires reverse DNS.
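
One way to verify the DNS recommendation above is to confirm that each node's hostname resolves to an IP address and that the same IP address resolves back to that hostname. The following is a minimal Python sketch of such a check, not an official Hadoop or Hortonworks tool. The function name check_dns_consistency and the injectable forward/reverse parameters are assumptions made for illustration; only the standard library socket calls are real.

```python
import socket


def check_dns_consistency(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Check that hostname -> IP -> hostname round-trips.

    Returns a (ok, detail) tuple. The forward and reverse resolvers are
    parameters (an illustrative design choice) so the check can be
    exercised without a live DNS server.
    """
    # Forward lookup: hostname to IPv4 address.
    try:
        ip = forward(hostname)
    except OSError as exc:
        return False, f"forward lookup failed for {hostname}: {exc}"

    # Reverse lookup: IP address back to (name, aliases, addresses).
    try:
        reverse_name, aliases, _ = reverse(ip)
    except OSError as exc:
        return False, f"reverse lookup failed for {ip}: {exc}"

    # Compare case-insensitively against the primary name and any aliases.
    names = {reverse_name.lower()} | {alias.lower() for alias in aliases}
    if hostname.lower() not in names:
        return False, f"{ip} resolves back to {reverse_name}, not {hostname}"
    return True, f"{hostname} <-> {ip}"
```

Calling check_dns_consistency with the fully qualified name of each cluster node, from each node, should return True as the first element when forward and reverse DNS agree. A False result includes a short description of which lookup failed.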

