A distributed Java-based file system for storing large volumes of data
HDFS and YARN form the data management layer of Apache Hadoop. YARN is the architectural center of Hadoop, the resource management framework that enables the enterprise to process data in multiple ways simultaneously—for batch, interactive and real-time data workloads on one shared dataset. YARN provides the resource management and HDFS provides the scalable, fault-tolerant, cost-efficient storage for big data.
HDFS is a Java-based file system that provides scalable and reliable data storage, and it was designed to span large clusters of commodity servers. HDFS has demonstrated production scalability of up to 200 PB of storage and a single cluster of 4500 servers, supporting close to a billion files and blocks. When that quantity and quality of enterprise data is available in HDFS, and YARN enables multiple data access applications to process it, Hadoop users can confidently answer questions that eluded previous data platforms.
In HDP 2.2, the rolling upgrade feature and the underlying HDFS High Availability configuration enable Hadoop operators to upgrade the cluster software and restart upgraded services, without taking the entire cluster down.
Hortonworks Focus for HDFS
Since its first deployment at Yahoo in 2006, HDFS has established itself as the defacto scalable, reliable and robust file system for big data. Since then, HDFS has addressed several fundamental challenges of distributed storage at unparalleled scale and with enterprise rigor.
The Apache community continues innovating. For example, a new initiative called Ozone introduces an object store, which extends HDFS beyond a file system, toward a more versatile enterprise object-enables storage layer for use cases such as storing all the photos uploaded on Facebook or all the email attachments in Gmail.
Fortune 1000 CIOs see the Hadoop cluster’s storage and compute resources as a valuable infrastructure for running both Hadoop and non-Hadoop applications and services. This emerging trend of PaaS-on-Hadoop, propelled by YARN, opens up the Hadoop infrastructure to new use cases. Object store is a natural fit for the storage component of a PaaS model, and the Apache community is thrilled to work on adding these new capabilities to HDFS and Apache Hadoop.
The Apache Hadoop HDFS team is working on the following improvements:
|Reliable and Secure Operations||
|Scalability and Efficiency||
|Support for Heterogenous Hardware||
Recent Progress in HDFS
What HDFS Does
HDFS is a scalable, fault-tolerant, distributed storage system that works closely with a wide variety of concurrent data access applications, coordinated by YARN. HDFS will “just work” under a variety of physical and systemic circumstances. By distributing storage and computation across many servers, the combined storage resource can grow linearly with demand while remaining economical at every amount of storage.
These specific features ensure that data is stored efficiently in a Hadoop cluster and that it is highly available:
|Rack awareness||Considers a node’s physical location when allocating storage and scheduling tasks|
|Minimal data motion||Hadoop moves compute processes to the data on HDFS and not the other way around. Processing tasks can occur on the physical node where the data resides, which significantly reduces network I/O and provides very high aggregate bandwidth.|
|Utilities||Dynamically diagnose the health of the file system and rebalance the data on different nodes|
|Rollback||Allows operators to bring back the previous version of HDFS after an upgrade, in case of human or systemic errors|
|Standby NameNode||Provides redundancy and supports high availability (HA)|
|Operability||HDFS requires minimal operator intervention, allowing a single operator to maintain a cluster of 1000s of nodes|
How HDFS Works
An HDFS cluster is comprised of a NameNode, which manages the cluster metadata, and DataNodes that store the data. Files and directories are represented on the NameNode by inodes. Inodes record attributes like permissions, modification and access times, or namespace and disk space quotas.
The file content is split into large blocks (typically 128 megabytes), and each block of the file is independently replicated at multiple DataNodes. The blocks are stored on the local file system on the DataNodes.
The Namenode actively monitors the number of replicas of a block. When a replica of a block is lost due to a DataNode failure or disk failure, the NameNode creates another replica of the block. The NameNode maintains the namespace tree and the mapping of blocks to DataNodes, holding the entire namespace image in RAM.
The NameNode does not directly send requests to DataNodes. It sends instructions to the DataNodes by replying to heartbeats sent by those DataNodes. The instructions include commands to:
- replicate blocks to other nodes,
- remove local block replicas,
- re-register and send an immediate block report, or
- shut down the node.