The Hortonworks Blog

More from Arpit Agarwal

The Hadoop Distributed File System (HDFS) is the reliable and scalable data storage core of the Hortonworks Data Platform (HDP). In HDP, HDFS and YARN combine to form the distributed operating system for your data platform, providing resource management for diverse workloads and scalable data storage for the next generation of analytical applications.

In this blog, we’ll describe the key concepts introduced by Heterogeneous Storage in HDFS and how they are utilized to enable key tiered storage scenarios.…

Hadoop has traditionally been used for batch processing data at large scale. Batch processing applications care more about raw sequential throughput than low-latency and hence the existing HDFS model where all attached storages are assumed to be spinning disks has worked well.

There is an increasing interest in using Hadoop for interactive query processing e.g. via Hive. Another class of applications makes use of random IO patterns e.g. HBase. Either class of application benefits from lower latency storage media.…