A recurrent question on the various Hadoop mailing lists is “why does Hadoop prefer a set of separate disks to the same set managed as a RAID-0 disk array?”
It’s about time and snowflakes.
JBOD and the Allure of RAID-0
In Hadoop clusters, we recommend treating each disk separately, in a configuration that is known, somewhat disparagingly, as “JBOD”: Just a Box of Disks.
In comparison, RAID-0, which is a bit of a misnomer, there being no redundancy, stripes data across all the disks in the array.…
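To make the difference between the two layouts concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular RAID controller or HDFS itself is implemented: the stripe size, the function names, and the round-robin JBOD placement are all assumptions chosen for the example.

```python
# Hypothetical stripe unit in bytes; real arrays make this configurable.
STRIPE_SIZE = 64 * 1024


def raid0_disk_for_offset(offset, num_disks, stripe_size=STRIPE_SIZE):
    """Index of the disk holding the byte at a logical offset in a RAID-0 array."""
    return (offset // stripe_size) % num_disks


def raid0_disks_touched(offset, length, num_disks, stripe_size=STRIPE_SIZE):
    """Set of disks involved in reading `length` bytes (length > 0) from `offset`."""
    first_stripe = offset // stripe_size
    last_stripe = (offset + length - 1) // stripe_size
    return {s % num_disks for s in range(first_stripe, last_stripe + 1)}


def jbod_disk_for_block(block_index, num_disks):
    """Simplified JBOD placement: each whole block lives on exactly one disk,
    assigned round-robin (an assumption made for this sketch)."""
    return block_index % num_disks


if __name__ == "__main__":
    disks = 4
    # A read spanning four stripe units touches every disk in the RAID-0 array.
    print(sorted(raid0_disks_touched(0, 4 * STRIPE_SIZE, disks)))
    # Under JBOD, a whole block is read from a single disk.
    print(jbod_disk_for_block(5, disks))
```

With four disks, any sufficiently large sequential read in the RAID-0 case involves all four of them, while in the JBOD case each block is served by one disk and the others are free to serve other blocks.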