This post is a follow-up to our previous post, “Snapshots for HDFS”.
In June we posted an early prototype of snapshots that allowed us to experiment with a few ideas in HDFS-2802. Since then we have added more detail to the design document and made significant progress on a brand-new implementation (over 40 subtasks in HDFS-2802).
Some of the highlights of this new design include:
- Read-only copy-on-write (COW) snapshots (which can be extended to read-write later)
- Snapshots of the entire namespace or of subdirectories
- Snapshots are managed by the admin, but users are allowed to take snapshots
- Snapshots are efficient:
  - Creation is instantaneous, with O(1) cost.
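
To make the O(1) creation claim concrete, here is a minimal toy sketch of read-only copy-on-write snapshots in Python. It is not the HDFS implementation: the class `SnapshottableDir` and all of its methods are hypothetical names invented for illustration, and a directory is modeled as a plain dictionary. The idea it shows is that taking a snapshot only records an empty diff, and old data is saved lazily the first time a file changes afterwards.

```python
class SnapshottableDir:
    """Toy model (hypothetical, not HDFS code) of a directory with
    read-only copy-on-write snapshots."""

    _MISSING = object()  # marks "file did not exist when the snapshot was taken"

    def __init__(self):
        self.files = {}      # current state: file name -> contents
        self.snapshots = []  # (snapshot name, diff dict), oldest first

    def create_snapshot(self, name):
        # O(1): only an empty diff is recorded; no file data is copied.
        self.snapshots.append((name, {}))

    def write(self, filename, contents):
        self._save_old(filename)
        self.files[filename] = contents

    def delete(self, filename):
        self._save_old(filename)
        self.files.pop(filename, None)

    def _save_old(self, filename):
        # Copy-on-write: before the first change to a file after the latest
        # snapshot, remember the value the file had at snapshot time.
        if self.snapshots:
            diff = self.snapshots[-1][1]
            if filename not in diff:
                diff[filename] = self.files.get(filename, self._MISSING)

    def read_snapshot(self, name):
        # Rebuild the snapshot view by undoing diffs, newest first,
        # until the requested snapshot has been processed.
        view = dict(self.files)
        for snap_name, diff in reversed(self.snapshots):
            for filename, old in diff.items():
                if old is self._MISSING:
                    view.pop(filename, None)
                else:
                    view[filename] = old
            if snap_name == name:
                return view
        raise KeyError(name)
```

In this sketch the cost of `create_snapshot` does not depend on how many files the directory holds; the cost of preserving old data is paid only by later writes and deletes, and only once per file per snapshot.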