This blog covers our ongoing work on snapshots in Apache Hadoop HDFS: the motivation for the work, a high-level design, and some of the design choices we made. Having seen snapshots in use with various filesystems, I believe that adding snapshots to Apache Hadoop will be hugely valuable to the Hadoop community. With luck this work will be available to Hadoop users in late 2012 or 2013.
A snapshot is a point-in-time image of the entire filesystem or of a subtree of a filesystem. Snapshots are very useful in a number of scenarios, for example protecting against user errors and enabling consistent backups.
We considered two options for snapshots.
Option #1: Both the datanodes and the namenode are aware of snapshots and store snapshot state internally. A datanode knows that some of its blocks belong to snapshot files.
Option #2: Only the namenode is aware of snapshots. A datanode does not know that some of its blocks are referenced by snapshots of the original files.
We selected option #2 to keep the design simple. Taking a snapshot is also very fast with this option. Since the datanodes know nothing about snapshots, and in particular nothing about which blocks are shared between the live filesystem and its snapshots, all changes are confined to the namenode. Keeping the datanodes free of snapshot information eliminates the need for distributed coordination and simplifies the design immensely.
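To illustrate the idea behind option #2, here is a deliberately simplified model (not the actual HDFS implementation, and all class and method names are made up): a snapshot is just a namenode-side copy of the namespace metadata that keeps referencing the same block IDs, so the datanodes never have to change.

```python
# Simplified model of option #2: snapshots live entirely in the
# namenode's metadata; datanodes only ever see opaque block IDs.
# All names here are illustrative, not actual HDFS classes.
import copy

class MiniNamenode:
    def __init__(self):
        self.namespace = {}   # path -> list of block IDs
        self.snapshots = {}   # snapshot name -> frozen copy of the namespace

    def create_file(self, path, block_ids):
        self.namespace[path] = list(block_ids)

    def create_snapshot(self, name):
        # Only metadata is copied; no data blocks are touched,
        # which is why snapshot creation is fast.
        self.snapshots[name] = copy.deepcopy(self.namespace)

    def resolve(self, snapshot_name, path):
        # A snapshot file points at the same block IDs the
        # original file had at snapshot time.
        return self.snapshots[snapshot_name][path]

nn = MiniNamenode()
nn.create_file("/a/b/c/foo.txt", ["blk_1", "blk_2"])
nn.create_snapshot("hdfs1")
nn.create_file("/a/b/c/foo.txt", ["blk_3"])   # file later overwritten
print(nn.resolve("hdfs1", "/a/b/c/foo.txt"))  # -> ['blk_1', 'blk_2']
```

The key property the sketch shows is that creating a snapshot costs only a metadata copy on the namenode, independent of the amount of file data stored on the datanodes.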
A key requirement is that snapshots be very easy to create and delete. Snapshot creation and deletion is an admin-only capability. To create a snapshot, one specifies a snapshot name, a path to the root of the subtree whose snapshot is to be taken, and whether the snapshot is read-only (RO) or read-write (RW). Deleting a snapshot requires just the snapshot name. A command to list all the snapshots in the filesystem will also be provided.
Snapshots can be referenced with regular HDFS path names using a reserved string, .snapshot_<name>.
This has the benefit that all existing Hadoop commands and APIs that take a pathname can reference a snapshot simply by adding the reserved snapshot string to the path.
Examples: consider a file /a/b/c/foo.txt, and suppose the admin has created a snapshot named hdfs1 at /a/b. To access data in snapshot hdfs1, some example commands would be:
hadoop dfs -ls /a/b/.snapshot_hdfs1/c/foo.txt
To copy foo.txt from the snapshot branch to /foobar:
hadoop dfs -cp /a/b/.snapshot_hdfs1/c/foo.txt /foobar/.
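The path convention used in the examples above can be sketched as a small parser. This is a hypothetical helper, not part of HDFS; it just shows how a path containing the reserved string splits into a snapshot name, the snapshot root, and the path relative to that root.

```python
# Hypothetical parser for the .snapshot_<name> path convention
# described above; not an actual HDFS API.
SNAPSHOT_PREFIX = ".snapshot_"

def parse_snapshot_path(path):
    """Split an HDFS path into (snapshot_name, snapshot_root, rest).

    Returns (None, None, path) for a regular, non-snapshot path.
    """
    parts = path.split("/")
    for i, comp in enumerate(parts):
        if comp.startswith(SNAPSHOT_PREFIX):
            name = comp[len(SNAPSHOT_PREFIX):]
            root = "/".join(parts[:i]) or "/"
            rest = "/".join(parts[i + 1:])
            return name, root, rest
    return None, None, path

print(parse_snapshot_path("/a/b/.snapshot_hdfs1/c/foo.txt"))
# -> ('hdfs1', '/a/b', 'c/foo.txt')
```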
One caveat for RO snapshots is that they are immutable. Operations that modify the namespace, such as creating a new file, deleting a file, creating a new directory, or renaming a file or directory, will fail when executed on the snapshot branch.
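These RO semantics can be illustrated with a toy model (again illustrative only, with made-up names): reads against the snapshot branch succeed, while any mutation is rejected.

```python
# Toy illustration of RO snapshot semantics (not HDFS code):
# the contents are frozen at snapshot creation time.
class ReadOnlySnapshot:
    def __init__(self, files):
        self._files = dict(files)  # path -> contents, frozen at creation

    def read(self, path):
        return self._files[path]   # reads always succeed

    def create(self, path, contents):
        raise PermissionError("snapshot is read-only: cannot create " + path)

    def delete(self, path):
        raise PermissionError("snapshot is read-only: cannot delete " + path)

snap = ReadOnlySnapshot({"c/foo.txt": b"data"})
print(snap.read("c/foo.txt"))   # -> b'data'
try:
    snap.delete("c/foo.txt")    # any mutation fails
except PermissionError as e:
    print(e)
```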
Snapshots are a very useful feature to have in a mature filesystem. This is a work in progress, and we have a functional prototype implemented. The first version of this feature will support RO snapshots only; support for RW snapshots will be added in subsequent releases. Several further features could be incorporated into snapshots, such as a time-to-live for snapshots with automatic deletion, schedule-based snapshot creation, marking specific directories as snapshot-worthy, quota-based restriction of the space used by RW snapshots, and delegating to users the authority to create and delete snapshots at specific locations.
To track the development of the snapshots feature in HDFS, please follow the JIRA HDFS-2802.
~ Hari Mankude