I am developing a framework around Hadoop that persists RabbitMQ messages in HDFS. The messages stream into the system continuously, since they are things like stock prices or weather data. Unfortunately, it looks like I will not be able to append to a file in HDFS on Hadoop 1.x, according to this release note:
HADOOP-8230. Major improvement reported by eli2 and fixed by eli
Enable sync by default and disable append
Append is not supported in Hadoop 1.x. Please upgrade to 2.x if you need append. If you enabled dfs.support.append for HBase, you’re OK, as durable sync (why HBase required dfs.support.append) is now enabled by default. If you really need the previous functionality, to turn on the append functionality set the flag “dfs.support.broken.append” to true.
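For reference, my understanding is that the flag named in the release note would go into hdfs-site.xml as an ordinary property. This is only a sketch of how I assume it would be set. I have not tested it, and the name of the flag itself suggests it is not meant to be relied on:

```xml
<!-- hdfs-site.xml (assumed location; sketch only, untested) -->
<configuration>
  <!-- Flag named in the HADOOP-8230 release note. Re-enables the old
       append behaviour in Hadoop 1.x, which the note describes as broken. -->
  <property>
    <name>dfs.support.broken.append</name>
    <value>true</value>
  </property>
</configuration>
```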
Could anyone please elaborate on this release note? Why is HBase able to append? Is there a way for me to write a program that appends to files in HDFS safely and robustly?
I am running the Hortonworks distribution for Windows on a cluster of three machines.
Many thanks for your help in advance.