HDFSConcat to merge files
I am trying to use HDFSConcat (https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.21/hdfs/src/java/org/apache/hadoop/hdfs/tools/HDFSConcat.java) to merge the output files of a MapReduce step into a single file. I want to avoid running a single reducer just to get one output file.
From what I can tell, a source file may have a last block that is only partially filled, but the target (the final concatenated file) seems to require that its last block be completely filled.
Has anyone come across this HDFSConcat class and used it in some way?
The test is here: https://svn.apache.org/repos/asf/hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestHDFSConcat.java
In my case, both the source files and the destination can have a last block that is not completely filled.
Any advice is appreciated.
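To make the constraint above concrete, here is a minimal sketch of the arithmetic behind "last block completely filled": a file's last block is full when its length is a non-zero multiple of the block size. The class and method names (ConcatBlockCheck, lastBlockIsFull, bytesMissingFromLastBlock) are hypothetical and not part of Hadoop, and the check only reflects the behaviour as I understand it, not a confirmed rule. In a real job the two numbers would presumably come from FileStatus.getLen() and FileStatus.getBlockSize().

```java
/** Hypothetical helper for illustration only; not part of Hadoop. */
final class ConcatBlockCheck {
    private ConcatBlockCheck() {}

    /**
     * Returns true when the file's last block is completely filled,
     * i.e. the length is a non-zero multiple of the block size.
     * Assumption: an empty file has no blocks and is reported as "not full".
     */
    static boolean lastBlockIsFull(long fileLength, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        if (fileLength < 0) {
            throw new IllegalArgumentException("fileLength must not be negative");
        }
        return fileLength > 0 && fileLength % blockSize == 0;
    }

    /** Number of bytes still needed to fill the last block (0 if none are needed). */
    static long bytesMissingFromLastBlock(long fileLength, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        if (fileLength < 0) {
            throw new IllegalArgumentException("fileLength must not be negative");
        }
        return (blockSize - fileLength % blockSize) % blockSize;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // assumed block size of 128 MB
        long[] lengths = {2 * blockSize, 2 * blockSize + 1, 300L * 1024 * 1024};
        for (long length : lengths) {
            System.out.printf("length=%d full=%b missing=%d%n",
                    length,
                    lastBlockIsFull(length, blockSize),
                    bytesMissingFromLastBlock(length, blockSize));
        }
    }
}
```

Running a check like this over the candidate target and source files before attempting the concat would at least show which of my files violate the condition.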