You might need to advise:
how big these files are
how many nodes your cluster has
how many processors per node
RAM per node
details of the source machine.
This will indicate a good block size. You could consider (size of file) / (number of nodes * number of processors per node), so that each processor gets roughly one block to work on.
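As a rough sketch of that arithmetic (not a tuned recommendation), the function below divides the file size by the total processor count. Rounding up to a whole MiB and the 1 MiB floor are my own assumptions for tidiness, and the function name is made up for illustration.

```python
MIB = 1024 * 1024


def suggested_block_size(file_size_bytes, nodes, processors_per_node):
    """Return (size of file) / (nodes * processors per node) in bytes,
    rounded up to a whole MiB (assumption: whole-MiB sizes are convenient)."""
    slots = nodes * processors_per_node
    # Integer ceiling division avoids float rounding on very large files.
    per_slot = -(-file_size_bytes // slots)
    whole_mib = -(-per_slot // MIB)
    # Assumption: never suggest less than 1 MiB.
    return max(1, whole_mib) * MIB
```

For example, a 100 GiB file on 10 nodes with 8 processors each gives `suggested_block_size(100 * 1024**3, 10, 8)`, which is 1280 MiB per block.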
It will be a map-only process without a sort, so make sure the maximum number of mappers is increased to at least (number of nodes * number of processors per node).
You may be trying to execute these sequentially. Consider spawning child processes (in Unix, use an “&” at the end of the command) and looping through the files. This might increase your speed at the source, because several processors will each be reading a file. If the source has multiple disks, consider putting a file on each disk and spawning a process per disk, as you’ll be speed-bound pulling from disk.
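If you would rather not manage “&” jobs by hand, here is a minimal sketch of the same idea in Python: one child process per file, with a cap on how many run at once. The helper name `run_in_parallel` is made up, and it assumes each upload can be expressed as a single command line.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(commands, max_workers):
    """Run each command (a list of argv strings) as a child process.

    At most max_workers children run at the same time. Returns the exit
    codes in the same order as the input commands.
    """
    def run(argv):
        # Each worker thread just waits on its own child process.
        return subprocess.run(argv).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, commands))
```

You would call it with something like `run_in_parallel([["hdfs", "dfs", "-put", f, "/landing"] for f in files], max_workers=4)`, where `/landing` is a hypothetical target directory and `max_workers` would be the number of source disks or processors you want reading at once. Any non-zero exit code in the result means that upload failed.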
Reduce your replication factor to 1, so that each block is written only once during the load.
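One way to do that per upload, rather than cluster-wide, is to pass `dfs.replication` as a generic option on the command line. This sketch only builds the command; it assumes you are loading with `hdfs dfs -put`, and the function name and paths are hypothetical.

```python
def put_command(local_path, hdfs_dir, replication=1):
    """Build an 'hdfs dfs -put' command that overrides dfs.replication
    for this one upload (assumption: the load uses 'hdfs dfs -put')."""
    return [
        "hdfs", "dfs",
        "-D", f"dfs.replication={replication}",
        "-put", local_path, hdfs_dir,
    ]
```

If you want the usual redundancy back once the load has finished, `hdfs dfs -setrep` can raise the replication of the loaded files afterwards.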