Small files – Hadoop 2.0 migrate from Hadoop 0.21

This topic contains 5 replies, has 3 voices, and was last updated by Robert Molina 6 months, 1 week ago.

  • Creator
    Topic
  • #46929

    Vitezslav Zak
    Participant

    Hi there,

    We are facing the large task of migrating from Hadoop 0.21 to Hadoop 2.0. We have 7 servers (1 namenode, 6 datanodes). We don't dare simply upgrade Hadoop 0.21 to 2.0 in place. Instead, we want to migrate the data gradually from one Hadoop instance to the other.

    Internally we have some Java applications that connect to Hadoop via the Hadoop 0.21 Java libraries. To connect to both Hadoop instances, we would have to use two different library versions in one project, which is not acceptable.

    We considered using WebHDFS for the second Hadoop instance, to avoid using two different libraries. But we use Hadoop archives, which cannot be accessed via WebHDFS.
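
For context, WebHDFS is a REST API over HTTP, so a client can address files without any Hadoop library on its classpath. The following is a minimal sketch of how such a request URL is formed. The class name `WebHdfsUrls`, the host name, and port 50070 (a common namenode HTTP port for Hadoop 2.x) are assumptions for illustration only, not values from this setup.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration; it is not part of any Hadoop library.
final class WebHdfsUrls {
    private WebHdfsUrls() {}

    /** Builds http://host:port/webhdfs/v1/<hdfsPath>?op=<op>, e.g. op=OPEN or op=LISTSTATUS. */
    static String build(String host, int port, String hdfsPath, String op) {
        if (!hdfsPath.startsWith("/")) {
            throw new IllegalArgumentException("HDFS path must be absolute: " + hdfsPath);
        }
        try {
            // The multi-argument URI constructor percent-encodes characters
            // that are illegal in a path, such as spaces.
            return new URI("http", null, host, port,
                    "/webhdfs/v1" + hdfsPath, "op=" + op, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed host and port; replace with the real namenode address.
        System.out.println(build("namenode.example.com", 50070, "/images/a.jpg", "OPEN"));
    }
}
```

The resulting URL can be fetched with any HTTP client, which is why this route avoids mixing two Hadoop library versions in one project.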

    Questions:
    1. Is it possible to write a Java application that accesses two different versions of Hadoop?
    1.1. Could we connect to both Hadoop instances with the same Java libraries (0.21)?
    2. Is there any way to avoid the problem of having a lot of small files other than using Hadoop archives? (We have about 120 million images.)

    Thanks for the reply

Viewing 5 replies - 1 through 5 (of 5 total)

  • Author
    Replies
  • #47205

    Robert Molina
    Moderator

    Hi Vitezslav,
    1. You could try installing both sets of client files in separate locations. Then, before running your Java application, have a script set HADOOP_HOME and HADOOP_CONF_DIR to point to the installation that matches the cluster your Java client is running against.

    1.1 Most likely not. This is not normally tested, since APIs usually change between versions.

    2. Other than HAR, you can also use the open source file crusher utility to concatenate small files into larger files of about the default block size of 128 MB.
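
The idea behind concatenating small files can be sketched as greedy batching: walk through the file sizes and start a new output file whenever adding the next input would exceed the target size. This is only an illustration of the general approach, not the file crusher utility's actual implementation; the class `SmallFileBatcher` and its method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not taken from the file crusher utility or any Hadoop API.
final class SmallFileBatcher {
    /** Target size per output file: 128 MB, the default block size mentioned above. */
    static final long BLOCK_SIZE = 128L * 1024 * 1024;

    private SmallFileBatcher() {}

    /**
     * Groups file sizes (in bytes) into consecutive batches whose total does not
     * exceed targetBytes. A single file larger than the target gets its own batch.
     */
    static List<List<Long>> batch(List<Long> fileSizes, long targetBytes) {
        List<List<Long>> batches = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : fileSizes) {
            // Close the current batch if this file would push it past the target.
            if (!current.isEmpty() && currentBytes + size > targetBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

Each batch would then be written as one larger file, so the namenode tracks one entry per batch instead of one per image.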

    Regards,
    Robert

    #47160

    Vitezslav Zak
    Participant

    Yes, we plan to migrate from Hadoop 0.21 to Hadoop 2.0 using your platform, HDP 2.0.6.

    #47153

    Pavel Hladik
    Participant

    Please, can somebody answer these important questions? And yes, we are going to migrate from 0.21 to 2.0.6 node by node, with a capacity of 300 TB of data.

    #47013

    Vitezslav Zak
    Participant

    Yes, we plan to migrate from HDP 0.21 to HDP 2.0.

    #46936

    Robert Molina
    Moderator

    Hi Vitezslav,
    Are you planning on migrating to HDP 2.0?

    Regards,
    Robert
