HDP map/reduce fast performance

This topic contains 1 reply, has 2 voices, and was last updated by  Rupert Bailey 4 months, 1 week ago.

    Dharanikumar Bodla
    Participant

    Hi all,
    Good morning,
    I have a set of 22 text documents loaded into HDFS. When I run a map/reduce function from the HDFS command line, it takes 4 min 31 s to stream the 22 text files. How can I speed up the map/reduce process so that these text files finish processing in 5-10 seconds?
    What changes do I need to make in Ambari for Hadoop?
    Allocated 2 GB for YARN and 400 GB for HDFS
    Default virtual memory for a job map task = 341 MB
    Default virtual memory for a job reduce task = 683 MB
    Map-side sort buffer memory = 136 MB
    Also, when running a job, HBase reports an error and the RegionServer goes down, and the Hive Metastore service check times out.

    Thanks & regards,
    Bodla Dharani Kumar,

    Rupert Bailey
    Participant

    It would help if you could tell us:
    how big these files are
    how many nodes are in your cluster
    how many processors there are per node
    how much RAM there is per node

    Please also give details of the source machine.

    This will indicate a good block size. You could consider (size of file) / (number of nodes * number of processors).
    It will be a map-only process without a sort, so make sure the maximum number of mappers is increased to at least number of nodes * number of processors.
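
To make the arithmetic above concrete, here is a rough sketch in Python. It reads "number of processors" as processors per node, which is my assumption. It also rounds the result up to a multiple of 512 bytes, because as far as I know HDFS expects the block size to be a multiple of its checksum chunk size (512 bytes by default); treat that as an assumption too, and note that your cluster may enforce its own minimum block size. The function names are made up for illustration.

```python
def suggested_block_size(file_size_bytes, nodes, processors_per_node, align=512):
    """Return (size of file) / (nodes * processors), rounded up to a multiple of `align`.

    `align=512` is an assumption about the HDFS checksum chunk size.
    """
    slots = nodes * processors_per_node
    if slots <= 0:
        raise ValueError("need at least one node and one processor")
    # Integer ceiling division avoids float rounding on large files.
    raw = (file_size_bytes + slots - 1) // slots
    # Round up to the next multiple of `align`, never below one chunk.
    return max(align, ((raw + align - 1) // align) * align)


def minimum_mappers(nodes, processors_per_node):
    """Lower bound suggested above for the maximum number of mappers."""
    return nodes * processors_per_node


if __name__ == "__main__":
    # Hypothetical cluster: a 1 GiB file, 4 nodes, 4 processors per node.
    print(suggested_block_size(1024 ** 3, 4, 4))
    print(minimum_mappers(4, 4))
```

With small files the result will be tiny, which is a hint that the job is dominated by start-up overhead rather than by block layout.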
    You may be trying to execute these sequentially. Consider spawning child processes (in Unix, put an “&” at the end of the command) and looping through the files. This might increase your speed at the source, since several processors will each be reading a file. If the source has multiple disks, consider placing a file on each disk and spawning a process per disk, as you’ll be speed-bound pulling from disk.
    Reduce your replication factor to 1.
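
If you would rather drive the parallel load from a script than from a shell loop with “&”, something like the following Python sketch might do. It assumes the files are loaded with `hdfs dfs -put` and that the replication factor is passed per command with the generic option `-D dfs.replication=1`; the paths, the worker count, and the function names are placeholders I made up, not anything from your setup.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_put_command(local_path, hdfs_dir, replication=None):
    """Build the `hdfs dfs -put` command line for one file."""
    cmd = ["hdfs", "dfs"]
    if replication is not None:
        # Generic options must come before the -put subcommand.
        cmd += ["-D", f"dfs.replication={replication}"]
    cmd += ["-put", local_path, hdfs_dir]
    return cmd


def upload_in_parallel(local_paths, hdfs_dir, workers=4, replication=None,
                       run=subprocess.call):
    """Run one upload per file, `workers` at a time; return exit code per file."""
    commands = [build_put_command(p, hdfs_dir, replication) for p in local_paths]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps results in the same order as the inputs.
        exit_codes = list(pool.map(run, commands))
    return dict(zip(local_paths, exit_codes))


# Example call (hypothetical paths; needs the hdfs client on PATH):
# results = upload_in_parallel(["/disk1/a.txt", "/disk2/b.txt"],
#                              "/user/me/input", workers=2, replication=1)
# failed = [path for path, code in results.items() if code != 0]
```

Threads are enough here because each worker only waits on an external process. If the source has several disks, one worker per disk is a sensible starting point.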
