hadoop – Map reduce on multiple cluster

This topic contains 2 replies, has 2 voices, and was last updated by  Sorna Lingam 11 months, 2 weeks ago.

  • Creator
    Topic
  • #43617

    Sorna Lingam
    Member

    I have configured a Hadoop cluster with two machines, DEV140 and DEV144. I run the MapReduce program using the following command:

    hadoop jar /HDP/hadoop-1.2.0.1.3.0.0-0380/contrib/streaming/hadoop-streaming-1.2.0.1.3.0.0-0380.jar -mapper "python C:\Python33\mapper.py" -reducer "python C:\Python33\redu.py" -input "/user/sornalingam/input/input.txt" -output "/user/sornalingam/output/out20131112_09"

    where the mapper (C:\Python33\mapper.py) and the reducer (C:\Python33\redu.py) are on DEV144's local disk.

    The MapReduce job runs only on machine DEV144 and not on DEV140. I have searched the web for a solution but could not find any resource. Kindly help me soon.

    How can I run the MapReduce job so that it uses both machines in the cluster?

    For more detail, refer to this link:

    http://stackoverflow.com/questions/19928671/hadoop-map-reduce-on-multiple-cluster

    Thanks

    Sornalingam

Viewing 2 replies - 1 through 2 (of 2 total)

  • Author
    Replies
  • #43670

    Sorna Lingam
    Member

    Hi Seth Lyubich,

    I traced down the error log and found this:

    On machine DEV140: python: can't open file 'C:\Python33\mapper.py': [Errno 2] No such file or directory

    This is the error I am getting.

    My map and reduce programs are on the local drive of my DEV144 machine only.

    How can I resolve this?

    1. Do I need to have my map and reduce programs on every machine in the cluster?
    2. How can I solve this?

    Thanks,
    Sornalingam

    #43648

    Seth Lyubich
    Keymaster

    Hi Sorna,

    I believe this was addressed on http://stackoverflow.com/questions/19928671/hadoop-map-reduce-on-multiple-cluster . In addition, the JobTracker manages where each task is processed. If both TaskTrackers are up and configured with slots, tasks may be scheduled on either machine based on data locality.
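
One possible way to avoid the "No such file or directory" error on DEV140, sketched here under assumptions rather than confirmed against this cluster: Hadoop streaming has a `-file` option that ships a local file with the job, so each task finds the script in its working directory and can call it by bare file name. The helper below is hypothetical (the name `build_streaming_command` is not part of Hadoop); it only assembles the argument list from the paths given in the original post, and it assumes `python` is installed and on the PATH on every node.

```python
import ntpath
import subprocess


def build_streaming_command(streaming_jar, mapper_path, reducer_path,
                            input_path, output_path):
    """Assemble a Hadoop streaming command that ships the scripts with the job.

    Hypothetical helper for illustration; it does not run Hadoop itself.
    """
    # ntpath handles Windows-style paths regardless of the OS running this script.
    mapper_name = ntpath.basename(mapper_path)
    reducer_name = ntpath.basename(reducer_path)
    return [
        "hadoop", "jar", streaming_jar,
        # -file copies the local script to every node that runs a task.
        "-file", mapper_path,
        "-file", reducer_path,
        # Shipped files land in the task's working directory, so the
        # mapper and reducer are referenced by file name only.
        "-mapper", "python " + mapper_name,
        "-reducer", "python " + reducer_name,
        "-input", input_path,
        "-output", output_path,
    ]


if __name__ == "__main__":
    cmd = build_streaming_command(
        "/HDP/hadoop-1.2.0.1.3.0.0-0380/contrib/streaming/"
        "hadoop-streaming-1.2.0.1.3.0.0-0380.jar",
        r"C:\Python33\mapper.py",
        r"C:\Python33\redu.py",
        "/user/sornalingam/input/input.txt",
        "/user/sornalingam/output/out20131112_09",
    )
    # Print the command line for inspection instead of executing it.
    print(subprocess.list2cmdline(cmd))
```

The alternative is to copy both scripts to the same local path (C:\Python33) on every machine, which matches the original command but has to be repeated whenever a script changes.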

    Hope this helps,

    Thanks,
    Seth
