Thanks for your question.
Here's the "easy explanation":
These are all daemon processes.
NameNode – stores the filesystem metadata: which blocks make up each file and which DataNodes hold each block's replicas. The data itself is broken into blocks, each block is replicated a number of times, and the replicas are stored on various DataNodes throughout the cluster. There is usually only one NameNode per cluster.
DataNodes – the nodes on which the data is actually stored. As mentioned above, the data is split into blocks, each block is replicated a number of times (usually 3), and the replicas are placed on different DataNodes to provide fault tolerance. There can be any number of DataNodes in a cluster, from one to thousands. The number is limited mainly by the RAM available on the NameNode, since the NameNode keeps its working metadata in RAM. The DataNodes store the actual data on their hard drives.
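To make the block/replica idea concrete, here is a small illustrative sketch (plain Python, not Hadoop code). The block size, replication factor, and node names are made up for the example; classic HDFS defaults were 64 MB blocks and 3 replicas.

```python
BLOCK_SIZE = 4    # bytes, tiny for illustration (HDFS traditionally defaulted to 64 MB)
REPLICATION = 3   # HDFS default replication factor

def split_into_blocks(data, block_size=BLOCK_SIZE):
    """Break a file's bytes into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(blocks, datanodes, replication=REPLICATION):
    """Build a NameNode-style map: block index -> DataNodes holding a replica.
    Simple round-robin onto distinct nodes; real HDFS placement is rack-aware."""
    placement = {}
    for i, _ in enumerate(blocks):
        placement[i] = [datanodes[(i + r) % len(datanodes)] for r in range(replication)]
    return placement

file_bytes = b"hello hadoop!"
blocks = split_into_blocks(file_bytes)                     # 4 blocks of <= 4 bytes
metadata = place_replicas(blocks, ["dn1", "dn2", "dn3", "dn4"])
```

The point of the sketch: `metadata` is the kind of mapping the NameNode holds in RAM, while the block bytes themselves would live only on the DataNodes' disks.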
JobTracker – when a job to analyze this data is submitted, this daemon talks to the NameNode to find out which nodes hold the data, splits the job into tasks, and then sends each task to the TaskTracker closest to the data it will process. There is also usually only one JobTracker per cluster.
TaskTracker – these daemons receive individual tasks from the JobTracker, perform them, and send the results back. There can, and should, be as many TaskTrackers in a cluster as there are DataNodes; a typical Hadoop cluster runs a TaskTracker on every DataNode. Under ideal circumstances each piece of data is then processed by the node on which it already resides, reducing the need to move data across the network.
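The data-locality idea can also be sketched in a few lines. This is an illustrative simplification, not Hadoop's scheduler: the node names are hypothetical, and real JobTracker scheduling also considers racks, slots, and speculative execution.

```python
def schedule_tasks(block_locations, trackers):
    """Assign each block's task to a node, preferring data-local placement.
    block_locations: block id -> DataNodes holding a replica (from the NameNode).
    trackers: nodes currently running a TaskTracker."""
    fallback = sorted(trackers)[0]  # deterministic fallback when no replica is local
    assignment = {}
    for block, nodes in block_locations.items():
        local = [n for n in nodes if n in trackers]
        # Prefer a node that already holds the data; otherwise accept a
        # network transfer to some available TaskTracker.
        assignment[block] = local[0] if local else fallback
    return assignment

locations = {0: ["dn1", "dn2"], 1: ["dn3", "dn4"], 2: ["dn5"]}
live_trackers = {"dn2", "dn3", "dn7"}
plan = schedule_tasks(locations, live_trackers)
```

Blocks 0 and 1 run on nodes that already hold a replica; block 2 has no live local tracker, so its data would have to be shipped over the network, which is exactly the cost co-locating TaskTrackers with DataNodes tries to avoid.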
I hope this helps your understanding of Hadoop.