October 17, 2013

Elephants can remember: MapReduce Job History in HDP 2.0

An important tool in the Hadoop developer toolkit is the ability to look at key metrics for a MapReduce job – to understand the performance of each job and to optimize future job runs.

In this blog article, we’ll explore how HDP 2.0 stores and provides insight into the performance of a MapReduce job on YARN.

Change from MapReduce v1 and HDP 1.x

In MapReduce-v2 on YARN in HDP 2.0, the JobTracker no longer exists. Job life cycle management is now the responsibility of short-lived Application Masters: each MapReduce-v2 job spins up its own Application Master, which is terminated once the job completes.

For this reason, a new MapReduce JobHistory server was added for MapReduce-v2. It maintains information about MapReduce jobs after their Application Masters terminate. Once a job's Application Master has completed, the Resource Manager Web UI forwards requests for that job to the JobHistory server.

Viewing Job History in Ambari

With HDP 2.0, Ambari provides a screen to manage and monitor the JobHistory Server.


The JobHistory UI is accessible as a link from this screen. It lists all executed MapReduce-v2 jobs.


You can drill down into each job to get the detailed metrics about the job runtime.
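
The same job list can also be read programmatically. The following is a minimal sketch, assuming the JobHistory server exposes the standard Hadoop history REST endpoint `/ws/v1/history/mapreduce/jobs` on its web port (19888 in a stock configuration). The host name and the helper names `jobs_url`, `summarize_jobs` and `fetch_job_summaries` are hypothetical, so adjust them for your cluster.

```python
import json
from urllib.request import urlopen

# Assumption: hypothetical host name; 19888 is the stock JobHistory web port.
HISTORY_SERVER = "http://historyserver.example.com:19888"


def jobs_url(base=HISTORY_SERVER):
    """Build the URL of the JobHistory REST endpoint that lists jobs."""
    return base.rstrip("/") + "/ws/v1/history/mapreduce/jobs"


def summarize_jobs(payload):
    """Reduce the REST response to (id, state, elapsed seconds) tuples."""
    # The list is nested as {"jobs": {"job": [...]}}; tolerate a missing
    # or null "jobs" value by treating it as an empty list.
    jobs = (payload.get("jobs") or {}).get("job", [])
    return [
        # startTime and finishTime are reported in milliseconds.
        (j["id"], j["state"], (j["finishTime"] - j["startTime"]) / 1000.0)
        for j in jobs
    ]


def fetch_job_summaries(base=HISTORY_SERVER):
    """Query a live JobHistory server; needs network access to the cluster."""
    with urlopen(jobs_url(base)) as response:
        return summarize_jobs(json.load(response))
```

Calling `fetch_job_summaries()` against a reachable JobHistory server should return one tuple per finished job. Parsing is kept separate from the HTTP call so it can be exercised with a saved response.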


Job history data persisted to HDFS

All the underlying data for each job is persisted to HDFS. This means that historical operational metrics for each job are maintained and remain accessible for the lifetime of the HDP cluster.

In HDP 2.0, the MapReduce job history files are stored in the “/mr-history/done” directory on HDFS. Its subdirectories are organized by the date on which each job ran.
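
As a rough sketch of how to locate the files for a given day, the snippet below assumes the date-based layout is year/month/day with zero-padded month and day directly under the done directory. That layout is an assumption, not something stated above, so confirm it with a listing on your own cluster. The helper names `history_dir_for` and `list_history_files` are hypothetical; the listing itself uses the standard `hdfs dfs -ls -R` command.

```python
import subprocess
from datetime import date

DONE_DIR = "/mr-history/done"  # location named in the article


def history_dir_for(day, done_dir=DONE_DIR):
    """Return the HDFS directory assumed to hold history for jobs run on `day`."""
    # Assumption: year/month/day subdirectories, zero-padded.
    return "{}/{:04d}/{:02d}/{:02d}".format(done_dir, day.year, day.month, day.day)


def list_history_files(day):
    """Recursively list that directory; needs the `hdfs` client on the PATH."""
    result = subprocess.run(
        ["hdfs", "dfs", "-ls", "-R", history_dir_for(day)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

For example, `history_dir_for(date(2013, 10, 17))` yields `/mr-history/done/2013/10/17`, which `list_history_files` then passes to the HDFS shell.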


Go Get It

Download HDP 2.0 Beta and deploy today!

