August 01, 2011

Delivering High-Quality Apache Hadoop Releases

As enterprises increasingly adopt Apache Hadoop for critical data, the need for high-quality releases of Apache Hadoop becomes even more crucial. Storage systems in particular require robustness and data integrity, since enterprises cannot tolerate data corruption or loss. Further, Apache Hadoop offers an execution engine for customer applications, which comes with its own challenges. Apache Hadoop handles failures of disks, storage nodes, compute nodes, networks and applications. The distributed nature, scale and rich feature set make testing Apache Hadoop non-trivial.

Testing Apache Hadoop does not just involve writing a test plan based upon the design spec. Instead, it requires an understanding of the numerous use cases of an API rather than just the actual specification of that API. The intriguing part has always been analyzing the impact that every feature, improvement and bug fix has on the various Hadoop subsystems and user applications. Additionally, one has to go beyond unit, functional, scale and reliability tests and run tests against live data and live user applications to test the integration of Hadoop with the other products in the ecosystem.

Delivering a high-quality Apache Hadoop release has been a focus for our team since its early days at Yahoo!, where Apache Hadoop has been used in production across thousands of nodes. Over the years we have developed elaborate test suites and procedures. Every stable Apache release of Hadoop, from the early days to hadoop-0.20.2xx, has gone through this rigorous test procedure.

This work now continues as part of the Yahoo! and Hortonworks partnership. The next generation of Hadoop (hadoop-0.23) has significant new features co-developed by the two companies. It will be hardened as an enterprise-quality product and rolled out across Yahoo! and other organizations around the globe.

Our rigorous process has resulted in a remarkable record of data integrity and robustness. Even the large commercial storage vendors will have a hard time matching the level of testing managed by Yahoo! and Hortonworks.

Can you trust your data to anything less?

Hortonworks QA Process for Apache Hadoop

This section of the post describes the stringent process followed to test and qualify Apache Hadoop releases.

The process consists of the following procedures:

  • Nightly QA tests
  • Release certification
  • Deployment to sandbox, research and production clusters

1. Nightly QA Tests

At Hortonworks, we have a nightly automated deploy setup that deploys the latest Apache Hadoop 0.20.x and 0.23.x code bases to two QA clusters. Once the deployment succeeds, we run 1200+ automated tests that include the following:
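As a rough illustration, a nightly deploy-then-test pipeline like the one described can be sketched as a small orchestrator. Everything below (the stubbed deploy step and suite names) is a hypothetical stand-in, not Hortonworks' actual harness:

```python
# Minimal sketch of a nightly deploy-then-test pipeline (hypothetical names).
def run_nightly(deploy, suites):
    """Deploy the latest build; run test suites only if deployment succeeds."""
    if not deploy():
        return {"deployed": False, "results": {}}
    results = {name: suite() for name, suite in suites.items()}
    return {"deployed": True, "results": results}

# Example wiring with stubbed steps standing in for real deploy/test actions.
report = run_nightly(
    deploy=lambda: True,  # stand-in for deploying hadoop-0.20.x to a QA cluster
    suites={
        "benchmarks": lambda: "pass",
        "end_to_end": lambda: "pass",
        "functional": lambda: "pass",
    },
)
```

The key property mirrored from the post is gating: the 1200+ tests only run once the deployment itself succeeds.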

Benchmarks and end-to-end tests

Benchmark tests help in tracking any performance degradations due to recent code check-ins. End-to-end tests ensure that the new code does not break the overall functioning of the existing framework. These tests are treated as acceptance tests before running an exhaustive set of functional tests.
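The benchmark-tracking idea can be sketched as a simple baseline comparison. The benchmark names, timings and 5% tolerance below are illustrative, not Hortonworks' actual suites or thresholds:

```python
# Sketch: flag a performance regression when a nightly benchmark runs slower
# than its recorded baseline by more than a tolerance (numbers are made up).
def regressions(baseline, nightly, tolerance=0.05):
    """Return benchmarks whose nightly time exceeds baseline by > tolerance."""
    return sorted(
        name for name, base in baseline.items()
        if nightly[name] > base * (1 + tolerance)
    )

baseline = {"terasort_s": 1200, "dfsio_write_s": 340, "wordcount_s": 95}
nightly  = {"terasort_s": 1210, "dfsio_write_s": 402, "wordcount_s": 96}
print(regressions(baseline, nightly))  # ['dfsio_write_s'] — ~18% slower
```

A check like this, run after every nightly deploy, is what ties a degradation back to a specific day's code check-ins.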

Functional tests

This area of testing is the most challenging one due to the advanced, leading-edge feature set available in Apache Hadoop, which has to be tested in a distributed environment. Below is a sample of the depth of functional testing needed before distributing an Apache Hadoop release:

  • The entire breadth of the product is tested, covering all the subsystems such as HDFS, MapReduce, streaming, distcp, archives and so forth.
  • QA does a deep dive into individual subsystems, such as block replication, quotas and the balancer in HDFS, and job scheduling, distributed cache and the task controller in MapReduce.
  • Detailed testing of each of these components is completed, such as user limits, high-RAM jobs, reservations and queue limits in the capacity scheduler. For queue limits alone there are numerous use cases, such as verifying limits on tasks per job, jobs per user, and pending/running tasks per queue and per user.
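One of the queue-limit cases above, per-user running-task limits, can be sketched as a toy check. The user names, task counts and limit are made up for illustration:

```python
# Sketch of one queue-limit verification: no user may exceed the per-user
# running-task limit configured for a queue.
def over_limit_users(running_tasks_by_user, per_user_limit):
    """Return the users whose running-task count exceeds the per-user limit."""
    return {u for u, n in running_tasks_by_user.items() if n > per_user_limit}

queue_state = {"alice": 40, "bob": 25, "carol": 55}
print(over_limit_users(queue_state, per_user_limit=50))  # {'carol'}
```

A real test would submit jobs against a live capacity-scheduler queue and read the counts back from the cluster, but the assertion at the end is the same shape.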

Thus, the Hortonworks QA process catches any regressions introduced by a new patch and provides early insight into the quality of upcoming releases.

2. Release certification

Prior to calling for an Apache release vote, Hortonworks QA will ensure that the following tests succeed:

  • All the unit tests succeed on Apache Jenkins
  • No degradation is observed in the benchmark numbers
  • No regressions are introduced in the nightly test run

Once the above tests succeed, the following non-functional tests are executed:

Compatibility with existing clusters

To ensure a smooth upgrade on existing clusters, Hortonworks QA verifies the compatibility of the new code with the existing cluster. For this we run all the existing tests on the upgraded cluster and make sure that the old user jobs are able to run.

Hadoop Stack Integration Testing

We also run tests to verify that other products in the Apache Hadoop ecosystem such as Pig, HCatalog, Hive and Oozie seamlessly integrate with the new code.


Security Testing

To guarantee privacy, security and integrity, and to ensure that users are correctly authenticated to the edge service, Hortonworks QA verifies the following security scenarios:

  • User level authorization and authentication to perform HDFS operations
  • User level authorization and authentication to submit, execute and administer MapReduce jobs
  • Service level authorization to access HDFS and to run MapReduce jobs

The framework is also tested for scenarios such as unauthorized users, services, expired/cancelled/invalid kerberos tickets, block tokens, delegation tokens and corrupt credentials.
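The expired/cancelled-token scenarios can be sketched with a toy credential check. The token fields below are hypothetical, not Hadoop's actual delegation-token format:

```python
# Sketch of a credential check covering expired and cancelled tokens
# (field names are invented for illustration).
def is_valid(token, now, cancelled):
    """A token is valid only if it is not cancelled and has not expired."""
    return token["id"] not in cancelled and token["expires"] > now

cancelled = {"tok-2"}
tokens = [
    {"id": "tok-1", "expires": 200},  # valid
    {"id": "tok-2", "expires": 200},  # cancelled
    {"id": "tok-3", "expires": 50},   # expired
]
valid = [t["id"] for t in tokens if is_valid(t, now=100, cancelled=cancelled)]
print(valid)  # ['tok-1']
```

The QA scenarios in the list above exercise exactly these rejection paths, along with corrupt credentials, against the real Kerberos/token machinery.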


Scale Testing

The focus of scalability tests is to verify that Apache Hadoop can gracefully handle increases in requests, data sets, jobs and so on without degrading performance. To run scale tests, Hortonworks feeds load directly from Yahoo!’s production grids onto our 800+ node QA cluster using GridMixV3, Rumen and Folder.

We also test the framework with:

  • High data volume
  • Increased number of files and directories in the namespace
  • Large number of HDFS and local FS read and writes
  • Increased number of jobs and tasks in various states such as pending, running and completed
  • Large number of users in the system and queues
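One of the scale dimensions above, an inflated namespace, can be sketched by generating a large synthetic file listing. The path scheme is illustrative; a real run would create these entries in HDFS rather than in memory:

```python
# Sketch: build a synthetic namespace of many directories with many files each,
# the kind of load used to stress namespace handling at scale.
def synth_namespace(dirs, files_per_dir):
    """Return paths for `dirs` directories holding `files_per_dir` files each."""
    return [f"/scale/dir{d:04d}/file{f:04d}"
            for d in range(dirs) for f in range(files_per_dir)]

paths = synth_namespace(dirs=1000, files_per_dir=100)
print(len(paths))  # 100000 entries
```

Sweeping the two parameters upward is what turns this into a test of "increased number of files and directories in the namespace".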


Reliability Testing

Reliability of the framework is its ability to function normally in the face of failures. The Hortonworks QA reliability tests broadly cover:

Service failures – failure of:

  • Namenode
  • Secondary Namenode
  • Jobtracker/Resourcemanager
  • Tasktracker/Nodemanager
  • Datanode

Network failures – connection timeouts, lost Tasktrackers and fetch failures

Bad hardware – corrupt disks, missing blocks, corrupt data and lost map outputs

Testing reliability in Apache Hadoop is critical because it is not sufficient just to check the recovery mechanism. One also has to confirm that the state of the system is reflected correctly. For example, a task running on a lost Tasktracker will eventually be rescheduled, but the testing is complete only after verifying that no further tasks are scheduled on the lost TT, that the total cluster capacity is reduced, and that the TT no longer appears in the active TT list. And in the case where the lost TT rejoins, tasks should again be scheduled on it. We cover all of the above failure scenarios to unravel any unreliability in the code.
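The lost-Tasktracker verification steps above can be sketched against a toy cluster model. In a real test these checks would query the Jobtracker, not a dict, and the slot counts are invented:

```python
# Sketch of the post-failure state checks: after a Tasktracker is declared
# lost, it must leave the active list, cluster capacity must drop, and its
# tasks must be rescheduled on a surviving tracker.
def lose_tracker(cluster, tt):
    """Simulate losing a Tasktracker: remove it and reassign its tasks."""
    tasks = cluster["trackers"].pop(tt)
    cluster["capacity"] -= cluster["slots_per_tt"]
    survivor = next(iter(cluster["trackers"]))
    cluster["trackers"][survivor].extend(tasks)
    return cluster

cluster = {"slots_per_tt": 2, "capacity": 4,
           "trackers": {"tt1": ["t1", "t2"], "tt2": ["t3"]}}
lose_tracker(cluster, "tt1")
assert "tt1" not in cluster["trackers"]                        # off active list
assert cluster["capacity"] == 2                                # capacity reduced
assert set(cluster["trackers"]["tt2"]) == {"t1", "t2", "t3"}   # rescheduled
```

The point the post makes is exactly this pairing: inducing the failure is half the test; asserting each piece of resulting cluster state is the other half.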

3. Release Testing at Yahoo!

Once the release is certified by QA, it is deployed onto three of Yahoo!’s sandbox clusters, each having 400-1000 nodes. The release remains available there for two months, awaiting signoff from all the production projects.

After the sandbox environment, the release is moved to six of Yahoo!’s research clusters, where it is deployed for another two months before being deployed to production clusters. The average number of jobs per week on the research clusters varies from 0.25 to 0.5 million – thus, by the time we exit this stage, the Apache Hadoop release has run more than 10 million jobs and stored tens of petabytes of data.

Only then is the release deployed onto the production clusters.

Also, the production logs are made available to QA to run future scale tests on QA clusters.


The highly stringent Dev-QA-Operation process described in this post for rolling out new releases to the grids is followed for every release of Apache Hadoop. Testing the Apache Hadoop release in such a manner, at very large scale, has helped Yahoo! qualify high-quality releases. This will now help Hortonworks do the same.

— Ramya Sunil




mital says:

Hi Ramya, can you please give some insight into which language the functional tests that test Pig scripts are written in? Thanks,

Ramya Sunil says:

Hi Mital,

The system and integration tests for Pig are written in Perl.


vijay k says:

In the case of functional tests, how do you test block replication and load balancer functionality?
To test high-RAM jobs, queuing functionality and distributed cache, have you written intensive MR jobs?

Release certification:
– Are the unit tests, benchmark tests and regression tests written by dev or QA?
– Is Puppet used to upgrade the cluster and test the existing cluster?
Reliability:
– Do you induce the failures in the namenode, tasktracker, etc.?
– How do you corrupt the data before checking?

Really a robust framework and well written details on how testing is done. Great job.

Ganesh says:

In your article there is a mention of 1200+ automated tests. Which test automation tool is used by your team?

Hemanth says:

Hi Ramya,
You have given a very good insight into the quality process and the care you take with releases. Can you also let us know the steps we must keep in mind for minor patches and bug fixes?

Vamshi says:

Hi Ramya, is there a way you can share the functional and integration test cases related to each component in Apache Hadoop?

viswa says:

Hi Ramya, your work is very nice and informative. Could you please share the functional and integration test cases and other Hadoop testing material for learning purposes, or any white papers on big data/Hadoop testing? Thank you in advance.

