Introducing Apache Tez 0.4

Community Rocks

We are excited to announce that the Apache™ Tez community has voted to release version 0.4 of the software.

Apache Tez is an alternative to MapReduce that provides a powerful framework for executing a complex topology of tasks for data access in Hadoop. Version 0.4 incorporates the feedback from extensive testing of Tez 0.3, released just last month.

This release is especially meaningful because it coincides with completion of the Stinger Initiative (a collaborative community effort involving 145 developers across 44 companies) and the upcoming release of Apache Hive 0.13.

Major community achievements in this Tez 0.4 release were:

  • Application Recovery – This is a major improvement to the Tez framework that preserves work when the job controller (the YARN Tez Application Master) is restarted because of node loss or cluster maintenance. When the Tez Application Master restarts, it recovers all the work that the previous master had already completed. This is especially useful for long-running jobs, where restarting from scratch would throw away that work.
  • Stability for Hive on Tez – We did considerable testing with the Apache Hive community to make sure the imminent release of Hive 0.13 is stable on Tez. We appreciate the great partnership.
  • Data Shuffle Improvements – Data shuffling re-partitions and re-distributes data across the cluster. This is a major operation in distributed data processing, so performance and stability are important. Tez 0.4 includes improvements in memory consumption, connection management, error handling, and the handling of empty partitions.
  • Windows Support – The community fixed bugs and made changes to Tez so that it runs as smoothly on Windows as it does on Linux. We hope this will encourage adoption of Tez on Windows-based systems.
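
For readers new to the term, here is a minimal sketch of what "re-partitioning" in a shuffle means. It is not Tez code and does not use any Tez API: the function names, the CRC32 hash, and the in-memory lists standing in for producer tasks are all assumptions made for illustration.

```python
from collections import defaultdict
from zlib import crc32


def partition_for(key: str, num_partitions: int) -> int:
    """Pick a partition with a stable hash so every producer agrees."""
    return crc32(key.encode("utf-8")) % num_partitions


def shuffle(producer_outputs, num_partitions):
    """Regroup (key, value) records from all producers by partition.

    Returns a dict mapping partition index -> list of records.
    Partitions that received no records are simply absent, so a
    consumer can skip them instead of fetching empty data.
    """
    partitions = defaultdict(list)
    for records in producer_outputs:  # one list per (hypothetical) producer task
        for key, value in records:
            partitions[partition_for(key, num_partitions)].append((key, value))
    return dict(partitions)


if __name__ == "__main__":
    outputs = [[("apple", 1), ("pear", 2)], [("apple", 3)]]
    for index, records in sorted(shuffle(outputs, 4).items()):
        print(index, records)
```

Every record with the same key ends up in the same partition no matter which producer emitted it, which is what lets a downstream task process each key in one place. A real shuffle also moves the data over the network and spills to disk, which is where memory use and connection management come in.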

We hope that Tez 0.4 provides a stable, reliable, and high-performance framework for wider community adoption. We encourage you to try Apache Tez for your use cases, and we look forward to your feedback and suggestions for improvement. We’re all ears!

Also, we would like to thank the wider Apache community for their support and cooperation.

- The Apache Tez Team

