The Apache Tez community is thrilled to announce the release of version 0.5 of the project. We’re referring to this as “the developer release” because it’s all about developers. The community focused on meeting the key needs of developers using Tez to create their applications and engines. Tez 0.5 includes clean and intuitive developer APIs, easy debugging, extensive documentation and deployment with rolling upgrades.
Apache Hadoop YARN paved the way for Apache Tez. With Hadoop 2, Tez has proven itself rock-solid for users of Apache Hive with Tez and Apache Pig with Tez. This release extends the benefits of Tez to many more developers who aim to take advantage of its reliability, scale, and performance within their engineering projects.
Now developers can take full advantage of the Hadoop platform with YARN as its architectural center. YARN enables purpose-built applications to run within a shared execution environment, and Tez enables developers to write purpose-built applications for the data processing domain.
Applications like Apache Hive, Apache Pig and Cascading use Tez’s core directed acyclic graph (DAG) APIs for a variety of batch and interactive use cases. The resulting wealth of feedback from users of these applications and community members involved in those projects has been incorporated into the Tez code.
Our testing and real-world experiences show that the core APIs are stable and should stand up to the challenge of even more widespread adoption. The Tez community plans to maintain backwards compatibility for these APIs, so developers, vendors and ISVs can continue to confidently build their applications with Tez.
Developing code without tools for easy debugging is challenging. In this release, we provide capabilities for debugging both application code and performance.
The community has worked hard to write extensive javadocs for all the APIs that are exposed by Tez. We also clarified the naming and packaging of the APIs to make them more intuitive.
Because code samples are the best form of documentation, we include a number of examples to showcase how to build applications using the Tez APIs. We wrote the examples from the point of view of a developer using Tez, in order to guide the reader on a path from basic to more complex use cases.
Apache Tez has always been easy to deploy. It is a client-side YARN application, which means that there is nothing to install in the cluster. This is important from a usability standpoint but also from a safety point of view. It’s perfectly safe to try out Tez on any cluster (even your production cluster) because it is not going to change anything or leave behind any traces.
We have improved the packaging to support rolling upgrades of a Hadoop cluster. Rolling upgrades will soon be released for Apache Hadoop, allowing cluster administrators to upgrade a Hadoop cluster without any downtime. Tez, as a leading example of an engine running within the YARN framework, is ready to work with the latest and greatest possibilities of Hadoop.
We thank our users, developers and contributors for helping us strengthen Tez during the early days while the developer tools matured. Prior releases proved Tez’s performance and scalability. We are confident that with this release, Tez is a stable, rock-solid framework for developers of big data applications. Now is the time for independent software vendors (ISVs) and developers to take full advantage of DAG-driven capabilities in Tez for their purpose-built applications on YARN.
— The Tez team