Hortonworks Data Platform 2.0 Alpha 2 now available: focus on performance

We are very pleased to announce the Alpha 2 release of the Hortonworks Data Platform 2.0 (HDP 2.0 Alpha2) is now available for download!

A key focus in HDP 2.0 Alpha 2 is on performance as announced in the Stinger initiative, and includes a series of enhancements to the performance of Apache Hive for interactive SQL queries.  In fact HDP 2.0 Alpha 2 was used to perform the tests announced yesterday, showing a 45X performance increase using Hive.  There is much more to come but we are pleased with the early results, and encourage Hive users to take a look and continue to give us feedback.

Consistent with HDP 2.0 Alpha 1, this version is built from the developmental Apache Hadoop 2.0 line and includes Apache YARN, a next-generation resource-management and application framework that enables Hadoop to support an ever-expanding range of use cases.  We are extremely excited about the opportunities that YARN enables – for background, check out Arun Murthy’s blog post series where he provides a YARN overview.

Notable new components over Alpha 1 include:

  • Apache Tez: A new Apache project that provides an optimized data processing framework on top of YARN. Tez is a general-purpose, highly customizable framework that simplifies data processing tasks across both small-scale, low-latency and large-scale, high-throughput workloads in Hadoop. Tez can provide an order of magnitude performance boost for the broader ecosystem of data processing tools such as Apache Hive and Apache Pig.
  • Apache Hive Interactive Query: Beyond the speedups made possible by Apache Tez, several new features were added to speed Hive queries. A new file format called the ORCFile (optimized RC file) optimizes how data is stored and accessed in Hive, and significant query optimizations reduce latency and improve performance.

Note that Tez is not enabled by default.  Instructions for doing so, and allowing Hive to use Tez, are in the installation guide.

Learn More
Please take a look at the Hortonworks Documentation to learn more about installing and using HDP 2.0 Alpha 2.

Download It
You can download HDP 2.0 Alpha 2 from the Hortonworks Download site.

Tell Us About It
Please visit the HDP 2.0 Alpha Forum to ask questions, get help, provide feedback and hear what others are doing with HDP. 

We are excited about the opportunities that Hadoop 2 provides for the future of Hadoop and large-scale data processing. HDP 2.0 Alpha 2 is a key milestone that provides organizations with a packaged release to evaluate and gain experience with the upcoming Apache Hadoop 2 technology stack. We look forward to your feedback on HDP 2.0 Alpha 2 while we work with the community to make Hadoop 2 a stable reality. Enjoy!

Note: This Alpha release is a technology preview to gather feedback from outside of Hortonworks. Some features are missing or incomplete. Some APIs may change. Do not use Alpha 2 for production use. Keep away from open flame. Support is only available via Forums.

Categorized by :
Apache Hadoop HDP 2 Hive Tez YARN

Comments

Larry Growney
|
April 3, 2013 at 8:55 am
|

During the installation of HDP, I am prompted for a sandox login/password. Is there a default logon?

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Try it with Sandbox
Hortonworks Sandbox is a self-contained virtual machine with Apache Hadoop pre-configured alongside a set of hands-on, step-by-step Hadoop tutorials.

Get Sandbox

Recently in the Blog

Hortonworks Data Platform
The Hortonworks Data Platform is a 100% open source distribution of Apache Hadoop that is truly enterprise grade having been built, tested and hardened with enterprise rigor.
Modern Data Architecture
Tackle the challenges of big data. Hadoop integrates with existing EDW, RDBMS and MPP systems to deliver lower cost, higher capacity infrastructure.
Get started with Sandbox
Hortonworks Sandbox is a self-contained virtual machine with Apache Hadoop pre-configured alongside a set of hands-on, step-by-step Hadoop tutorials.

Thank you for subscribing!