A good YARN from the Hadoop Summit

Big data innovators are providing some new plot developments in the story of Hadoop at the 2013 Hadoop Summit. Hortonworks' recently unveiled YARN offering, an update to the way Hadoop manages the resources of its big data clusters, is one of the most-discussed evolutions of the Hadoop framework at the conference, held in San Jose, Calif., reported SiliconANGLE. Analysts are touting it as an overhaul of Hadoop that improves on the HDFS and MapReduce system that long defined Hadoop data clustering.

"YARN is a re-imagination, re-architecture of Hadoop itself," Hortonworks co-founder Arun Murthy said at the conference. "You get significantly more value off of your existing investment."

One of the benefits of YARN is that, in keeping with the open-source and user-friendly founding principles that have defined Hadoop, it can be merged seamlessly into an organization's existing framework. Its main advantage for data processing is that it separates resource management from job scheduling and monitoring, which makes the system better distributed in practice, CRN reported. Murthy said at the conference that the ultimate expectation for YARN is that it will improve the core of Hadoop functionality, both for Hadoop-savvy enterprise users and as part of Hadoop tutorials for those who are just starting out.

Reporting on the conference, SiliconANGLE contributor Jeffrey Kelly highlighted the main benefits of YARN. He wrote that YARN makes significant gains in resource management, enabling multiple applications to run efficiently on a single cluster of machines. As organizations migrate to YARN, they will be better able to control their data management and form stronger ties between the different teams that manage their Hadoop operations.

"[YARN] means Hadoop can be used as the foundation of an enterprise data management architecture, storing all of an enterprise's data and being utilized as a shared data service," Kelly wrote. "With YARN, the marketing team can run SQL-style applications while the data science team churns through petabytes of data, all on a single Hadoop deployment."
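The split between cluster-wide resource management and per-application scheduling can be pictured with a small toy model. The sketch below is not the YARN API: the class names only echo YARN's component names, and the methods, application names and container counts are made up for illustration. It shows a single resource manager that only hands out capacity, while each application plans its own work within whatever it was granted.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Toy cluster-wide manager: it only tracks and grants capacity."""
    total_containers: int
    allocated: dict = field(default_factory=dict)  # app name -> containers held

    def free(self) -> int:
        """Containers not currently held by any application."""
        return self.total_containers - sum(self.allocated.values())

    def request(self, app: str, wanted: int) -> int:
        """Grant up to `wanted` containers, limited by what is still free."""
        granted = min(wanted, self.free())
        self.allocated[app] = self.allocated.get(app, 0) + granted
        return granted

    def release(self, app: str) -> None:
        """Return all containers held by `app` to the shared pool."""
        self.allocated.pop(app, None)


@dataclass
class ApplicationMaster:
    """Toy per-application master: it schedules only its own tasks."""
    name: str
    tasks: list

    def plan(self, containers: int) -> list:
        """Split this application's tasks into waves that fit its containers."""
        if containers <= 0:
            return []
        return [self.tasks[i:i + containers]
                for i in range(0, len(self.tasks), containers)]


if __name__ == "__main__":
    # Hypothetical workloads sharing one hypothetical 10-container cluster.
    rm = ResourceManager(total_containers=10)
    sql = ApplicationMaster("marketing-sql", ["q1", "q2", "q3"])
    batch = ApplicationMaster("data-science-batch", [f"t{i}" for i in range(8)])

    sql_share = rm.request(sql.name, 4)
    batch_share = rm.request(batch.name, 8)  # receives only what is left

    print(sql.name, sql_share, sql.plan(sql_share))
    print(batch.name, batch_share, batch.plan(batch_share))
```

In this model the two workloads coexist on one cluster because neither schedules against the machines directly; each asks the shared manager for capacity and then organizes its own tasks.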
