Rewriting the big data playbook with Hadoop

The explosive growth of big data is seemingly matched only by the breakneck rise of strategies that purport to put it to best use. A litany of approaches, including a now-familiar roundup of analytics, business intelligence, data science and ad hoc reporting, has been attached to the precipitous growth of the big data industry. While each of these approaches certainly has its benefits, it can be difficult for organizations to discern which is right for them. Many companies enlist some combination of techniques in an effort to maximize the possible benefits. This tactic can work as well, but poor integration or communication could end up truncating the potential advantages.

Apache Hadoop may prove to be the best framework for running big data applications, according to TechTarget's Jack Vaughan, due to the system's continued focus on evolution. The Hadoop system has experienced exponential growth over a short time – Vaughan wrote that a fair assessment of Hadoop just a few years back might be "HDFS, MapReduce and some glue," but that Hadoop has quickly filled incoming needs for a better big data platform. The recent unveiling of YARN is another step in the direction of more dynamic, user-friendly and customizable big data applications. 

Vaughan wrote that YARN redefines not only Hadoop architecture, but the process of structuring big data in general. The new system gives users more options when creating and distributing clusters, and makes plugging other applications into the architecture easier and more flexible. Vaughan spoke with several top industry analysts about the next stage of evolution in the Hadoop ecosystem. Many echoed the assertions of Gartner analyst Merv Adrian, who spoke to the way that Hadoop is changing the big data paradigm.

"The Hadoop community is a center of gravity that is attracting innovative new uses," he told Vaughan. 

One platform to rule them all?
Changing the nature of big data architecture is part of the process by which Hadoop and YARN could establish a single platform for big data analysis, according to SiliconANGLE assistant editor Ryan Cox. He wrote that YARN enables Hadoop to morph into a multi-application operating system from its humbler single-application origins. This transformation positions Hadoop as a possible platform from which to run all other operations to store and synthesize big data. Cox also wrote that Hadoop's continued emphasis on retaining the full power of its open-source origins further increases its potential for platform status.
