December 03, 2014

Want New Ways to Optimize Your Big Data Workloads?

Data platforms within enterprises are in the midst of a generational shift. After relying successfully on databases for decades, leading organizations are now complementing their data platforms with Apache Hadoop in a Data Lake environment to create a Modern Data Architecture (MDA). With its scale-out, schema-free architecture, Hadoop enables organizations to store and analyze all of their structured and unstructured data in a single consolidated data environment. A key partner in the Hadoop journey has been the complementary infrastructure of servers, storage and networking. Today, as Hadoop innovation accelerates, the infrastructure must evolve so that hardware and software can together deliver on the promise of big data.

Joint Engineering

The HP Big Data Analytics reference architecture represents a rethinking of Hadoop infrastructure that separates storage, networking and compute. The architecture delivers extreme flexibility, with the ability to scale each layer independently. It complements existing Hadoop infrastructures of co-located storage and compute, giving enterprises the choice of the deployment model that best meets their operational and functional needs.

At Hortonworks, we partnered with HP to optimize Hadoop deployment on the HP Big Data reference architecture. With Apache Hadoop YARN as the operating system for computation access, Hortonworks and HP engineers teamed with the Apache Hadoop community to develop the YARN labels functionality, which enhances control and scheduling for different types of computation frameworks.

YARN Labels

With YARN labels, administrators can tag servers into distinct sets of compute resources and direct each job to the nodes that best fit its workload needs. YARN labels represent a key advancement for Hadoop and spur infrastructure innovations that better meet big data needs. As part of the joint HP and Hortonworks collaboration, we worked with the broader Apache Hadoop community to define and deliver the labels functionality. This collaboration exemplifies our commitment to representing enterprise needs in the open community and working within that community to deliver continuous innovation.
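To make the idea of label-based placement concrete, here is a minimal Python sketch. It is a toy model only, not the YARN API or its scheduler: the `Node` class, the `eligible_nodes` function, the node names and the label names are all hypothetical, and the rule that an unlabeled job may run anywhere is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a cluster server; not a YARN class."""
    name: str
    label: Optional[str] = None  # None means the node has not been tagged


def eligible_nodes(nodes: List[Node], required_label: Optional[str] = None) -> List[str]:
    """Return the names of nodes a job may run on in this toy model.

    Assumption: a job that requests no label may use any node, while a job
    that requests a label may use only nodes tagged with exactly that label.
    """
    if required_label is None:
        return [n.name for n in nodes]
    return [n.name for n in nodes if n.label == required_label]


# An administrator tags servers into distinct sets of compute resources.
cluster = [
    Node("node1", "compute"),
    Node("node2", "storage"),
    Node("node3", "compute"),
    Node("node4"),
]

# A compute-heavy job is steered to the nodes tagged "compute".
print(eligible_nodes(cluster, "compute"))  # ['node1', 'node3']
```

In a real cluster, the labels and the queues allowed to use them are set through YARN administration and scheduler configuration; the sketch only illustrates the matching idea described above.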


The HP Big Data reference architecture, together with the Apache Hadoop YARN labels functionality, offers several key benefits, including:

  • Elasticity – ability to scale compute separately from storage, as application workload needs dictate
  • Flexibility – ability to direct each workload to the nodes that best match its compute requirements
  • Maintainability – ability to incrementally incorporate the latest innovations in storage, networking and server technologies

As members of the Apache Hadoop community, we constantly look to question conventional wisdom and deliver differentiated solutions. The HP Big Data Analytics reference architecture represents a rethinking that adds unique value to Hadoop deployments.

