The Hortonworks Blog

It is that time of the year again!

The annual Apache HBase conference, HBaseCon 2015, is around the corner, and as always, it is packed with action and illuminating talks.

The conference is this Thursday, May 7th. As in previous years, there will be 4 tracks covering Operations, Internals, Ecosystem and Use Cases.

Here are a few sessions that I am personally excited about:

This year, SQL solutions are well represented.…

This week we are participating in the Microsoft Ignite conference in Chicago. Microsoft Ignite focuses on all Microsoft technologies and professionals, and we are excited to demonstrate all of the ways we’ve been working with Microsoft to Do Hadoop together. As a longtime Microsoft partner, we are glad to be participating in this event for the 3rd year in a row, showing off a history of joint engineering and commitment to the Microsoft platforms and users.…

This is the third post in a series that explores the theme of supporting rolling-upgrades & downgrades of a Hadoop YARN cluster. See here for an introductory post.

Introduction

Carrying out a rolling upgrade/downgrade of all nodes in a Hadoop cluster can be a very disruptive process. Before HDP 2.2, if a NodeManager (NM) was brought down, all active containers on that node would be killed. This would significantly interrupt all applications in the cluster being upgraded/downgraded.…

It’s going to be a big week at EMC World! We’ll be exhibiting at the event and there are a number of opportunities to meet with us and hear about the partnership between EMC and Hortonworks. We look forward to seeing you there!

Booth

Hortonworks will be in booth #132, right next to the EMC Open@EMC booth. We’d love to meet with you to discuss how EMC Isilon and the Hortonworks Data Platform deliver a Modern Data Architecture.…

In today’s healthcare industry, shifting reimbursement models and increasing costs for supplies and labor come at the same time as mandates to improve care delivery while lowering costs. Improved healthcare outcomes usually come at a higher cost, but more and better data can drive insights and efficiencies that help with both of those opposing pressures, helping care providers create new ways to both practice medicine and do business.

In her blog post entitled “Top 5 Health Care Trends to Watch in 2015,” Susan DeVore described this year’s top healthcare challenges and how the industry is addressing them.…

We at Hortonworks live by a few core principles:

  • Innovate at the core of Hadoop
  • Make Hadoop an Enterprise Class Data Platform
  • Do it all in open source
  • Enable the ecosystem

Our vision of “Hadoop Everywhere” is shared by our partner community, who bring their industry expertise, unique software value-add and passion for customer success to enable transformational change across our joint customers. We as a Hadoop community are succeeding every day in transforming enterprises into data-first organizations.…

Having just returned from our Hadoop Summit Europe event, I was struck by the number of sessions that involved large-scale businesses outlining the impact of their advanced analytic applications (built on Hadoop) and how those analytics are empowering better business decisions.

The story of business value is significant. Session after session, representatives from various industries talked about how their modern data architectures with Hadoop led to increased agility, new innovative customer experiences, and lower cost structures.…

On April 30, learn from experts at Hortonworks, Cisco, and Red Hat about accelerating the implementation of a scalable, cost-efficient and robust Big Data solution. Here is a sneak preview of what you’ll hear from our speakers:

  • Ali Bajawa, Senior Partner Solution Engineer, Hortonworks
  • Ron Graham, System Engineer for Big Data Analytics, Cisco
  • Irshad Raihan, Senior Principal, Big Data Product Marketing, Red Hat

Register Now

1. What should a company consider when looking for a big data solution?…

The Apache Hadoop community is happy to announce the release of Apache Hadoop 2.7.0! We want to express our gratitude to every contributor, reviewer and committer.

The Hadoop community fixed 923 JIRAs in total as part of the 2.7.0 release. Of the 923 fixes:

  • 259 were in Hadoop Common
  • 350 were in HDFS
  • 253 were in YARN
  • 61 were in MapReduce
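
The breakdown above can be checked with a few lines of Python. This is only an illustrative sketch: the counts are copied from the list, while the variable names and the percentage formatting are our own choices rather than anything taken from the release notes.

```python
# JIRA fix counts per component, copied from the list above.
fixes = {
    "Hadoop Common": 259,
    "HDFS": 350,
    "YARN": 253,
    "MapReduce": 61,
}

# The per-component counts should add up to the stated total of 923.
total = sum(fixes.values())
assert total == 923

# Print each component's share of the fixes in this release.
for component, count in fixes.items():
    print(f"{component}: {count} fixes ({count / total:.1%})")
```

The four counts do add up to the stated total of 923, and HDFS accounts for the largest share of the fixes.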

Hadoop 2.7.0 is the first Hadoop release in 2015, following late last year’s 2.6.0.…

Interest in Hadoop as a transformational data platform continues to grow around the world, as more enterprises are building and deploying Hadoop solutions. Hortonworks has been a leader in this regard, as evidenced by the growth of the Hortonworks Data Platform (HDP), with both new and renewing customers worldwide. Customer demand for HDP applications and creative use cases keeps rising. As a result, demand for skilled professional services resources to guide HDP development and deployment represents a tremendous business opportunity for partners.…

Waterline Data is a Hortonworks Technology Partner and recently earned HDP Certification and YARN Ready with their solution that automates the inventory of data assets in the data lake, enables data governance, and provides self-service to data engineers and data scientists to find and understand their data. Learn more by joining the upcoming webinar on May 6, or by downloading the Sandbox tutorial or the joint whitepaper. Our guest blogger is Oliver Claude, CMO at Waterline Data.…

In this blog, Kevin Petrie (Attunity Senior Director of Marketing) joins me to share thoughts on Hadoop and the Enterprise Data Warehouse.

Some believe that Hadoop and the Enterprise Data Warehouse (EDW) will continue to coexist, side-by-side, solving different use cases. The peanut butter is over here, and the chocolate is over there.

At Hortonworks and Attunity, we see something else. We see how Hortonworks subscribers use Hortonworks Data Platform (HDP) for EDW optimization.…

Can you identify the unused data in your data warehouse? Are you using your “big data” efficiently? Are your data migration projects cost effective? Is your data in compliance with industry regulations? If you answered “no” to any or all of these questions, then you may want to learn more about how to optimize your data warehouse.

On April 23rd at 11:00 am PST, Adis Cesir, Big Data Solution Engineer at Hortonworks, Ramu Kalvakuntla, Principal at RCG Global Services Big Data Practice, and Santosh Chitakki, Director of Product Management at Attunity, will be telling us more about rebalancing data warehouses and integrating your current enterprise data warehouse with a Modern Data Architecture.…

Introduction

Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs in Scala, Java, and Python that allow data workers to efficiently execute machine learning algorithms that require fast iterative access to datasets. Spark on Apache Hadoop YARN enables deep integration with Hadoop and other YARN enabled workloads in the enterprise.

In this blog, we will introduce the basic concepts of Apache Spark and the first few necessary steps to get started with Spark on Hortonworks Sandbox.…

Enterprises across all major industries adopt Apache Hadoop for its ability to store and process an abundance of new types of data in a modern data architecture. This “Any Data” capability has always been a hallmark feature of Hadoop, opening insight from new data sources such as clickstream, web and social, geo-location, IoT, server logs, or traditional data sets from ERP, CRM, SCM or other existing data systems.…