The Hortonworks Blog

It is that time of the year again!

The annual Apache HBase conference, HBaseCon 2015, is around the corner, and as always, it is packed with action and illuminating talks.

The conference is this Thursday, May 7th. As in previous years, there will be four tracks covering Operations, Internals, Ecosystem, and Use Cases.

Here are a few sessions that I am personally excited about:

This year, SQL solutions are well represented.…

This week we are participating in the Microsoft Ignite conference in Chicago. Microsoft Ignite focuses on all Microsoft technologies and professionals, and we are excited to demonstrate all of the ways we’ve been working with Microsoft to Do Hadoop together. As a longtime Microsoft partner, we are glad to be participating in this event for the third year in a row, showing off a history of joint engineering and commitment to the Microsoft platforms and users.…

This is the third post in a series that explores the theme of supporting rolling upgrades and downgrades of a Hadoop YARN cluster. See here for an introductory post.

Introduction

Carrying out a rolling upgrade/downgrade of all nodes in a Hadoop cluster can be a very disruptive process. Before HDP 2.2, if a NodeManager (NM) were brought down, all active containers on that node would be killed. This would significantly interrupt all applications in the cluster being upgraded/downgraded.…

It’s going to be a big week at EMC World! We’ll be exhibiting at the event and there are a number of opportunities to meet with us and hear about the partnership between EMC and Hortonworks. We look forward to seeing you there!

Booth

Hortonworks will be in booth #132, right next to the EMC Open@EMC booth. We’d love to meet with you to discuss how EMC Isilon and the Hortonworks Data Platform deliver a Modern Data Architecture.…

We at Hortonworks live by a few core principles:

  • Innovate at the core of Hadoop
  • Make Hadoop an Enterprise-Class Data Platform
  • Do it all in open source
  • Enable the ecosystem

Our vision of “Hadoop Everywhere” is shared by our partner community, who bring their industry expertise, unique software value-add, and passion for customer success to enable transformational change across our joint customers. We as a Hadoop community are succeeding every day in transforming enterprises into data-first organizations.…

Having just returned from our Hadoop Summit Europe event, I was struck by the number of sessions in which large-scale businesses outlined the impact of their advanced analytic applications (built on Hadoop) and how those analytics are empowering better business decisions.

The story of business value is significant. Session after session, representatives from various industries talked about how their modern data architectures with Hadoop led to increased agility, new innovative customer experiences, and lower cost structures.…

On April 30, learn from experts at Hortonworks, Cisco, and Red Hat about accelerating the implementation of a scalable, cost-efficient and robust Big Data solution. Here is a sneak preview of what you’ll hear from our speakers:

  • Ali Bajawa, Senior Partner Solution Engineer, Hortonworks
  • Ron Graham, System Engineer for Big Data Analytics, Cisco
  • Irshad Raihan, Senior Principal, Big Data Product Marketing, Red Hat

1. What should a company consider when looking for a big data solution?…

The Apache Hadoop community is happy to announce the release of Apache Hadoop 2.7.0! We want to express our gratitude to every contributor, reviewer and committer.

The Hadoop community fixed 923 JIRAs in total as part of the 2.7.0 release. Of the 923 fixes:

  • 259 were in Hadoop Common
  • 350 were in HDFS
  • 253 were in YARN
  • 61 were in MapReduce

Hadoop 2.7.0 is the first Hadoop release in 2015, following 2.6.0 late last year.…

Interest in Hadoop as a transformational data platform continues to grow around the world, as more enterprises are building and deploying Hadoop solutions. Hortonworks has been a leader in this regard, as evidenced by the growth of the Hortonworks Data Platform (HDP), with both new and renewing customers worldwide. Customer demand for HDP applications and creative use cases is reaching ever-increasing levels. As such, demand for skilled professional services resources to guide HDP development and deployment represents a tremendous business opportunity for partners.…

Waterline Data is a Hortonworks Technology Partner and recently earned HDP Certification and YARN Ready with its solution that automates the inventory of data assets in the data lake, enables data governance, and provides self-service for data engineers and data scientists to find and understand their data. Learn more by joining the upcoming webinar on May 6 or by downloading the Sandbox tutorial or joint whitepaper. Our guest blogger is Oliver Claude, CMO at Waterline Data.…

Can you identify the unused data in your data warehouse? Are you using your “big data” efficiently? Are your data migration projects cost effective? Is your data in compliance with industry regulations? If you answered “no” to any or all of these questions, then you may want to learn more about how to optimize your data warehouse.

On April 23rd at 11:00 am PST, Adis Cesir, Big Data Solution Engineer at Hortonworks, Ramu Kalvakuntla, Principal at RCG Global Services Big Data Practice, and Santosh Chitakki, Director of Product Management at Attunity, will be telling us more about rebalancing data warehouses and integrating your current enterprise data warehouse with a Modern Data Architecture.…

Introduction

Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs in Scala, Java, and Python that allow data workers to efficiently execute machine learning algorithms that require fast iterative access to datasets. Spark on Apache Hadoop YARN enables deep integration with Hadoop and other YARN-enabled workloads in the enterprise.

In this blog post, we will introduce the basic concepts of Apache Spark and the first few steps needed to get started with Spark on Hortonworks Sandbox.…

Enterprises across all major industries adopt Apache Hadoop for its ability to store and process an abundance of new types of data in a modern data architecture. This “Any Data” capability has always been a hallmark feature of Hadoop, opening insight from new data sources such as clickstream, web and social, geo-location, IoT, server logs, or traditional data sets from ERP, CRM, SCM or other existing data systems.…

Hortonworks is pleased to announce the general availability of Apache Spark in Hortonworks Data Platform (HDP), now available on our downloads page. With HDP 2.2.4, Hortonworks now offers support for your developers and data scientists using Apache Spark 1.2.1.

HDP’s YARN-based architecture enables multiple applications to share a common cluster and dataset while ensuring consistent levels of service and response. Now Spark is one of the many data access engines that work with YARN and are supported in an HDP enterprise data lake.…

Hortonworks Data Platform (HDP) provides centralized enterprise services for comprehensive security, covering end-to-end protection, access, compliance, and auditing of data in motion and at rest. HDP’s centralized architecture, with Apache Hadoop YARN at its core, also provides consistent operations for provisioning, management, monitoring, and deployment of Hadoop clusters for a reliable enterprise-ready data lake.

But comprehensive security and consistent operations go together, and neither is possible in isolation.

We recently published two blog posts announcing Ambari 2.0 and its new ability to manage rolling upgrades.…