cta

Get Started

cloud

Ready to Get Started?

Download sandbox

How can we help you?

closeClose button
May 26, 2017 | Tom Hastain | Hortonworks Case Study

Precision Medicine: a 5 Million Person Case Study

May 26, 2017 | Carole Gum | Hortonworks Community Connection

Don’t miss the Business of Data at DataWorks Summit

May 26, 2017 | Anna Yong

Open Source Talent Powers Big Data Success

Viewing posts: From the Dev Team« Back to all

X
FILTERS
ALL
TECHNICAL
BUSINESS

All Topics















All Channels











CLEAR FILTERS

With Apache Hadoop YARN as its architectural center, Apache Hadoop continues to attract new engines to run within the data platform, as organizations want to efficiently store their data in a single repository and interact with it for batch, interactive and real-time streaming use cases. Apache Storm brings real-time data processing capabilities to help capture […]

Hortonworks architects vertically integrate the projects within our Hadoop distribution with YARN and HDFS in order to enable HDP to span workloads from batch, interactive, and real time—across both open source and other data access technologies. In HDP 2.2, we deliver work to vertically integrate Apache Storm, Apache Accumulo and Apache HBase so that all […]

On November 13th, Hortonworks presented the fourth of 8 Discover HDP 2.2 webinars: Rohit Bakhshi, Jitendra Pandey, and Justin Sears hosted this 4th webinar in the series. Rohit Bakhshi and Jitendra Pandey introduced HDP and discussed how to use HDFS for reliable, scalable, cost-efficient, and fault tolerant as a distributed data storage platform for your […]

The Stinger.next initiative, with its focus on transactions, sub-second queries and SQL:2011 Analytics evolves Apache Hive to allow it to run most of the analytical workloads that are typical within a data warehouse, but now at petabyte scale. The first phase of Stinger.Next, delivered in Apache Hive 0.14 and in HDP 2.2, delivers transactions with […]

With Apache Hadoop YARN as its architectural center, Apache Hadoop continues to attract new engines to run within the data platform, as organizations want to efficiently store their data in a single repository and interact with it in different ways. As YARN propels Hadoop’s emergence as a business-critical data platform, the enterprise requires more stringent […]

The architecture of Hortonworks Data Platform (HDP) matches the blueprint for Enterprise Apache Hadoop, with data management, data access, governance, operations and security. This post focuses on one of those core components: security. Specifically, we will focus on Apache Knox Gateway for securing access to the Hadoop REST APIs. Pseudo Federation Provider This blog will […]

With Apache Hadoop YARN as its architectural center, Apache Hadoop continues to attract new engines to run within the data platform, as organizations want to efficiently store their data in a single repository and interact with it for batch, interactive and real-time streaming use cases. More and more independent software vendors (ISVs) are developing applications […]

Introduction In this 2nd part of the blog post and its accompanying IPython Notebook in our series on Data Science and Apache Hadoop, we continue to demonstrate how to build a predictive model with Apache Hadoop, using existing modeling tools. And this time we’ll use Apache Spark and ML-Lib. Apache Spark is a relatively new […]

Hadoop Operations for provisioning, managing and monitoring a cluster are critical to the success of a Hadoop project and having an intuitive and effective set of tooling has become a foundational element of a Hadoop distribution. Within HDP, we provide completely open source Apache Ambari to help you be successful with Hadoop operations. The rate […]

Our customers have many choices of infrastructure to deploy HDP: on premise, cloud, virtualized and even as an appliance. Further, our customers have a choice of deploying on Linux and Windows operating systems. You can easily see this creates a complex matrix. At Hortonworks, we believe you should not be limited to just one option […]

It gives me great pleasure to announce that the Apache Hadoop community has released Apache Hadoop 2.6.0 ! In particular, we are excited about three major pieces in this release: heterogeneous storage in HDFS with SSD & Memory tiers, support for long-running services in YARN and rolling upgrades—the ability to upgrade your cluster software and […]

With YARN as its architectural center, Apache Hadoop continues to attract new engines to run within the data platform, as organizations want to efficiently store their data in a single repository and interact with it simultaneously in different ways. Apache Tez supports YARN-based, high performance batch and interactive data processing applications in Hadoop that need […]

While YARN has allowed new engines to emerge for Hadoop, the most popular integration point with Hadoop continues to be SQL and Apache Hive is still the defacto standard. Although many SQL engines for Hadoop have emerged, their differentiation is being rendered obsolete as the open source community surrounds and advances this key engine at […]

A Cosmopolitan Metropolis Brussels, Belgium, conjures images of a cosmopolitan metropolis, where geopolitical summits are held, where world economic forums are debated, where global European institutions are headquartered, and where citizens and diplomats fluently converse in more than three languages—English, French, Dutch or German, along with other non-official local flavors. To this colorful collage, add […]

Two weeks ago Hortonworks presented the third in series of 8 Discover HDP 2.2 webinars: Discover HDP 2.2: Discover HDP 2.2: Apache Falcon for Hadoop Data Governance. Andrew Ahn, Venkatesh Seetharam, and Justin Sears hosted this 3rd webinar in the series. After Justin Sears set the stage for the webinar by explaining the drivers behind […]