The Hortonworks Blog

Posts categorized by : HDP

Today we are proud to announce the delivery of Apache Ambari 1.4.1. Ambari 1.4.1 combines many months of work in the community advancing the Ambari codebase. Over 760 JIRAs have been resolved since the Ambari 1.2.5 release. We would like to thank the nearly 40 engineers who contributed to help make this release possible.

Hello Hadoop 2, Meet Apache Ambari The most important addition to Ambari 1.4.1 is support for installing, managing and monitoring a cluster based on the Hadoop 2 stack.…

The Hortonworks HBase team is excited to see HBase 96 released.  It represents a broad community effort and massive amount of work that has been building for more than a year.

HBase 96 closes out over 2000 issues (2134 Jira tickets to be exact) and it represented the collective work from a VERY active community. Kudos to everyone involved! As the authors in a recent Apache blog alluded to, the HBase community is very healthy and includes developers from many companies including Hortonworks, Yahoo!, Cloudera, Salesforce, eBay, Intel, and Facebook, just to name just a few.…

This post is authored by Omkar Vinit Joshi with Vinod Kumar Vavilapalli and is the 8th post in the multi-part blog series on Apache Hadoop YARN – a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters. Other posts in this series: 

Introduction

In YARN, applications perform their work by running containers, which today map to processes on the underlying operating system.…

You did it! Last Sunday we challenged you to “Learn Hadoop in 7 days”. We hope that you have risen to the test and kept up with the tutorials we’ve posted each day through Twitter and Facebook. These tutorials should have helped you delve into:

By now, you should feel comfortable with Hadoop clickstream analysis, Hortonworks ODBC driver configuration, and many other important components of Hadoop.…

Stinger is not a product.  Stinger is a broad community based initiative to bring interactive query at petabyte scale to Hadoop. And today, as representatives of this open, community led effort we are very proud to announce delivery of Apache Hive 0.12, which represents the critical second phase of this project!

Only five months in the making, Apache Hive 0.12 comprises over 420 closed JIRA tickets contributed by ten companies, with nearly 150 thousand lines of code! …

An important tool in the Hadoop developer toolkit is the ability to look at key metrics for a MapReduce job – to understand the performance of each job and to optimize future job runs.

In this blog article, we’ll explore how HDP 2.0 stores and provides insight into the performance of a MapReduce job on YARN.

Change from MapReduce v1 and HDP 1.x

In MapReduce-v2 on YARN in HDP 2.0, the JobTracker no longer exists.…

Apache Storm and YARN extend Hadoop to handle real time processing of data and provides the ability to process and respond events as they happen. Our customers have told us many use cases for this technology combination and below we present a demo example complete with code so you can try it yourself.

For the demo below, we used our Sandbox VM which is a full implementation of the Hortonworks Data Platform.…

Hortonworks will be making a preview of Apache Storm integration available in Q4 of this year and will be including Apache Storm in the Hortonworks Data Platform in first half of 2014.

Any time now, the Apache Hadoop community will declare the General Availability of Hadoop 2.0 which includes the much anticipated Apache Hadoop YARN.  The YARN-based architecture of Hadoop 2 is the most significant change to Hadoop introduced in the past six years and enables Hadoop to expand from a single-purpose, batch-oriented data platform based on MapReduce into a truly multi-purpose platform supporting a wide range of data processing approaches.…

As part of a modern data architecture, Hadoop needs to be a good citizen and trusted as part of the heart of the business. This means it must provide for all the platform services and features that are expected of an enterprise data platform.

The Hadoop Distributed File System is the rock at the core of HDP and provides reliable, scalable access to data for all analytical processing needs. With HDP 2.0, built into the platform itself, HDFS now has automated failover with a hot standby, with full stack resiliency.…

Security is one of the biggest topics in Hadoop right now. Historically Hadoop has been a back-end system accessed only by a few specialists, but the clear trend is for companies to put data from Hadoop clusters in the hands of analysts, marketers, product managers or call center employees whose numbers could be in the hundreds or thousands. Data security and privacy controls are necessary before this transformation can occur. HDP2, through the next release of Apache Hive introduces a very important new security feature that allows you to encrypt the traffic that flows between Hadoop and popular analytics tools like Microstrategy, Tableau, Excel and others.…

On October 16, we’ve been invited to join our partner SAP to talk Big Data and how the integrated SAP HANA + Hadoop approach can solve your big data challenges. This chat will be a live Google Hangout with:

  • Irfan Khan, SVP & GM SAP Global Big Data at SAP (@i_kHANA)
  • Ari Zilka,  CTO at Hortonworks (@ikarzali)
  • Timo Elliot, Innovation Evangelist at SAP (@timoelliott)

When: Wednesday, October 16, 8am PT / 11am ET / 5pm CET…

We’re continuing our series of quick interviews with Apache Hadoop project committers at Hortonworks.

This week Enis Soztutar discusses Apache HBase, built for random read/write access to data in billions of rows and millions of columns.

Enis began using Apache Hadoop in 2006. Now, Enis is a Hortonworks engineer and Apache HBase project management chair. He has also been a committer to Apache Hadoop since 2007 and to HBase since 2012.…

This is a guest blog post from our partner, Actuate. They’ve been generous enough to create some great Hadoop tutorials on the Open Source BIRT project that use the Hortonworks Sandbox.

By now, Apache™ Hadoop® has become synonymous with the first stage of Big Data: storing, processing and managing huge volumes and varieties of structured and unstructured data. Yet the data stored by Hadoop remains unreadable to the average business user.…

A crucial requirement of any enterprise technology is to ensure simplest possible management and operation. We think that simplicity means two things: 1) integration with existing infrastructure and tools and 2) leveraging existing knowledge and skills.

Download the beta release of Ambari SCOM Management Pack here.

Ambari (http://incubator.apache.org/ambari/) was introduced as an Apache incubator project with the aim of developing the best management tool for Hadoop applying our principles of open source community development for rapid innovation and solving the right problems for enterprises.…

Thanks to all those who joined in person and virtually for the Apache Ambari Meetup at Hortonworks this week. We talked tech, we saw demos, we laughed, we cried, we ate pizza.

The central theme of the night was the newly added support for Hadoop 2. Ambari now has:

  • Hadoop 2 Stack: Ambari adds support for installing, managing and monitoring a Hadoop 2 Stack.
  • NameNode HA: Configure NameNode High Availability based on QJM support built-into HDFS2
  • YARN: Ambari manages YARN Service lifecycle and automatically deploys the MapReduce2 framework.
Go to page:« First...89101112...Last »