The Hortonworks Blog

Posts categorized by: Operations & Management

Last Thursday we hosted the last of our seven Discover HDP 2.1 webinars, Using Apache Ambari to Manage Hadoop Clusters. Over 140 people attended and joined in the conversation.

The speakers gave an overview of Apache Ambari, discussed new features, and showed an end-to-end demo.

Thanks to our presenters: Justin Sears (Hortonworks’ Product Marketing Manager), Jeff Sposetti (Hortonworks’ Senior Director of Product Management), and Mahadev Konar (Hortonworks’ Co-founder, Committer, and PMC Member for Apache Hadoop, Apache Ambari, and Apache ZooKeeper).…

IBM InfoSphere Guardium has been certified with HDP 2.1. The Hortonworks Certified Technology Program simplifies big data planning by providing pre-built and validated integrations between leading enterprise technologies and HDP.

Kathryn Zeidenstein, InfoSphere Guardium Evangelist, is our guest blogger and describes security, Hadoop, and the Guardium solution.

Those of us in the data security and privacy space tend to worry a lot. With each new breaking story on the latest data breach, and with the subsequent fallout, people higher and higher up the food chain are also worrying a lot.…

Customers’ Hadoop Journey

We’ve all had two weeks to reflect on Hadoop Summit 2014. One of the biggest differences that stood out in this year’s Summit (as compared to Summit 2013) was the presence of large enterprise customers that are using Apache Hadoop as an important part of their modern data architectures.

Hadoop has gone beyond its original Yahoo use case (indexing the web via a nightly batch MapReduce process) and into the mainstream of daily data processing and analytics with real-time, online, interactive, and batch applications at many notable companies.…

Apache Ambari has always provided an operator the ability to provision an Apache Hadoop cluster using an intuitive Cluster Install Wizard web interface, guiding the user through a series of steps:

  • confirming the list of hosts
  • assigning master, slave, and client components
  • configuring services, and
  • installing, starting and testing the cluster.

With Ambari Blueprints, system administrators and dev-ops engineers can expedite the process of provisioning a cluster. Once defined, Blueprints can be re-used, which facilitates easy configuration and automation for each successive cluster creation.…
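To make the idea of a reusable cluster definition concrete, here is a minimal sketch that assembles a Blueprint-style JSON document in Python. The top-level `Blueprints` and `host_groups` keys follow the Blueprint format as we understand it; the helper `build_blueprint`, the blueprint name, and the host-group layout are hypothetical and chosen only for illustration, so check the Ambari documentation before relying on any field.

```python
import json


def build_blueprint(name, stack_name, stack_version, host_groups):
    """Return a dict shaped like an Ambari Blueprint document.

    host_groups maps a group name to (cardinality, [component names]).
    This helper is illustrative; it is not part of Ambari.
    """
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": stack_name,
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": group,
                "cardinality": str(cardinality),
                "components": [{"name": c} for c in components],
            }
            for group, (cardinality, components) in host_groups.items()
        ],
    }


# Hypothetical layout: one master host group and one worker host group.
blueprint = build_blueprint(
    name="example-blueprint",
    stack_name="HDP",
    stack_version="2.1",
    host_groups={
        "master": (1, ["NAMENODE", "RESOURCEMANAGER", "ZOOKEEPER_SERVER"]),
        "worker": (3, ["DATANODE", "NODEMANAGER"]),
    },
)

# Serialize the definition; the same document can be reused for every new cluster.
print(json.dumps(blueprint, indent=2))
```

Because the definition is plain data, it can be kept in version control and submitted to Ambari's REST API each time a new cluster is needed, which is where the re-use described above comes from.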

Since the partnership between Hortonworks and Splunk and the release of Hunk last year, we have created some awesome assets (e.g., the Hunk sandbox tutorial and the 360-degree customer view webinar) that give Hadoop and Big Data enthusiasts hands-on training with Big Data. You can find more details about our partnership and resources here: http://hortonworks.com/partner/splunk/

As part of our HDP 2.1 certification series, I would like to introduce Brett Sheppard, Director of Product Marketing for Big Data at Splunk.…

We recently hosted the fourth of our seven Discover HDP 2.1 webinars, entitled Apache Hadoop 2.4.0, HDFS and YARN. It was well attended and informative. The speakers outlined the new features in YARN and HDFS in HDP 2.1, including:

  • HDFS Extended ACLs
  • HTTPS support for WebHDFS and for the Hadoop web UIs
  • HDFS Coordinated DataNode Caching
  • YARN Resource Manager High Availability
  • Application Monitoring through the YARN Timeline Server
  • Capacity Scheduler Preemption

Many thanks to our presenters, Rohit Bakhshi (Hortonworks’ senior product manager), Vinod Kumar Vavilapalli (co-author of the YARN Book, PMC, Hadoop YARN Project Lead at Apache and Hortonworks), and Justin Sears (Hortonworks’ Product Marketing Manager).…

Traditionally, HDFS, Hadoop’s storage subsystem, has focused on one kind of storage medium, namely spindle-based disks. However, a Hadoop cluster can contain significant amounts of memory, and with the continued drop in memory prices, customers are willing to add memory targeted at caching storage to speed up processing.

Recently HDFS generalized its architecture to include other kinds of storage media, including SSDs and memory [1]. We also added support for caching hot files in memory [2].…

Informatica is a Hortonworks Certified Technology Partner. This partnership makes it possible for organizations to use all the data internal and external to an enterprise to achieve the full predictive power that drives the success of modern data-driven businesses. 

That is why we’re excited to have John Haddad, Senior Director at Informatica, as our guest blogger. In this blog, John explores the benefits of certification on HDP 2.1.

When I was in high school, one of my best friends had a water ski boat we often took out on California lakes (what are friends for?).…

On Wednesday, May 21, Himanshu Bari (Hortonworks’ senior product manager), Venkatesh Seetharam (committer to Apache Falcon), and Justin Sears (Hortonworks’ Product Marketing Manager) hosted the third of our seven Discover HDP 2.1 webinars. Himanshu and Venkatesh discussed data governance in Hadoop through Apache Falcon, which is included in HDP 2.1. As most of you know, ingesting data into Hadoop is one thing; having data governed, by dictating and defining data-pipeline policies, is another thing, and a necessity in the enterprise.…

This is the first post in our series on the motivations and architecture for improvements to Apache Hadoop YARN’s Resource Manager Restart resiliency. Other posts in the series are:

The Resource Manager (RM) is the central authority of Apache Hadoop YARN for resource management and scheduling. It is responsible for allocating resources to applications such as Hadoop MapReduce jobs, Apache Tez DAGs, and other applications running atop YARN.…

Yesterday the Apache Ambari community proudly released version 1.5.1. This is the result of constant, concerted collaboration among the Ambari project’s many members. This release represents the work of over 30 individuals over 5 months and, combined with the Ambari 1.5.0 release, resolves more than 1,000 JIRAs.

This version of Ambari makes huge strides in simplifying the deployment, management and monitoring of large Hadoop clusters, including those running Hortonworks Data Platform 2.1.…

Three weeks ago, we announced availability of the technical preview of Hortonworks Data Platform (HDP) version 2.1, and since then we have had thousands of downloads of this preview. We also promised delivery of GA bits on April 22nd, and we are delighted to deliver as stated. HDP 2.1, which includes countless new features across seven new components, is available today from our download page.

YARN unlocks the Data Lake

YARN, the resource management layer of Hadoop 2, is delivering value as it has unlocked the data lake vision for many.…

The pace of innovation within the Apache Hadoop community is truly remarkable, enabling us to announce the availability of Hortonworks Data Platform 2.1, incorporating the very latest innovations from the Hadoop community in an integrated, tested, and completely open enterprise data platform.

Download HDP 2.1 Technical Preview Now

What’s In Hortonworks Data Platform 2.1?

The advancements in HDP 2.1 span every aspect of Enterprise Hadoop: data management, data access, integration & governance, security, and operations. …

Compuware is a Hortonworks Technology Partner and this week announced the availability of the newest release of APM for Big Data. This release provides enhanced support for Hadoop 2.0 and Hortonworks Data Platform (HDP) 2.0.

Compuware’s APM for Big Data now provides greater visibility into Hadoop job details with out-of-the-box dashboards that require no configuration. The graphical dashboards expand insight and make Hadoop deployments easier to analyze. With the Hadoop-focused dashboards, customers can get information about any Hadoop cluster and summarized overviews of cluster utilization across users, jobs, pools, queues and more.…

In this post, we will explore how to quickly and easily spin up our own VM with Vagrant and Apache Ambari. Vagrant is very popular with developers as it lets one mirror the production environment in a VM while staying with all the IDEs and tools in the comfort of the host OS.

If you’re just looking to get started with Hadoop in a VM, then you can simply download the Hortonworks Sandbox.…
