The Hortonworks Blog

Posts categorized by: Innovation from Hortonworks

On October 15 we announced that we would support Apache Hadoop as an Infrastructure as a Service (IaaS) on Microsoft Azure. This made us the first Hadoop vendor to give customers and prospects access to that flexible and scalable cloud infrastructure for their big data deployments.

This guide walks you through using the Azure Gallery to quickly deploy Hortonworks Data Platform (HDP) clusters on Microsoft Azure IaaS.

What you need is:

  • A Microsoft Azure account
  • That’s it!

Arsalan Tavakoli-Shiraji, customer engagement lead overseeing business development activities at Databricks, is our guest blogger today. In this blog, he discusses our expanded partnership built around Apache Spark on Apache Hadoop in three areas: customers, engineering, and open source.

Today Databricks and Hortonworks are announcing an expanded partnership built around Apache Spark; allow me to explain why we’re thrilled to be embarking on this journey with them.

When we started Databricks last summer, Apache Spark was in the early stages of enterprise adoption.…

A few weeks back, we outlined a broad initiative to invest in Spark in the context of the Hadoop ecosystem. We intend to facilitate a more efficient utilization of Hadoop cluster resources for ETL and/or Data Pipeline workloads when using Spark. Many of the lessons learned while building out MapReduce, Apache Tez and other YARN data-processing frameworks can be applied to the Spark project in order to optimize its resource utilization and to make it a good multi-tenant citizen within a YARN-based Hadoop cluster.…

Exponential increases in data volumes have forced data architects and analysts to build much larger, distributed data environments, potentially comprising hundreds or even thousands of servers and switches. Scaling to these cluster sizes brings challenges in cost, security, and integration with existing infrastructure.

The combination of the Hortonworks Data Platform (HDP) and Cisco UCS allows IT departments and business decision makers to adopt a new, cost-effective, massively scalable, and secure approach to data management within the enterprise.…

Last week Hortonworks presented the first of eight Discover HDP 2.2 webinars: Comprehensive Hadoop Security with Apache Ranger and Apache Knox. Vinay Shukla and Balaji Ganesan hosted this first webinar in the series.

Balaji discussed how to use Apache Ranger for centralized security administration, to set up authorization policies, and to monitor user activity with auditing. He also covered Ranger innovations now included in HDP 2.2:

  • Support for Apache Knox and Apache Storm, for centralized authorization and auditing
  • Deeper integration of Ranger with the Apache Hadoop stack with support for local grant/revoke in HDFS and HBase
  • Ranger’s enterprise readiness, with the introduction of REST APIs for policy management, and scalable storage of audit in HDFS

Vinay presented Apache Knox and API security for Apache Hadoop.…

As a Hortonworks Certified Technology Partner, Nimble Storage delivers solutions that help enterprises scale their Big Data solutions, simply and cost-effectively. Ibby Rahmani, Nimble’s product and solutions marketing manager, is our guest blogger. He discusses the importance of certification for Nimble and Hortonworks customers.

Recently, Nimble Storage proudly announced our partnership with Hortonworks, and the certification of Nimble’s Adaptive Flash solutions on HDP. Nimble offers an exceptional return on investment (ROI) for the Hortonworks Data Platform (HDP), with accelerated performance, capacity efficiency, and integrated data protection in solutions that are easy to deploy, operate and maintain.…

We recently hosted a Spark webinar as part of the YARN Ready series, aimed at a technical audience including developers of applications for Apache Hadoop and Apache Hadoop YARN. During the event, a number of good questions surfaced that we wanted to share with our broader audience in this blog. Take a look at the video and slides along with these questions and answers below.

You can listen to the entire webinar recording here.…

Merv Adrian, the widely respected Gartner analyst, recently remarked on the continuing evolution of Apache Hadoop:

YARN is the one that really matters because it doesn’t just mean the list of components will change, but because in its wake the list of components will change Hadoop’s meaning. YARN enables Hadoop to be more than a brute force, batch blunt instrument for analytics and ETL jobs. It can be an interactive analytic tool, an event processor, a transactional system, a governed, secure system for complex, mixed workloads.…

HDFS metadata represents the structure of HDFS directories and files in a tree. It also includes the various attributes of directories and files, such as ownership, permissions, quotas, and replication factor. In this blog post, I’ll describe how HDFS persists its metadata in Hadoop 2 by exploring the underlying local storage directories and files. All examples shown are from testing a build of the soon-to-be-released Apache Hadoop 2.6.0.

WARNING: Do not attempt to modify metadata directories or files.…
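As a rough illustration of the description in the excerpt above (a tree of directories and files, each carrying attributes such as ownership, permissions, quotas, and replication factor), here is a minimal in-memory sketch. The `INode` class, its field names, and the `lookup` helper are hypothetical names chosen for this example; they are not the NameNode's actual classes or its on-disk format, which the full post covers.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class INode:
    """Hypothetical, simplified stand-in for one entry in the metadata tree."""
    name: str
    owner: str
    permissions: int                     # POSIX-style mode bits, e.g. 0o755
    is_dir: bool = False
    replication: Optional[int] = None    # assumed meaningful for files only
    quota: Optional[int] = None          # assumed meaningful for directories only
    children: Dict[str, "INode"] = field(default_factory=dict)

    def add(self, child: "INode") -> "INode":
        """Attach a child entry; only directories may hold children."""
        if not self.is_dir:
            raise ValueError(f"{self.name!r} is a file and cannot hold children")
        self.children[child.name] = child
        return child


def lookup(root: INode, path: str) -> INode:
    """Walk the tree from the root, one path component at a time."""
    node = root
    for part in filter(None, path.split("/")):
        node = node.children[part]
    return node


# Build a tiny tree: / -> user -> data.csv
root = INode(name="", owner="hdfs", permissions=0o755, is_dir=True)
user = root.add(INode(name="user", owner="hdfs", permissions=0o755,
                      is_dir=True, quota=1000))
user.add(INode(name="data.csv", owner="alice", permissions=0o644, replication=3))

f = lookup(root, "/user/data.csv")
print(f.owner, oct(f.permissions), f.replication)  # alice 0o644 3
```

The sketch only models the logical shape of the metadata. It says nothing about how that metadata is persisted in local storage directories, which is the subject of the post itself.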

Joe Travaglini, director of product marketing at Sqrrl, and Ely Kahn, vice president of business development at Sqrrl, are our guest bloggers. They explain Sqrrl’s integration with Hortonworks Data Platform (HDP).

There Is No Secure Perimeter

With the dawn of phenomena such as Cloud Computing and Bring Your Own Device (BYOD), it is no longer the case that there is a well-defined perimeter to secure and defend. Data is able to flow inside, outside, and across your network boundaries with limited interference from traditional controls.…

Last week’s release of Hortonworks Data Platform 2.2 is packed with countless new features for Enterprise Hadoop. These include the results of Hortonworks’ investment in VERTICAL integration with YARN and HDFS, as well as HORIZONTAL innovation to ensure that the key enterprise services of governance, security, and operations can be applied consistently and reliably across all the components within the Apache Hadoop platform.

To guide you through these capabilities, Hortonworks is hosting a new series of eight Thursday webinars beginning on October 23 and running to December 18.…

Enterprise Apache Hadoop provides the fundamental data services required to deploy into existing architectures. These include security, governance and operations services, in addition to Hadoop’s original core capabilities for data management and data access. This post focuses on recent work completed in the open source community to enhance the Hadoop security component, with encryption and SSL certificates.

Last year I wrote a blog summarizing wire encryption options in Hortonworks Data Platform (HDP).…

Introduction

Hortonworks University announces a new operationally focused course for Apache Hadoop administrators. This two-day training course is designed for Hadoop administrators who are familiar with administering other Hadoop distributions and are migrating to the Hortonworks Data Platform (HDP). Through a combination of lecture and hands-on exercises you will learn how to install, configure, maintain and scale an HDP cluster.

Target Audience

This course is designed for experienced Hadoop administrators and operators who will be responsible for installing, configuring and supporting the Hortonworks Data Platform.…

Hortonworks Data Platform Version 2.2 represents yet another major step forward for Hadoop as the foundation of a Modern Data Architecture. This release incorporates the last six months of innovation, includes more than a hundred new features, and closes thousands of issues across Apache Hadoop and its related projects.

Our approach at Hortonworks is to enable a Modern Data Architecture with YARN as the architectural center, supported by key capabilities required of an enterprise data platform — spanning Governance, Security and Operations.…

More and more enterprises are looking to the cloud as a place to handle a variety of their data processing and backup needs. Apache Hadoop lends itself to running in cloud environments because of the alignment around scalability and flexibility for compute and storage offered with today’s cloud infrastructures. Today, we are excited to announce that the Hortonworks Data Platform (HDP) is the first platform to be certified to run on Azure Infrastructure as a Service.…
