The Hortonworks Blog

Posts categorized by: Innovation from Hortonworks

Merv Adrian, the widely respected Gartner analyst, recently remarked on the continuing evolution of Apache Hadoop:

YARN is the one that really matters because it doesn’t just mean the list of components will change, but because in its wake the list of components will change Hadoop’s meaning. YARN enables Hadoop to be more than a brute force, batch blunt instrument for analytics and ETL jobs. It can be an interactive analytic tool, an event processor, a transactional system, a governed, secure system for complex, mixed workloads.…

HDFS metadata represents the structure of HDFS directories and files in a tree. It also includes the various attributes of directories and files, such as ownership, permissions, quotas, and replication factor. In this blog post, I’ll describe how HDFS persists its metadata in Hadoop 2 by exploring the underlying local storage directories and files. All examples shown are from testing a build of the soon-to-be-released Apache Hadoop 2.6.0.

WARNING: Do not attempt to modify metadata directories or files.…

Joe Travaglini, director of product marketing at Sqrrl, and Ely Kahn, vice president of business development at Sqrrl, are our guest bloggers. They explain Sqrrl’s integration with Hortonworks Data Platform (HDP).

There Is No Secure Perimeter

With the dawn of phenomena such as Cloud Computing and Bring Your Own Device (BYOD), it is no longer the case that there is a well-defined perimeter to secure and defend. Data is able to flow inside, outside, and across your network boundaries with limited interference from traditional controls.…

Last week’s release of Hortonworks Data Platform 2.2 is packed with countless new features for Enterprise Hadoop. These include the results of Hortonworks’ investment in VERTICAL integration with YARN and HDFS and also HORIZONTAL innovation to ensure the key enterprise services of governance, security, and operations can be applied consistently and reliably across all the components within the Apache Hadoop platform.

To guide you through these capabilities, Hortonworks is hosting a new series of eight Thursday webinars beginning on October 23 and running to December 18.…

Enterprise Apache Hadoop provides the fundamental data services required to deploy into existing architectures. These include security, governance and operations services, in addition to Hadoop’s original core capabilities for data management and data access. This post focuses on recent work completed in the open source community to enhance the Hadoop security component, with encryption and SSL certificates.

Last year I wrote a blog summarizing wire encryption options in Hortonworks Data Platform (HDP).…

Introduction

Hortonworks University announces a new operationally focused course for Apache Hadoop administrators. This two-day training course is designed for Hadoop administrators who are familiar with administering other Hadoop distributions and are migrating to the Hortonworks Data Platform (HDP). Through a combination of lecture and hands-on exercises you will learn how to install, configure, maintain and scale an HDP cluster.

Target Audience

This course is designed for experienced Hadoop administrators and operators who will be responsible for installing, configuring and supporting the Hortonworks Data Platform.…

Hortonworks Data Platform Version 2.2 represents yet another major step forward for Hadoop as the foundation of a Modern Data Architecture. This release incorporates the last six months of innovation and includes more than a hundred new features and closes thousands of issues across Apache Hadoop and its related projects.

Our approach at Hortonworks is to enable a Modern Data Architecture with YARN as the architectural center, supported by key capabilities required of an enterprise data platform — spanning Governance, Security and Operations.…

More and more enterprises are looking to the cloud as a place to handle a variety of their data processing and backup needs. Apache Hadoop lends itself to running in cloud environments because of the alignment around scalability and flexibility for compute and storage offered with today’s cloud infrastructures. Today, we are excited to announce that the Hortonworks Data Platform (HDP) is the first platform to be certified to run on Azure Infrastructure as a Service.…

Apache Hadoop has taken a mission critical role in the Modern Data Architecture (MDA) with the advent of Apache Hadoop YARN. YARN has enabled enterprises to store and process data across many execution engines at a scale that has not been possible earlier. This in turn has made security a crucial component of enterprise Hadoop. At Hortonworks we have broken the problem of enterprise security into four key areas of focus: authentication, authorization, auditing and data protection.…

Since its first deployment at Yahoo in 2006, HDFS has established itself as the de facto scalable, reliable and robust file system for Big Data. It has addressed several fundamental problems of distributed storage at unparalleled scales and with enterprise grade robustness.

As more and more enterprises adopt Apache Hadoop, it is becoming a unified central storage aka Data Lake for all kinds of enterprise data. Many of these storage use cases are for file storage for classic big data applications, where HDFS is the perfect fit.…

Computers are getting smarter and we are not.

–Tim Berners-Lee, Web Developer

Google, Amazon and Netflix have conditioned us. As consumers, we expect intelligent applications that predict, suggest and anticipate our every move. We want them to sift through the millions of possibilities and suggest just a few that suit our needs. We want applications that take us on a personalized journey through a world of endless possibilities.

These personalized journeys require systems to store and make sense of huge data volumes in an acceptable amount of time.…

Hortonworks and VMware have been working jointly for more than two years. We worked with VMware on the initial launch of Serengeti, on Apache Hadoop High Availability and on projects to do with validating and performance testing the Hortonworks Data Platform (HDP) software on the VMware vSphere platform. One of the results of this activity is that HDP has been a fully certified product on VMware vSphere version 5.1 and later.…

SequenceIQ is a new Hortonworks Technology Partner and recently achieved HDP and YARN Ready certification for Cloudbreak, SequenceIQ’s Hadoop as a Service API. In this guest blog, SequenceIQ Co-founder and CTO Janos Matyas (@sequenceiq) describes provisioning and autoscaling HDP clusters with Cloudbreak.

During our daily work at SequenceIQ, we are provisioning HDP clusters on different environments. Be it for a random cloud provider or on bare metal, we were looking for a common solution to automate and speed up the process.…

Heading to Strata next week? Interested in learning more about Apache Hadoop and how to integrate your existing infrastructure and applications with your Big Data solution? Join our partner presentations at the Hortonworks booth #117, on Thursday, October 16 and Friday, October 17. Many of the sessions will feature demonstrations.

You will hear directly from partners who are embracing 100% open source Apache Hadoop, including: Actian, CISCO, CSC, HP, Informatica, Microsoft, Red Hat, Revolution Analytics, SAP, SAS, and Teradata.…

Today’s guest blog comes from Matt Davies at Splunk, where he is the Director of Marketing.

You can’t really escape the fact that we’re in the age of the customer. From CRM to the “long tail” to multi-channel to social media brand sentiment to Net Promoter Scores – it is all about customer experience. Big Data has an important part to play – no great revelation there, but how do you actually do it?…
