The Hortonworks Blog

Few industries depend as heavily on data as financial services. Insurance companies, retail and investment banks aggregate, price and distribute capital with the aim of increasing their return on assets at an acceptable level of risk.

To do that, financial decision-makers need data. Apache Hadoop helps them store new data sources, then process the larger combined dataset for batch, interactive and real-time analysis. More data and better analysis improve bottom-line results.…

The Journey

Almost two years ago to the day, the Apache Hadoop community voted to make YARN a sub-project of Apache Hadoop; the GA release followed last fall, nearly a year ago.

Since then, it has become plainly obvious that Apache Hadoop 2.x, powered by YARN as its architectural center, is the best platform for workloads such as Apache Hadoop MapReduce, Apache Pig and Apache Hive, which were designed to process data stored in HDFS.…

This week we continue our YARN webinar series with a detailed introduction and developer overview of Apache Tez.  Designed to express fit-to-purpose data processing logic, Tez enables batch and interactive data processing applications spanning terabyte- to petabyte-scale datasets.  Tez offers a customizable execution architecture that allows developers to express complex computations as dataflow graphs, and allows for dynamic performance optimizations based on real information about the data and the resources required to process it.…
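To make the dataflow-graph idea concrete, here is a minimal sketch of assembling a two-vertex DAG with Tez's Java API. Treat it as illustrative only: the method names follow the newer Tez releases (0.5 and later) and may differ slightly in earlier versions, and the processor class names passed in are hypothetical placeholders for real application logic.

```java
import org.apache.tez.dag.api.DAG;
import org.apache.tez.dag.api.Edge;
import org.apache.tez.dag.api.EdgeProperty;
import org.apache.tez.dag.api.EdgeProperty.DataMovementType;
import org.apache.tez.dag.api.EdgeProperty.DataSourceType;
import org.apache.tez.dag.api.EdgeProperty.SchedulingType;
import org.apache.tez.dag.api.InputDescriptor;
import org.apache.tez.dag.api.OutputDescriptor;
import org.apache.tez.dag.api.ProcessorDescriptor;
import org.apache.tez.dag.api.Vertex;
import org.apache.tez.runtime.library.input.OrderedGroupedKVInput;
import org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput;

public class WordCountDagSketch {

  public static DAG buildDag() {
    // Two stages: a tokenizer vertex feeding a summation vertex.
    // "com.example.TokenProcessor" and "com.example.SumProcessor" are
    // hypothetical processor classes standing in for application logic.
    Vertex tokenizer = Vertex.create("Tokenizer",
        ProcessorDescriptor.create("com.example.TokenProcessor"), 4);
    Vertex summation = Vertex.create("Summation",
        ProcessorDescriptor.create("com.example.SumProcessor"), 2);

    // A scatter-gather edge shuffles tokenizer output to the summation stage,
    // using the sorted key-value output/input pair from the Tez runtime library.
    EdgeProperty shuffle = EdgeProperty.create(
        DataMovementType.SCATTER_GATHER,
        DataSourceType.PERSISTED,
        SchedulingType.SEQUENTIAL,
        OutputDescriptor.create(OrderedPartitionedKVOutput.class.getName()),
        InputDescriptor.create(OrderedGroupedKVInput.class.getName()));

    return DAG.create("WordCount")
        .addVertex(tokenizer)
        .addVertex(summation)
        .addEdge(Edge.create(tokenizer, summation, shuffle));
  }
}
```

Because the computation is expressed as a graph rather than a fixed MapReduce pipeline, Tez can adjust parallelism and the movement of intermediate data at runtime, which is where the dynamic optimizations mentioned above come in.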

We are in the midst of a data revolution. Hadoop, powered by Apache Hadoop YARN, enables enterprises to store, process, and innovate around data at a scale never seen before, making security a critical consideration. Enterprises are looking for a comprehensive approach to securing their data so they can realize the full potential of the Hadoop platform unleashed by YARN, the architectural center and data operating system of Hadoop 2.

Hortonworks and the open community continue to work tirelessly to enhance security in Hadoop.…

The world’s top telecommunications firms adopt Hadoop to gain competitive advantage and to respond to technology-driven changes like increases in both network traffic and the telemetry data captured by network sensors.

The majority of North America’s and Europe’s telcos have chosen Hortonworks Data Platform (HDP) to meet these challenges. Read the new Hortonworks white paper for a detailed discussion of twenty-one common telco and cable company use cases.

Download the White Paper

With their Modern Data Architectures based on HDP, these firms improve efficiency and capture opportunities in some of these ways:

  • Analyze call detail records (CDRs).

ScaleOut Software joined the Hortonworks Technology Partner Program and recently achieved Hortonworks Certified status for ScaleOut hServer. ScaleOut Software is a pioneer in in-memory data grid software; ScaleOut hServer installs directly on Hadoop nodes and runs in-memory. In this guest blog, William Bain, Founder and CEO, talks about certification and a use case.

Recently, ScaleOut Software announced technical certification of its ScaleOut hServer® product on Hortonworks Data Platform 2.1.…

This is a guest blog from Protegrity, a Hortonworks certified partner.

As Hadoop takes on a more mission-critical role within the data center, the top IT imperatives of process innovation, operational efficiency, and data security naturally follow. One imperative in particular now tops the requirement list for Hadoop consideration within the enterprise: a well-developed framework to secure data.

The open source community has responded. Work is underway to build out a comprehensive and coordinated security framework for Hadoop that can work well with existing IT security investments.…

Introduction

HDP 2.1 ships with Apache Knox 0.4.0. This release of Apache Knox supports the WebHDFS, WebHCat, Oozie, Hive, and HBase REST APIs.

Hive is a popular component for SQL access to Hadoop, and HiveServer2 with its Thrift interface supports JDBC access over HTTP. The following steps show the configuration needed to enable a JDBC client to talk to HiveServer2 via Knox (Beeline > JDBC over HTTPS > Knox > HTTP > HiveServer2).…
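For readers who want to see where those steps lead, the sketch below shows a JDBC client connecting to HiveServer2 through Knox. It is an illustration rather than the exact configuration from the post: the gateway host, port, truststore path, credentials, and topology name are placeholders, and the URL parameter names differ between Hive releases (older drivers use hive.server2.transport.mode and hive.server2.thrift.http.path instead of transportMode and httpPath).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class KnoxHiveJdbcExample {
  public static void main(String[] args) throws Exception {
    // Older Hive JDBC drivers may need to be loaded explicitly.
    Class.forName("org.apache.hive.jdbc.HiveDriver");

    // Placeholder gateway host, truststore, and topology name ("default").
    // The URL routes JDBC-over-HTTPS traffic to the Knox gateway, which
    // forwards it to HiveServer2 over HTTP.
    String url = "jdbc:hive2://knox.example.com:8443/;ssl=true;"
        + "sslTrustStore=/etc/knox/gateway.jks;trustStorePassword=changeit;"
        + "transportMode=http;httpPath=gateway/default/hive";

    try (Connection conn = DriverManager.getConnection(url, "guest", "guest-password");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
      while (rs.next()) {
        System.out.println(rs.getString(1));
      }
    }
  }
}
```

The same connection URL can be pasted into Beeline with its !connect command, since Beeline is essentially a JDBC client wrapped in a shell.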

This is a guest blog from Voltage Security, a Hortonworks partner.

Data Security for Hadoop is a critical requirement for adoption within the enterprise. Organizations must protect sensitive customer, partner and internal information and adhere to an ever-increasing set of compliance requirements. The security challenges these organizations are facing are diverse and the technology is evolving rapidly to keep pace. 

An Open Community For Platform Security

The open source community, including Hortonworks, has invested heavily in building enterprise grade security for Apache Hadoop. …

In May, Hortonworks acquired XA Secure and promised to contribute the technology to the Apache Software Foundation.  In June, we made it available for all to download and use from our website, and today we are proud to announce that this technology officially lives on as Apache Argus, an incubator project within the ASF.

The podling has been formed, and the process of graduating Argus to a top-level project (TLP) has begun.…

This is a guest post from Hortonworks partner Dataguise, an HDP 2.1 certified technology partner providing sensitive data discovery, protection and reporting in Hadoop.

According to the 2013 Global Data Breach study by the Ponemon Institute, the average cost of data loss exceeds $5.4 million per breach, and the average cost of lost data in the United States approaches $200 per record. No industry is spared from this threat, and all of our data systems, including Hadoop, need to address the security concern.…

A transformation is occurring in the data center.  Enterprises are turning to a modern data architecture in order to derive maximum value from both big and small data across their organizations.  They are building new analytic apps that unlock opportunity and allow them to maintain or create a competitive edge. Apache Hadoop is at the center of this architecture and integrates with the technologies that run your business to augment and extend this new value.…

Hortonworks software engineers Vinod Kumar Vavilapalli and Jian He, both Apache Hadoop YARN committers, discuss Apache Hadoop YARN's ResourceManager restart resiliency in this blog. This is their third post in our series on the motivations and architecture behind improvements to YARN's ResourceManager (RM) resiliency. Others in the series are:

  • Introduction

  • Phase II – Preserving work-in-progress of running applications

ResourceManager restart is a critical feature that allows YARN applications to continue functioning even when the ResourceManager (RM) crash-reboots for various reasons.…
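As background, RM recovery is switched on through yarn-site.xml. The snippet below is a minimal sketch assuming a ZooKeeper-backed state store; the ZooKeeper quorum address is a placeholder, and property names can vary slightly across Hadoop 2.x releases.

```xml
<!-- Enable ResourceManager recovery after a restart -->
<property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>

<!-- Persist application and attempt state in ZooKeeper so it survives an RM crash -->
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>

<!-- Placeholder ZooKeeper quorum used by the state store -->
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
```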

HP and Hortonworks recently announced a strategic partnership that included a $50 million equity investment by HP. While the investment is important, there is an equally important joint commitment to help accelerate the adoption of Enterprise Apache Hadoop by deeply integrating the Hortonworks Data Platform (HDP) with the HP HAVEn big data platform.

Below are some thoughts on our joint work from the HP OMi Team…

The first area of joint engineering between our companies will be to integrate Apache Ambari, which provides the tools and APIs to provision, manage and monitor Hadoop clusters, with HP Operations Manager i (OMi).  …

“Data is to information society what fuel was to the industrial economy: the critical resource powering the innovations that people rely on,” write Viktor Mayer-Schönberger and Kenneth Cukier in Big Data. Today, big data fuels and engenders innovation of new products and services, according to Forrester.

Just as countries’ fuel repositories need protection and security because they can come under attack, so do companies’ big data repositories. “Companies, markets, and countries are increasingly under attack from cyber-criminals.…
