The Hortonworks Blog

Posts categorized by: HDP 2

Just yesterday, we talked about our roadmap for Security in Enterprise Hadoop. On our Security labs page, you can see the security roadmap, the efforts underway across Hadoop, and their timelines in one place.

Security is often described as rings of defense. Continuing this analogy, the Apache community has been working to create a perimeter security solution for Hadoop. This effort is Apache Knox Gateway (Apache Knox), and we are happy to announce the Technical Preview of Apache Knox.…

Security is a top agenda item and represents critical requirements for Hadoop projects. Over the years, Hadoop has evolved to address key concerns regarding authentication, authorization, accounting, and data protection natively within a cluster and there are many secure Hadoop clusters in production. Hadoop is being used securely and successfully today in sensitive financial services applications, private healthcare initiatives and in a range of other security-sensitive environments. As enterprise adoption of Hadoop grows, so do the security concerns and a roadmap to embrace and incorporate these enterprise security features has emerged.…

The Apache Tez team is proud to announce the first release of Apache Tez – version 0.2.0-incubating.

Apache Tez is an application framework which allows for a complex directed-acyclic-graph of tasks for processing data and is built atop Apache Hadoop YARN. You can learn much more from our Tez blog series tracked here.

Since Tez entered the Apache Incubator in late February of 2013, over 400 tickets have been resolved, culminating in this significant release.…

We are very excited to announce that Apache Ambari has graduated from the Apache Incubator and is now an Apache Top Level Project! Hortonworks introduced Ambari as an Apache Incubator project back in August 2011 with the vision of making Hadoop cluster management dead simple. In a little over two years, the development community grew significantly, from a small team at Hortonworks to a large number of contributors from various organizations beyond Hortonworks; upon graduation, there were more than 60 contributors, 37 of whom had become committers.…

We believe the fastest path to innovation is the open community, and we work hard to help deliver this innovation from the community to the enterprise. However, this is a two-way street. We are also hearing very distinct requirements from enterprises as they integrate Hadoop into their data architecture.

Take a look at the Falcon Technical Preview and the Data Management Labs.

Open Source, Open Community & An Open Roadmap for Dataset Management

Over the past year, a set of enterprise requirements has emerged for dataset management.  …

In just a few years, interest in Hadoop has enjoyed a meteoric rise. It is everywhere… and it should be available everywhere.

Here at Hortonworks we have worked to provide the widest range of deployment options for Hadoop… from on-premises to the cloud, Linux and Windows, and from commodity server clusters to high-end appliances. Deployment options are a key factor in the adoption of Hadoop.

Today, we add Ubuntu to the list of options we support for HDP 2.0.…

We have heard plenty in the news lately about healthcare challenges and the difficult choices faced by hospital administrators, technology and pharmaceutical providers, researchers, and clinicians. At the same time, consumers are experiencing increased costs without a corresponding increase in health security or in the reliability of clinical outcomes.

One key obstacle in the healthcare market is data liquidity (for patients, practitioners and payers) and some are using Apache Hadoop to overcome this challenge, as part of a modern data architecture.…

User logs of Hadoop jobs serve multiple purposes. First and foremost, they can be used to debug issues while running a MapReduce application: correctness problems with the application itself, race conditions when running on a cluster, and task/job failures due to hardware or platform bugs. Secondly, one can do historical analyses of the logs to see how individual tasks in a job or workflow perform over time. One can even analyze the Hadoop MapReduce user-logs using Hadoop MapReduce(!) to determine any performance issues.…

This is the second of two posts examining the use of Hive for interaction with HBase tables. This is a hands-on exploration so the first post isn’t required reading for consuming this one. Still, it might be good context.

“Nick!” you exclaim, “that first post had too many words and I don’t care about JIRA tickets. Show me how I use this thing!”

This post is exactly that: a concrete, end-to-end example of consuming HBase over Hive.…

Join Hortonworks and Pactera for a Webinar on Unlocking Big Data’s Potential in Financial Services Thursday, November 21st at 12:00 EST.

Have you ever had your debit or credit card declined for seemingly no reason? Turns out, the rejections are not so random. Banks are increasingly turning to analytics to predict and prevent fraud in real time. That can sometimes be an inconvenience for customers who are traveling or making large purchases, but it’s a necessary inconvenience today in order for banks to reduce billions in losses due to fraud.…

This post is authored by Omkar Vinit Joshi with Vinod Kumar Vavilapalli and is the ninth post in the multi-part blog series on Apache Hadoop YARN – a general-purpose, distributed, application management framework that supersedes the classic Apache Hadoop MapReduce framework for processing data in Hadoop clusters. Other posts in this series:

Introduction

In the previous post, we explained the basic concepts of LocalResources and resource localization in YARN.…

When I first started to understand what YARN is, I wanted to build an application to understand its core. There was already a great example YARN application called Distributed Shell that I could use as the shell (pun intended) for my experiment. Now I just needed an existing application that could provide massive reuse value by other applications. I looked around and I decided on MemcacheD.

This brief guide shows how to get MemcacheD up and running on YARN – MOYA if you will…

Prerequisites

You’re going to need a few things to get the sample application operational.…

We had a lot of fun in NYC and hope you did too. Thanks to the hundreds of you who dropped by the booth, attended dinners, parties, meetups and sessions.

As we have known for some time, Hortonworks customers are already building a modern data architecture with Hadoop as the technology of choice for handling the data they have streaming in from all directions. They care that it matches their needs, integrates with their existing infrastructure and solves real problems with flexibility.…

We’re delighted to announce that our Hadoop 2.0 Courseware is available now!

According to a 2013 Education Services Bench Mark Study conducted by The Technology Services Industry Association, “the lag time between product and Instructor Lead Content release is 68 business days, or more than three months” but not at Hortonworks! All of our Developer Courses and Certifications are now based on Apache Hadoop 2.0 and available at the same time as the Hortonworks Data Platform 2.0.…

With the attention of the Hadoop community on Strata/Hadoop World in New York this week, it seems an appropriate time to give everyone an early update on continued community development of Apache Hive. This progress well and truly cements Hive as the standard open-source SQL solution for the Apache Hadoop ecosystem, not just for extremely large-scale batch queries but also for low-latency, human-interactive queries.

You can catch me at our session ‘Apache Hive & Stinger: Petabyte Scale SQL, IN Hadoop’ along with Owen and Alan where we’ll be happy to dive into more of the details.…
