The Hortonworks Blog

In February 2014, the Apache Storm community released Storm version 0.9.1. Storm is a distributed, fault-tolerant, and high-performance real-time computation system that provides strong guarantees on the processing of data. Hortonworks is already supporting customers using this important project today.

Many organizations have already used Storm, including our partner Yahoo! This version of Apache Storm (version 0.9.1) is:

  • Highly scalable. Like Hadoop, Storm scales linearly
  • Fault-tolerant. Automatically reassigns tasks if a node fails
  • Reliable. 

LDAP provides a central source for maintaining users and groups within an enterprise. There are two ways to use LDAP groups within Hadoop. The first is to use OS-level configuration to read LDAP groups. The second is to explicitly configure Hadoop to use LDAP-based group mapping.

Here is an overview of steps to configure Hadoop explicitly to use groups stored in LDAP.

  • Create Hadoop service accounts in LDAP
  • Shut down the HDFS NameNode & YARN ResourceManager
  • Modify core-site.xml to point to LDAP for group mapping
  • Restart the HDFS NameNode & YARN ResourceManager
  • Verify LDAP-based group mapping

Prerequisites: Access to LDAP and the connection details are available.…
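
As a rough illustration of the core-site.xml step above, the following Python sketch generates the group-mapping properties. The property names are the ones Hadoop's LdapGroupsMapping provider reads. The LDAP URL, bind user, password file path, and search base are hypothetical placeholders rather than values from this post, and the helper name render_core_site_properties is ours.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Property names are real Hadoop settings; every value except the
# mapping class is a hypothetical placeholder for your own LDAP details.
LDAP_GROUP_MAPPING = {
    "hadoop.security.group.mapping":
        "org.apache.hadoop.security.LdapGroupsMapping",
    "hadoop.security.group.mapping.ldap.url":
        "ldap://ldap.example.com:389",
    "hadoop.security.group.mapping.ldap.bind.user":
        "cn=hadoop,ou=services,dc=example,dc=com",
    "hadoop.security.group.mapping.ldap.bind.password.file":
        "/etc/hadoop/conf/ldap-bind.password",
    "hadoop.security.group.mapping.ldap.base":
        "dc=example,dc=com",
}


def render_core_site_properties(settings):
    """Return a <configuration> document with one <property> per setting."""
    configuration = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(configuration, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    raw = ET.tostring(configuration, encoding="unicode")
    # Pretty-print so the result can be pasted into core-site.xml.
    return minidom.parseString(raw).toprettyxml(indent="  ")


if __name__ == "__main__":
    print(render_core_site_properties(LDAP_GROUP_MAPPING))
```

The generated property elements would be merged into the existing core-site.xml before the NameNode and ResourceManager are restarted. For the verification step, the hdfs groups command followed by a username prints the groups Hadoop resolves for that user.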

Hortonworks would like to congratulate Leslie Lamport on winning the 2013 Turing Award given by the Association for Computing Machinery. This award is essentially the equivalent of the Nobel Prize for computer science. Among Lamport’s many and varied contributions to the field of computer science are TLA (Temporal Logic of Actions), LaTeX, and Paxos.

The latter of these, the Paxos three-phase consensus protocol, inspires the ZooKeeper coordination service, and powers HBase and highly available HDFS.…

We’ve said many times that in order for Apache Hadoop to succeed as the next generation data platform and power the Modern Data Architecture there must be a thriving ecosystem of enterprise vendors around it. It’s been core to our strategy from day one to foster these ecosystem vendors and work collaboratively to make them successful.

Download our Whitepaper: Hadoop and a Modern Data Architecture.

Our strategy of delivering enterprise Hadoop as 100% open source has resulted in close alignment and tight partnerships across a broad Hadoop ecosystem of vendors large and small, including major data management leaders like SAP and Teradata.…

If there’s one thing my interactions with our customers have taught me, it’s that Apache Hadoop didn’t disrupt the datacenter; the data did. The explosion of new types of data in recent years has put tremendous pressure on the datacenter, both technically and financially, and an architectural shift is underway in which Enterprise Hadoop plays a key role in the resulting modern data architecture.

Because so many Apache Software Foundation projects have emerged in and around the Apache Hadoop project in recent years, a common question I get from mainstream enterprises is: What is the definition of Hadoop?

This question goes beyond the Apache Hadoop project itself, since most folks know that it’s an open source technology born out of the experience of web-scale consumer companies such as Yahoo!, Facebook and others who were confronted with the need to store and process massive quantities of data.…

This blog post originally appeared here and is reproduced in its entirety here. Part 1 can be found here.

The HBase BlockCache is an important structure for enabling low-latency reads. As of HBase 0.96.0, there are no fewer than three different BlockCache implementations to choose from. But how do you know when to use one over the other? There’s a little bit of guidance floating around out there, but nothing concrete.…

This is the seventh in our series on modern data architectures across industry verticals. Others in the series are:

Any financial services business cares about minimizing risk and maximizing opportunity. Banks weigh the risk of opening accounts versus the opportunity to hold deposits.…

Luminar is one of Hortonworks’ original customers. Apache Hadoop is a pillar of their modern data architecture, and since choosing Hortonworks in 2012, the Luminar team has become expert users of Hortonworks Data Platform version 1.

They were eager to migrate to HDP2 after it launched in October 2013.

I recently spoke with Juan Manuel Alonso, Luminar’s Manager of Insights. Juan Manuel worked with the Hortonworks professional services team to plan and execute the migration from HDP1 to HDP2.…

We love to hear examples from the ecosystem of how organizations are benefiting from Hadoop. Today, Hortonworks partner Microsoft posted a detailed case study on how one of their partners, Ascribe, is using Microsoft’s HDInsight Service, their cloud-based 100% Apache Hadoop service, to transform healthcare in the UK.

Ascribe is a UK-based company focused on solutions for the healthcare industry and was an early adopter of HDInsight, which is built using the Hortonworks Data Platform.…

The Apache Tez community has voted to release version 0.3 of the software.

Apache™ Tez is a replacement for MapReduce that provides a powerful framework for executing a complex topology of tasks. Tez 0.3.0 is an important step toward making the software ready for wider adoption, focusing on fundamentals and ironing out several key functions. The major action areas in this release were:

  • Security. Apache Tez now works on secure Hadoop 2.x clusters using the built-in security mechanisms of the Hadoop ecosystem.…

The Apache Software Foundation (ASF) provides valuable stewardship and guide-rails for projects interested in attracting the broadest possible community of involvement, especially across a wide range of vendors and end users. While the ASF’s role is not about guaranteeing wild success for every project, they do a great job of providing a place where the broadest community of people, ideas, and code can come together and raise an elephant, so to speak.…

This is the sixth in our series on modern data architectures across industry verticals. Others in the series are:

The United States is enjoying resurgent fossil fuel production. In fact, the International Energy Agency estimates that by 2016, the U.S. will surpass Saudi Arabia and Russia to become the world’s largest oil producer.…

We are delighted to host this guest blog from John Schitka at SAP.

Join us on March 12 to learn how SAP HANA and Hortonworks Data Platform combine to help you achieve Instant Insight and Infinite Scale – Register Here

Big Data is changing our world – enabling previously impossible insights and transforming the way we do business, work with others, and live our lives. To be competitive you need to leverage Big Data and the business value it brings.…

Compuware is a Hortonworks Technology Partner and this week announced the availability of the newest release of APM for Big Data. This release provides enhanced support for Hadoop 2.0 and Hortonworks Data Platform (HDP) 2.0.

Compuware’s APM for Big Data now provides greater visibility into Hadoop job details with out-of-the-box dashboards that require no configuration. The graphical dashboards expand insight and make Hadoop deployments easier to analyze. With the Hadoop-focused dashboards, customers can get information about any Hadoop cluster and summarized overviews of cluster utilization across users, jobs, pools, queues and more.…
