The Hortonworks Blog

Posts categorized by: HDP

In this blog, Paul Phillips, EMEA Sales Director at Hortonworks, discusses the importance of extending big data science courses to PhD students and scientists. This joint venture with KPMG provides an opportunity to “bring excellent basic skills that are useful in data science and this programme aims to commercialize these skills and ease the path to a data science profession.”

At Hortonworks, we encourage our team members to innovate, and as the Open Source community grows, it is also vital that we play our part to ensure the community is continually reinvigorated with new ideas and innovation. …

According to the New York Observer, there were a couple of major social reasons that spurred the genesis and growth of Meetup.com. The first was Robert Putnam’s book Bowling Alone, in which he talks about the collapse of communities in America. The second was an event that changed not only the world but also New York: the aftermath of September 11, when strangers cared about greeting, meeting, and talking.…

MongoDB is an open-source NoSQL database, used by companies of all sizes, across all industries and for a wide variety of applications. MongoDB – the company – is a Hortonworks Certified Technology Partner.

Sheena Badani, Director of Business Development at MongoDB, talks about the value of obtaining HDP 2.1 certification.

MongoDB is thrilled to announce the certification of the MongoDB Hadoop Connector on Hortonworks’ latest release, HDP 2.1. Customers now have validation from both MongoDB, Inc.…

On May 15, Owen O’Malley and Carter Shanklin hosted the second of our seven Discover HDP 2.1 webinars. Owen and Carter discussed the Stinger Initiative and the improvements to Apache Hive that are included in HDP 2.1:

  • Faster queries with Hive on Tez, vectorized query execution and a cost-based optimizer
  • New SQL semantics and datatypes
  • SQL-standard authorization
  • The Hive job visualizer in Apache Ambari
  • And many more

Here is the complete recording of the webinar.…

Today we’re delighted to announce our acquisition of XA Secure to provide comprehensive security capabilities for Enterprise Hadoop. Please join us in welcoming XA Secure to the Hortonworks family.

Register for the Webinar

Hortonworks Data Platform has seen phenomenal adoption across an ever-growing number of organizations. As part of that adoption, and thanks to Apache Hadoop YARN, businesses are moving from single-purpose Hadoop clusters to a versatile, integrated data platform hosting multiple business applications – combining data sets with diverse processing needs in one place.…

Last week Vinay Shukla and Kevin Minder hosted the first of our seven Discover HDP 2.1 webinars. Vinay and Kevin covered three important topics related to new Apache Hadoop security features in HDP 2.1:

  • REST API security with Apache Knox Gateway
  • HDFS security with Access Control Lists (ACLs)
  • SQL security and next-generation Hive authorization

Here is the complete recording of the webinar.

Here are the presentation slides: http://www.slideshare.net/hortonworks/discoverhdp21security

Attend our next Discover HDP 2.1 webinar tomorrow, Thursday, May 15 at 10am Pacific Time: Interactive SQL Query in Hadoop with Apache Hive

We’re grateful to the many participants who joined and asked excellent questions.…

RainStor is a Hortonworks Certified Technology Partner and provides an efficient database that reduces the cost, complexity and compliance risk of managing enterprise data. RainStor’s patented technology enables customers to cut infrastructure costs and scales anywhere: on-premises, in the cloud, and natively on Hadoop. RainStor’s customers include 20 of the world’s largest communications providers and 10 of the biggest banks and financial services organizations.

RainStor’s Mark Cusack, Chief Architect, writes about the benefits of certification on HDP 2.1.…

There’s no denying that the information collected by Big Data architectures such as Hortonworks Data Platform (HDP) is revolutionizing how enterprises view and understand their business. The data contains deep insights into many aspects of the business such as sales, customer trends and buying patterns.

The problem has been not only how to extract those insights from the data but also how to get them quickly and easily into the hands of the people who need them the most. …

Fino Consulting is a new Consulting and Systems Integration Partner of Hortonworks serving Fortune 1000 companies with winning business solutions through data science. Fino is an early mover in cloud computing, challenging clients to “Re-think what they know about cloud-computing” to build high-performance sustainable applications and stretch the boundaries of enterprise data. Fino uses HDInsight from Microsoft for client solutions because of its versatile, cloud-based data platform that manages data of any type, while leveraging all the features and functionality of Microsoft’s resources.…

Hadoop 2 and its YARN-based architecture have increased interest in new engines that run on Hadoop. One such workload is in-memory computing for machine learning and data science use cases. Apache Spark has emerged as an attractive option for this type of processing, and today we announce availability of our HDP 2.1 Tech Preview Component of Apache Spark. This is a key addition to the platform and brings another workload supported by YARN on HDP.…

The term BoF session was first used at the Digital Equipment Computer Users’ Society (DECUS) conference in the 1960s. Its essence was to bring together like minds and thought leaders, just as birds of a feather flock together, to share and exchange computing ideas in an informal yet spirited way. Since then, the organizers and sponsors of most computing conferences have been loyal to its essence and spirit.

For ideas and innovation happen in collaboration—not in isolation. …

This is the second in our series on the motivations and architecture for improvements to Apache Hadoop YARN’s Resource Manager Restart resiliency. Others in the series are:

Introduction: Phase I – Preserve Application-queues

In the introductory blog, we previewed what RM Restart Phase I entails. In essence, we preserve the application-queue state into a persistent store and reread it upon RM restart, eliminating the need for users to resubmit their applications.…

Hortonworks Data Platform 2.1 for Windows is the 100% open source data management platform based on Apache Hadoop and available for the Microsoft Windows Server platform. I have built a helper tool that automates the process of deploying a multi-node Hadoop cluster – utilizing the MSI available in HDP 2.1 for Windows.

Download HDP 2.1 for Windows

HDP on Windows MSI Overview

The HDP on Windows installation package comes in MSI format. Microsoft’s MSI format uses Windows Installer, the installation and configuration service provided with Windows.…

The Apache Knox Gateway team is pleased to announce Knox’s first release as an Apache top-level project: Apache Knox Gateway 0.4.0. The team resolved approximately 100 JIRAs for this release and Knox Gateway is now better positioned to provide complete security for REST API access to a Hadoop cluster.

The new features in Knox Gateway 0.4.0 are the features that enterprise security officers expect in a gateway solution:

  • Perimeter security for a Hadoop cluster
  • Support for enterprise group lookup
  • Audit log of all gateway activity
  • Command line tooling for CMF provisioning
  • Protection for web application vulnerabilities
  • Pre-authentication via SSO token
  • And many more…

As a top-level project, Apache Knox Gateway is fully endorsed by the Apache Software Foundation, and this improves coordination between development of Knox and the other core Hadoop projects with which it interacts.…

Three weeks ago, we announced availability of the technical preview of Hortonworks Data Platform (HDP) version 2.1, and since then we have had thousands of downloads of this preview. We also promised delivery of GA bits on April 22nd, and we are delighted to deliver as stated. HDP 2.1, which includes countless new features across seven new components, is available today from our download page.

YARN unlocks the Data Lake

YARN, the resource management layer of Hadoop 2, is delivering value as it has unlocked the data lake vision for many.…
