
On Demand Training

Self-Paced Learning Library

Sales, Architecture, Developer and Administrator Training
Online, self-paced training and accreditations are available through the Learning Library at no cost. The courses cover Sales, Architecture, Development and Administration. For access directions, please visit the Partner Portal, open the Get Started tab, and search for the Self-Paced Learning Library instructions. The sales and architecture courses help prepare teams for sales and architecture accreditation, a requirement for Hortonworks partners.

On Demand Library of Technical Workshops - Table of Contents

These two-hour technical sessions are content-rich online webinars designed to broaden your Hadoop knowledge and build your HDP and HDF skills. Click on a workshop below to get the details, including the recording, slides and labs.

Ambari Operations: Technical Workshop, Recorded Thursday May 28, 2015
Recording  Slides  Labs

Ambari continues on its journey of provisioning, monitoring and managing enterprise Hadoop deployments. With 2.0, Apache Ambari brings a host of new capabilities, including updated metric collection, Kerberos setup automation and developer views for Big Data developers. The session will provide an in-depth look into Apache Ambari 2.0 and showcase security setup automation using Ambari 2.0.

Ambari Stacks, Views and Blueprints: Technical Workshop
Recording  Slides  Labs

Apache Ambari is the only 100% open source management and provisioning tool for Apache Hadoop. Recent innovations in Apache Ambari have focused on opening it up as a pluggable management platform that can automate cluster provisioning, deploy third-party software and provide custom operational and developer views to the end user. In this session we will cover three key integration points of Apache Ambari (Stacks, Views and Blueprints) and deliver working examples of each.

Cloud considerations using Cloudbreak: Technical Workshop
Recording  Slides

As Hadoop becomes the de facto big data platform, enterprises deploy HDP across a wide range of physical and virtual environments spanning private and public clouds. This session will cover key considerations for cloud deployment and showcase Cloudbreak for simple and consistent deployment across the cloud providers of your choice.

In Memory Processing with Apache Spark: Technical Workshop
Recording  Slides  Additional info on Spark  Labs

Apache Spark offers unique in-memory capabilities and is well suited to a wide variety of data processing workloads, including machine learning and micro-batch processing. With HDP 2.2, Apache Spark is a fully supported component of the Hortonworks Data Platform. In this session we will cover the key fundamentals of Apache Spark and operational best practices for executing Spark jobs alongside other Big Data workloads. We will also provide a working example to showcase micro-batch and machine learning processing using Apache Spark.

Build YARN Ready Applications with Apache Slider: Technical Workshop
Recording  Slides

YARN has fundamentally transformed the Hadoop landscape. It has opened Hadoop from a single-workload system to one that can now support a multitude of fit-for-purpose processing workloads. In this workshop we will provide an overview of Apache Slider, which enables custom applications to run natively in the cluster as YARN Ready Applications. The workshop will include working examples and provide an overview of work being pursued in the community around YARN Docker integration.

HDF: Hortonworks DataFlow: Technical Workshop, Recorded Nov. 19, 2015

Lab 1: Twitter dashboard using NiFi and Solr   Lab 2: NiFi Expression Language and building a custom NiFi processor

Learn how Hortonworks DataFlow (HDF), powered by Apache NiFi, enables organizations to harness IoAT data streams to drive business and operational insights. We will use the session to provide an overview of HDF, including a detailed hands-on lab to build HDF pipelines for the capture and analysis of streaming data.

Operational Best Practices: Technical Workshop
Recording  Slides  Labs

Hortonworks Data Platform is a key component of the Modern Data Architecture. Organizations rely on HDP for mission-critical business functions and expect the system to be constantly available and performant. In this session we will cover the operational best practices for administering the Hortonworks Data Platform, including the initial setup and ongoing maintenance.

HDP Search: Technical Workshop
Recording   Slides   Document Crawler Lab   HBase Indexing Lab

The enterprise data lake has become the de facto repository of both structured and unstructured data within an enterprise. Being able to discover information across both structured and unstructured data using search is a key capability of an enterprise data lake. In this workshop, we will provide an in-depth overview of HDP Search with a focus on configuration, sizing and tuning. We will also deliver a working example to showcase the use of HDP Search along with the rest of the platform's capabilities to deliver a real-world solution.

HDP 2.3: What's New in HDP 2.3 Technical Workshop, Recorded Thursday Aug. 28, 2015
Recording   Slides

The recently launched HDP 2.3 is a major advancement of Open Enterprise Hadoop. It represents the best of community-led development, with innovations spanning Apache Hadoop, Apache Ambari, Ranger, HBase, Spark and Storm. In this session we will provide an in-depth overview of the new functionality and discuss its impact on new and ongoing big data initiatives.

HBase for Mission Critical Applications
Recording   Slides   Time-Series Labs   HBase Indexing Lab   Phoenix - Spark Labs

HBase adoption continues to explode amid rapid customer success and unbridled innovation. HBase, with its limitless scalability, high reliability and deep integration with Hadoop ecosystem tools, offers enterprise developers a rich platform on which to build their next-generation applications. In this workshop we will explore HBase SQL capabilities, deep Hadoop ecosystem integrations, and deployment and management best practices.

Machine Learning with Hadoop: Technical Workshop
Recording   Slides   Labs

It is almost impossible to escape the topic of Data Science. While the core of Data Science has remained the same over the last decade, its emergence to the forefront is spurred by both the availability of new data types and a true realization of the value that it delivers. In this session, we will provide an overview of data science and the different classes of machine learning algorithms, and deliver an end-to-end demonstration of performing machine learning using Hadoop. Audience: Developers, Data Scientists, Architects and System Engineers.

Machine Learning using Spark: Technical Workshop
Recording   Slides

Predictive analysis is a key use case of Big Data. Today, data-driven organizations use advanced machine learning algorithms to understand and improve their business operations. In this session we will provide an overview of Spark's machine learning capabilities and use Apache Zeppelin's web-based notebook for interactive data science analysis. Audience: Developers, Data Scientists, Architects and System Engineers.

Spark: Deep Learning with Hortonworks and Apache Spark: Technical Workshop
Recording Slides

Rich media is exploding all around us. From personal usage to retailers monitoring store traffic to optimize associate placement, the applications of rich media are wide and growing. Despite this pervasive usage, enterprises have had a limited choice of generally available tools for analyzing rich media. In this session we will look into using deep learning algorithms for rich media analysis and provide a practical hands-on example of image recognition using Apache Hadoop and Spark.

Interactive Query with Apache Hive: Technical Workshop
Recording   Slides  Labs

Apache Hive is the de facto standard for SQL queries over petabytes of data in Hadoop. It is a comprehensive and compliant engine that offers the broadest range of SQL semantics for Hadoop, providing a powerful set of tools for analysts and developers to access Hadoop data. The session will cover the latest advancements in Hive and provide practical tips for maximizing Hive performance. Audience: Developers, Architects and System Engineers from the Hortonworks Technology Partner community.

Real Time Monitoring with Hadoop: Technical Workshop
Recording   Slides   Labs

Real-time monitoring requires a highly scalable infrastructure comprising a message bus, a database, distributed event processing and a scalable analytics engine. By bringing together the leading open source projects Apache Kafka, Apache HBase, Apache Storm and Apache Hive, the Hortonworks Data Platform offers a comprehensive real-time analysis platform. In this session, we will provide an in-depth overview of all the key technology components and demonstrate a working solution for monitoring a fleet of trucks. Audience: Developers, Architects and System Engineers from the Hortonworks Technology Partner community.

SQL on HBase with Apache Phoenix: Technical Workshop
Recording    Slides

HBase is a leading NoSQL database. Tightly integrated with the Hadoop ecosystem, it offers random, real-time read/write capabilities on billions of rows and millions of columns. Apache Phoenix offers a SQL interface to HBase, opening HBase to a large community of SQL developers and enabling interoperability with SQL-compliant applications. The session will cover the essentials of HBase and provide in-depth insight into Apache Phoenix. Audience: Developers, Architects and System Engineers from the Hortonworks Technology Partner community.

SQL on Hadoop: Technical Workshop (Recorded Sept 24, 2015)

SQL continues to be the most widely used language for big data analysis. It is no surprise that the SQL on Hadoop ecosystem is vibrant and robust, with many commercial and open source alternatives in the market. It is also an area of active innovation, with different tools optimized for varying use cases. In this session we will look into the SQL alternatives in HDP: Hive, Spark SQL and Apache Phoenix. We will analyze the key strengths and weaknesses of each option and provide guidance on the optimal usage of each tool.

Streamline Hadoop Development: Technical Workshop (Recorded Sept. 10, 2015)
Recording     Mini Clusters Lab

As developers, we like to build and test applications in our favorite IDE. We prefer to debug applications using breakpoints and to trace through our code. This development paradigm is disrupted in a distributed compute environment, where execution is spread across multiple hosts and JVMs. In this session we look into using Hadoop Mini Cluster to efficiently develop and test applications in a local environment prior to distributed deployment. We will walk through a real-life example of using Hadoop Mini Cluster to develop a streaming application using Storm, Kafka and HBase. Audience: Developers, Data Scientists, Architects and System Engineers.

Securing the Hadoop Data Lake: Technical Workshop (Recorded Nov. 6, 2014)
Presentation Recording     Lab Recording     Labs

As Hadoop becomes a critical part of enterprise data infrastructure, securing Hadoop has become critically important. Enterprises want assurance that all their data is protected and that only authorized users have access to the relevant bits of information. In this session we will cover all aspects of Hadoop security, including authentication, authorization, audit and data protection. We will also provide a demonstration and detailed instructions for implementing comprehensive Hadoop security.