Get Started with Hortonworks Sandbox

Hortonworks Sandbox is a personal, portable environment for Apache Hadoop® and its ecosystem. It comes with dozens of interactive tutorials and the most exciting developments from the Apache community. Get up and running in 15 minutes!

Download Sandbox

Overview

If you are new to the Hortonworks Sandbox and to using Apache open source tools to build modern data applications, we suggest starting with the following tutorials.

Sandbox Basics

Getting Started with HDP®

Begin your Apache Hadoop journey with this tutorial, aimed at users with limited experience in using the Sandbox.
Explore the Sandbox on virtual machine and cloud environments and learn to navigate the Apache Ambari user interface.

The tutorial includes a section describing the key concepts, followed by a series of tutorials in which you move data into HDFS, explore the data with SQL in Apache Hive, transform it with Apache Pig or Apache Spark, and finally generate a report with Apache Zeppelin.

Getting Started with HDP

Hands-on Tour of Apache Spark in 5 minutes

This tutorial provides a quick introduction to Spark by creating an RDD for Wikipedia data inside an Apache Zeppelin notebook.

After you have gone through this tutorial, you can find additional Spark tutorials here:

Apache Spark in 5 minutes
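
For orientation before opening the notebook, the kind of computation an RDD pipeline expresses can be sketched in plain Python, without Spark installed. This is not Spark code and not the tutorial's own example: the sample lines, the function names and the word-count task are all assumptions chosen for illustration. The generator stands in for a lazy transformation (similar in spirit to a flat-map step), and the counter stands in for a per-key aggregation.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

# Hypothetical sample lines standing in for a Wikipedia text dataset.
SAMPLE_LINES = [
    "Apache Spark is a cluster computing engine",
    "Apache Hadoop stores data in HDFS",
    "Spark can read data from HDFS",
]


def split_words(lines: Iterable[str]) -> Iterator[str]:
    """Lazily yield lowercase words, one line at a time."""
    for line in lines:
        for word in line.lower().split():
            yield word


def count_words(words: Iterable[str]) -> Counter:
    """Aggregate the number of occurrences of each word."""
    return Counter(words)


def top_words(lines: Iterable[str], n: int) -> List[Tuple[str, int]]:
    """Return the n most frequent words, ties broken alphabetically."""
    counts = count_words(split_words(lines))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


if __name__ == "__main__":
    for word, count in top_words(SAMPLE_LINES, 3):
        print(word, count)
```

In a Spark notebook the same shape of pipeline would run distributed across a cluster rather than in a single Python process; the sketch only shows the split, count and rank steps.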

IoT Realtime Event Processing

Apache Hadoop is often used to process unstructured data, new data types or large volumes of data at rest. However, you can also process data in motion, and this tutorial introduces tools such as Apache NiFi, Apache Kafka, Apache Storm and Apache HBase.

IoT Realtime Event Processing

Learning the ropes of Apache NiFi

NiFi provides the data acquisition, simple event processing, transport and delivery mechanism designed to accommodate the diverse dataflows generated by a world of connected people, systems, and things. In this tutorial you will be introduced to how Apache NiFi connects and conducts streaming transportation data.

Apache NiFi

Try More Tutorials

You can find additional tutorials here:

WHAT'S NEW IN HORTONWORKS DATA PLATFORM 2.6

Innovation & Performance

  • Access to Latest Data Science Functionality. Extensive support for machine learning algorithms available in Spark 2.1, Spark 1.6.3, Zeppelin 0.7 and the Livy REST API
  • Hive LLAP for Production. Gain 10x faster join performance with dynamic runtime filtering
  • ACID Compliance. Greatly speed up and enable micro-batch/streaming changes to the Hive data warehouse through incremental updates
  • Sub-second Query Performance for BI tools. Customers no longer need to replicate data in Hadoop by first storing it in a SQL-based analytic database

Enterprise Ready

  • Export/Import of Ranger Security Policies. Enhance productivity by moving security policies in bulk from one environment to another
  • Extend Atlas Tag-based Policy Support Across the Ecosystem. Enable classification-based security workflow coverage for HDFS, Kafka and HBase
  • Row/Column Security. Implement granular data access control at every level of the Hadoop stack, including Spark and Hive
  • SSL Support for Spark Streaming Connections to Kafka. Provide secure environments for Spark Streaming & Kafka

Ease of Use

  • Service Auto Start. Easily configure the services and components that should be automatically started if a cluster node restarts, or if the daemon exits unexpectedly
  • Simplified Log Rotation Configuration. Quickly configure the number and size of backup files for all components
  • HDFS TopN User & Operation Visualization. Gain visibility into the most frequent operations being performed on the NameNode, and who’s performing those operations
  • Package Support for PySpark (Spark Python API) & SparkR. Data scientists using Spark with the R language can now deploy their favorite R packages with their Spark jobs

Hortonworks Sandbox in the cloud

Explore cloud vendors that can help you get started with Hadoop with minimum system requirements.

Hortonworks Sandbox on a VM Download

No data center, no cloud service and no internet connection needed! Full control of the environment. Easily extend with additional components or try the various Hortonworks technical previews. Always updated with the latest edition.

Try Hortonworks Sandbox on Azure

Azure provides an easy way to get started with Hadoop with minimum system requirements. This is a great solution if your personal machine doesn’t meet the minimum system requirements to run the Sandbox locally.