Tutorials for Hadoop with HDP 2.1: Hive, Tez, Falcon, Knox, Storm

Give HDP 2.1 a test drive with the Technical Preview VM

If you’re excited to get started with the new features in Hortonworks Data Platform 2.1, we’ve included four tutorials for you to try out – Sandbox-style.

You can download the HDP 2.1 Technical Preview here, and then get stuck into these great tutorials.

Interactive Query with Apache Hive and Apache Tez

OK, so you’re not going to get huge performance out of a one-node VM, but you can try out Hive on Tez and see the performance gains versus MapReduce. You can also try out features such as Vectorized Query and the host of new SQL features. Get supercharged here.

Defining and Processing Data Pipelines with Apache Falcon

Sometimes, it’s not all about speed. Sometimes you want assurance and governance over data movement across the cluster. In this tutorial, we simulate moving a dataset from one cluster to another, cleansing it along the way. Define your pipeline here.

Processing Stream data in near real-time with Apache Storm

But then who am I kidding? Of course it’s all about speed. In this case, it’s the speed of response to incoming stream data. This tutorial sets up Apache Storm to read and react to incoming sentences. Process your streams here.

Secure your Hadoop infrastructure with Apache Knox

With data flying around in all directions, it’s probably worth taking a look at Apache Knox to provide perimeter security for your cluster – even if it is just one node. Batten down the hatches here.

We hope you have some fun testing out the new features of HDP 2.1 with these tutorials, and that they provide the inspiration for your own production work. If you have any comments, let us know below, or in the forums. And if you’d like a Hortonworks elephant, be sure to add your own tutorial over here.


