The Hortonworks Blog

Posts categorized by: Sandbox

Using Hadoop as an enterprise data platform means great integration with other technologies in the data center.

To that end, the Hortonworks Sandbox Partner Gallery showcases how our partners’ solutions integrate with Hadoop and provides you with easy access to learn how to use those solutions with the Hortonworks Data Platform via the Sandbox.

Don’t have the Sandbox? Get your free download of this single-node Hadoop environment, delivered as a virtual machine that you can run on your laptop.…

When I first started to understand what YARN is, I wanted to build an application to understand its core. There was already a great example YARN application called Distributed Shell that I could use as the shell (pun intended) for my experiment. Now I just needed an existing application that many other applications could reuse. I looked around and decided on MemcacheD.

This brief guide shows how to get MemcacheD up and running on YARN – MOYA if you will…

Prerequisites

You’re going to need a few things to get the sample application operational.…

We had a lot of fun in NYC and hope you did too. Thanks to the hundreds of you who dropped by the booth, attended dinners, parties, meetups and sessions.

As we have known for some time, Hortonworks customers are already building a modern data architecture with Hadoop as the technology of choice for handling the data they have streaming in from all directions. They care that it matches their needs, integrates with their existing infrastructure and solves real problems with flexibility.…

Last week we announced the availability of the Hortonworks Data Platform 2.0. Today, we’re delighted to announce the availability of the Hortonworks Sandbox 2.0.

New Features

  • Based on HDP 2.0
  • Easy enablement of Ambari and HBase
  • Updated tutorial navigation

HDP 2.0

This version of the Sandbox provides you with a complete HDP 2.0 environment: your own personal single-node Hadoop cluster where you can explore the new features and enhancements of HDP 2.0, including YARN, the improvements to Hive that were delivered by the Stinger initiative, and the updates to HBase, Pig, and Ambari. In fact, our Sandbox has all of the most current releases of the various Apache projects, such as Hive 0.12, HBase 0.96, and Hadoop 2.2.…

You’re a Java developer, you use Spring and you’re just itching to get your arms around some big data. Well, now you can do that even more easily than before: we announced this morning that Spring is now certified for Hortonworks Data Platform.

To celebrate this development, we have a community tutorial for Sandbox (1.3 currently) that shows you how to use Spring XD to collect data streamed from Twitter, load it into HDFS and then run simple sentiment analysis with Apache Hive.…

You did it! Last Sunday we challenged you to “Learn Hadoop in 7 days”. We hope that you have risen to the test and kept up with the tutorials we’ve posted each day through Twitter and Facebook. These tutorials should have helped you delve into:

By now, you should feel comfortable with Hadoop clickstream analysis, Hortonworks ODBC driver configuration, and many other important components of Hadoop.…

Security is one of the biggest topics in Hadoop right now. Historically, Hadoop has been a back-end system accessed only by a few specialists, but the clear trend is for companies to put data from Hadoop clusters in the hands of analysts, marketers, product managers or call center employees whose numbers could be in the hundreds or thousands. Data security and privacy controls are necessary before this transformation can occur. HDP2, through the next release of Apache Hive, introduces a very important new security feature that allows you to encrypt the traffic that flows between Hadoop and popular analytics tools like MicroStrategy, Tableau, Excel and others.…

This is a guest blog post from our partner, Actuate. They’ve been generous enough to create some great Hadoop tutorials on the Open Source BIRT project that use the Hortonworks Sandbox.

By now, Apache™ Hadoop® has become synonymous with the first stage of Big Data: storing, processing and managing huge volumes and varieties of structured and unstructured data. Yet the data stored by Hadoop remains unreadable to the average business user.…

Are you a Hadoop hotshot? Are you the one everyone looks to for help on their Hadoop projects? Are you looking to showcase your talent to the world?

Then just maybe we have a great option for you. We recently published the Hortonworks Sandbox tutorials on GitHub. Now it’s your turn. We invite you to add your own Hadoop tutorials or to improve on the ones that we’ve published.…

There’s an old proverb you’ve likely heard about blind men trying to identify an elephant. Depending on the version of the proverb you’ve heard, the elephant is misidentified variously as rope, walls, pillars, baskets, brushes and more. Oddly, no one identified it as a next-generation enterprise data platform, but I guess it is an old proverb.

The Hadoop elephant is a platform though, and as such the proverb holds true. Depending on your perspective, it has different capabilities, components and integration points to meet your requirements.…

This is a guest blog post from Gary Nakamura, CEO at our partner Concurrent, Inc., discussing Cascading Pattern and the new Hadoop tutorial they have written for the Hortonworks Sandbox. This is one of the first tutorials aimed at a more experienced crowd. Enjoy!

Cascading Pattern: Deploy Predictive Models on Hadoop in minutes.

Cascading Pattern signifies an important milestone for Cascading as we continue our mission of driving innovation and simplifying Big Data application development.…

Albert Einstein is credited with saying that he never worried about the future because it would arrive soon enough. We don’t worry about the future either; we focus on building it. And today, we are delighted to release the Hortonworks Data Platform 2.0 Beta Sandbox. This is the single-node VM based on the HDP 2.0 Beta release. This release comes in the easy-to-use Sandbox form factor and allows you to easily work with a stable, reliable v2 of Hadoop.…

Syncsort, a technology partner with Hortonworks, helps organizations propel Hadoop projects with a tool that makes it easy to “Collect, Process and Distribute” data with Hadoop. This process, often called ETL (Extract, Transform, Load), is one of the key drivers for Hadoop initiatives; but why is this technology a key enabler of Hadoop? To find out, we talked with Syncsort’s Director of Strategy, Steve Totman, a 15-year veteran of data integration and warehousing, who provided his perspective on Data Warehouse Staging Areas.…

He loves me, he loves me not… using daisies to figure out someone’s feelings is so last century. A much better way to determine whether someone likes you, your product or your company is to do some analysis on Twitter feeds to get better data on what the public is saying. But how do you take thousands of tweets and process them? Our video, Understand your customers’ sentiments with Social Media Data, shows how you can capture a Twitter stream to do sentiment analysis.…
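To give a rough feel for what scoring tweets can look like, here is a minimal Python sketch. It is not the method from the video: the word lists and the function names (`score_tweet`, `label_tweet`, `summarize`) are made up for illustration, and a real pipeline would run over a captured Twitter stream in Hadoop with a much larger sentiment dictionary.

```python
import re

# Hypothetical word lists, for illustration only; a real sentiment
# dictionary would be far larger.
POSITIVE = {"love", "great", "good", "happy", "awesome"}
NEGATIVE = {"hate", "bad", "terrible", "sad", "awful"}


def score_tweet(text):
    """Return (# positive words) - (# negative words) in the tweet."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


def label_tweet(text):
    """Map the numeric score to a coarse sentiment label."""
    score = score_tweet(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def summarize(tweets):
    """Count how many tweets fall under each sentiment label."""
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for tweet in tweets:
        counts[label_tweet(tweet)] += 1
    return counts
```

Calling `summarize` on a list of tweet texts returns a dictionary of counts per label, which is the kind of aggregate you would then chart or break down by product or region.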

In the last 60 seconds there were 1,300 new mobile users and 100,000 new tweets. While you contemplate what happens in an internet minute, Amazon brought in $83,000 worth of sales. What would be the impact if you could identify:

  • What is the most efficient path for a site visitor to research a product, and then buy it?
  • What products do visitors tend to buy together, and what are they most likely to buy in the future?
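As a small illustration of the second question, the sketch below counts which pairs of products show up together in the same order. It is a toy under stated assumptions: the sample orders and the `co_purchase_counts` helper are invented for this example, and at clickstream scale the same counting would run as a Hadoop job rather than in memory.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(orders):
    """Count how often each unordered pair of products shares an order."""
    pairs = Counter()
    for order in orders:
        # Deduplicate and sort so ("stove", "tent") and ("tent", "stove")
        # are counted as the same pair.
        for a, b in combinations(sorted(set(order)), 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical sample data, for illustration only.
orders = [
    ["tent", "stove", "lantern"],
    ["tent", "stove"],
    ["lantern", "stove"],
    ["stove", "tent"],
]

counts = co_purchase_counts(orders)
top_pair = counts.most_common(1)
```

`counts.most_common(n)` then gives the `n` most frequent pairs, a simple starting point for "bought together" suggestions.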