
Viewing posts by: Marc Holmes




Hadoop Summit Europe in Amsterdam is approaching fast. From Falcons to Pigs, we have a menagerie of meetups covering all things Hadoop – all with fantastic speakers. This year, we’re also delighted to expand the discussion with meetups from Splunk, SAS and Revolution Analytics. You can sign up for any and all of the meetups […]

On Feb 8th and 9th, Hortonworks, Microsoft and Elastacloud will be hosting a hackathon at the Microsoft Campus in Mountain View, CA. Whether you’re a newbie or ninja, developer or scientist, we’d love to see you there. Register here. The focus of the hackathon will be city datasets. For instance, we’ll be drawing on datasets […]

This guest post is from Simon Elliston Ball, Head of Big Data at Red Gate and all-round top bloke. Hadoop is a great place to keep a lot of data. The data-lake, the data-hub and the data platform; it’s all about the data. So how do you manage that data? How do you get data […]

This guest post is from Eric Hanson, Principal Software Development Engineer on Microsoft HDInsight and Apache Hive committer. Hive has a substantial community of developers behind it, including a few from the Microsoft HDInsight team. We’ve been contributing to the Stinger initiative since it started early in 2013, and have been contributing to Hadoop since […]

Last week was a busy week for shipping code, so here’s a quick recap on the new stuff to keep you busy over the holiday season. Technical Preview of Storm. This preview includes the latest release of Storm with instructions on how to install Storm on Hortonworks Sandbox and run a sample topology to familiarize yourself with the […]

You’re a Java developer, you use Spring and you’re just itching to get your arms around some big data. Well, now you can do that even more easily than before: we announced this morning that Spring is now certified for Hortonworks Data Platform. To celebrate this development, we have a community tutorial for Sandbox (1.3 […]

It’s been a huge couple of weeks for us at Hortonworks HQ. We’ve talked about the GA of Hadoop 2, the subsequent release of Hortonworks Data Platform 2.0, and a little of the future with Apache Storm. We’ve been staggered by the support, goodwill and enthusiasm we’ve seen from you all. We hope you’re as […]

There’s an old proverb you’ve likely heard about blind men trying to identify an elephant. Depending on the version of the proverb you’ve heard, the elephant is misidentified variously as rope, walls, pillars, baskets, brushes and more. Oddly, no-one identified it as a next-generation enterprise data platform, but I guess it is an old proverb. […]

How big is big anyway? What sort of size and shape does a Hadoop cluster take? These are great questions as you begin to plan a Hadoop implementation. Designing and sizing a cluster is complex and something our technical teams spend a lot of time working with customers on: from storage size to growth rates, […]

Just a couple of weeks ago we published our simple SQL to Hive Cheat Sheet. It has proven immensely popular with a lot of folk looking to understand the basics of querying with Hive. Our friends at Qubole were kind enough to work with us to extend and enhance the original cheat sheet with more advanced features […]

If you’re heading back to work today after a long, hot summer, then here are some notes on last week here at Hortonworks. Building a modern data architecture. We kicked off the week with some discussion on what it means to implement Hadoop alongside existing data architecture components. Jim covered 3 essential requirements: integration with existing systems, […]

Continuing our series of quick interviews with Apache Hadoop project committers and contributors at Hortonworks. To follow on from yesterday’s Server Log processing with Apache Flume tutorial, we talk with Roshan Naik, Hortonworks engineer and Apache Flume contributor, about what Flume is, how it works and where it’s going. Learn more about Flume here or at the Apache Hadoop project […]

The best architecture diagrams are those that impart the intended knowledge with maximum efficiency and minimum ambiguity. But sometimes there’s a need to add a little pizazz, and maybe even draw a picture or two for those PowerPoint moments. Download stencils for OmniGraffle and Visio, and the hi-res PNG and EPS files from GitHub. […]

As summer comes to a close, we bid a fond farewell (again!) to our excellent marketing intern, Tanya Maslyanko. Tanya has been a terrific help to us with her can-do attitude and marketing intuition, so the tears we shed are because we’ll miss our friend and because we’ll have to start doing our own work […]

The next in our series of quick interviews with Apache Hadoop project committers at Hortonworks. In this video, we talk with Sanjay Radia, Hortonworks co-founder and Apache Hadoop committer, about the initiation of HDFS, the cost benefits it brings to data storage and future directions for the project. Learn more about HDFS here or at the […]