While we are still a long way from hearing “Houston, Tranquility Base here… the Eagle has landed”, the HP Moonshot is definitely pushing us all toward a new class of infrastructure for running workloads like Apache Hadoop more efficiently. Hortonworks applauds the development of flexible Big Data appliances like Moonshot. We are excited about this development as it signals alignment across development, operations and infrastructure within organizations. For quite some time, our team has been accustomed to the natural balance required across these three constituents, and now the server market is joining in on the game.…
Industry news, partner stories, buzz and happenings
Over the last 10 years or so, large web companies such as Google, Yahoo!, Amazon and Facebook have successfully applied large scale machine learning algorithms over big data sets, creating innovative data products such as online advertising systems and recommendation engines.
Apache Hadoop is quickly becoming a central store for big data in the enterprise, and thus is a natural platform with which enterprise IT can now apply data science to a variety of business problems such as product recommendation, fraud detection, and sentiment analysis.…
We’re cooking up some new tutorials for you to play with in your Hortonworks Sandbox to help you learn more about the Hortonworks Data Platform, Apache Hadoop, Hive, Pig and HCatalog, with maybe a smattering of Mahout in there as well.
While you’re anxiously awaiting them, we thought we’d point you to some resources so that you can experiment and play. After all, that’s what a Sandbox is all about, right?…
“OK, Hadoop is pretty cool, but exactly where does it fit and how are other people using it?” Here at Hortonworks, this has got to be the most common question we get from the community… well that and “what is the airspeed velocity of an unladen swallow?”
We think about this (where Hadoop fits) a lot and have gathered a fair amount of expertise on the topic. The core team at Hortonworks includes the original architects, developers and operators of Apache Hadoop and its use at Yahoo, and through this experience and working within the larger community they have been privileged to see Hadoop emerge as the technological underpinning for so many big data projects.…
More of a two-weeks-in-review this time around, owing to the Easter break. So what’s been happening?
Falcon bringing Data Lifecycle Management for Hadoop. The big news this week was the newly approved Apache Software Foundation incubator project – Falcon. The project was initiated by the team at InMobi and engineers from Hortonworks towers with the intent of simplifying data management through a data lifecycle management framework. Something for everyone then. …
With any enterprise software implementation, the challenge is often integrating the chosen system with the existing enterprise systems architecture. One such existing investment may be ERP (and related) systems such as those provided by SAP. In this real-world instance, SAP partnered with Hortonworks to enable integration of Apache Hadoop into SAP Real-Time Data Platforms using Hortonworks Data Platform to facilitate business intelligence and analysis of Big Data.
The business challenges at hand will be familiar to everyone and are a great fit for a Hadoop solution.…
Today we are excited to see another example of the power of community at work as we highlight the newly approved Apache Software Foundation incubator project named Falcon. This incubation project was initiated by the team at InMobi together with engineers from Hortonworks. Falcon is useful to anyone building apps on Hadoop as it simplifies data management through the introduction of a data lifecycle management framework.
All About Falcon and Data Lifecycle Management
Falcon is a data lifecycle management framework for Apache Hadoop that enables users to configure, manage and orchestrate data motion, disaster recovery, and data retention workflows in support of business continuity and data governance use cases.…
The slides and videos from Hadoop Summit in Amsterdam have begun to flow so you can enjoy the sessions.
Whilst you’re thinking about which sessions to watch and read, we suggest taking a look at the keynotes for the event:
- What is the point of Hadoop? (Video, Slides)
- Matt Aslett, Research Director, Data Management and Analytics, 451 Research
- Hadoop’s Role in Enterprise Architecture (Video, Slides)
- Shaun Connolly, VP Corporate Strategy, Hortonworks
- Real-World insight into Hadoop in the Enterprise (Video)
- Panel featuring HSBC, eBay, Neustar and More
We hope you enjoy these sessions, and the content from the tracks.
On 27th March, the Wall Street Journal published an article ‘VCs Bet Big Bucks on Hadoop’ and it seems clear that the market is going to be huge. But what does that mean to you and your personal skills investment? Here’s our view:
Hadoop is HOT
Hadoop is incredibly hot right now as the number of available jobs continues to grow enormously (hey – we even have a bunch of our own right here).…
With over 300 sessions, and around 6000 users casting more than 15000 votes, there was a lot of excitement to participate and influence the results. Thanks to everyone for your contribution. At the end of the process, the selectees are:
- Application and Data Science Track: Watching Pigs Fly with the Netflix Hadoop Toolkit (Netflix)
- Deployment and Operations Track: Continuous Integration for the Applications on top of Hadoop (Yahoo!)
- Enterprise Data Architecture Track: Next Generation Analytics: A Reference Architecture (Mu Sigma)
- Future of Apache Hadoop Track: Jubatus: Real-time and Highly-scalable Machine Learning Platform (Preferred Infrastructure, Inc.)
- Hadoop (Disruptive) Economics Track: Move to Hadoop, Go Fast and Save Millions: Mainframe Legacy Modernization (Sears Holding Corp.)
- Hadoop-driven Business / BI Track: Big Data, Easy BI (Yahoo!)
- Reference Architecture Track: Genie – Hadoop Platformed as a Service at Netflix (Netflix)
Congratulations to the selectees for each track, and a further honorable mention to Sears for winning the ‘Longest Session Title So Far’, which was a surprisingly hard-fought contest!…
In this post, we’ll explain the difference between Hadoop 1.0 and 2.0. After all, what is Hadoop 2.0? What is YARN?
For starters – what is Hadoop and what is 1.0? The Apache Hadoop project is the core of an entire ecosystem of projects. It consists of four modules (see here):
- Hadoop Common: The common utilities that support the other Hadoop modules.
- Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data.
We want to take a moment to thank everyone who attended the Hadoop Summit in Amsterdam - THANK YOU! With nearly 500 people registered for the event, we think we can safely say it was a big success. We’ve had overwhelming support to do it again next year, so watch this space.
The awesome Beurs Van Berlage venue set us up for a series of fantastic conversations and really well-attended sessions and talks as Hadoop continues to explode onto the enterprise scene.…
Over the last several weeks, Hortonworks has made a number of announcements regarding the Hortonworks Data Platform (HDP), including the upcoming release of HDP on Windows, the only Apache Hadoop distribution available on Microsoft Windows. We’ve been busy expanding our Hadoop training offerings: we now offer classes for HDP on Windows, you can find training in Europe through our global training partners, and you can join us for Apache Hadoop courses in our new corporate headquarters, where you can have lunch with one of the committers.…
We are excited to tell you about the newest release of the Hortonworks Sandbox.
The Hortonworks Sandbox provides the fastest onramp to Apache Hadoop with an easy-to-use, integrated learning environment and a functional personal Hadoop environment. The Sandbox takes the complexity out of Hadoop installation and setup by providing a fully functional virtual image. If you are evaluating Apache Hadoop or need an easy way to prove out use cases, then the Sandbox is for you.…
This post co-authored by Arun Murthy.
It’s been an exciting time for the Apache Hadoop community with new and innovative projects happening around performance (Apache Tez) — part of the Stinger initiative — and security (Apache Knox). In addition Hortonworks recently announced the availability of the beta version of Hortonworks Data Platform for Windows.
One of the things we believe strongly in here at Hortonworks is community driven open source and, obviously, the bigger the community, the better.…