Apache Hadoop: Seven Predictions for 2013

At Thanksgiving we took a moment to reflect on the past and give thanks for all that has happened at Hortonworks over the past year.  With the New Year approaching, we now take time to look forward and offer our predictions for the Hadoop community in 2013.  To compile this list, we queried and collected big data from our team of Hadoop committers and members of the community.

We asked a few luminaries as well, and while we had our hearts set on five predictions, the wealth of expert opinion we surfaced left us with SEVEN. So, without further ado, here are the Top 7 Predictions for Hadoop in 2013.

1. “Big Data” becomes “data”

Over the past 18 months the term “big data” has emerged and defined a space for a swath of new (and existing) technologies.  It has been called transformative, and many have even said it will replace everything we do today.  Well, we take a more realistic view of big data.  We feel big data is just data.  As Apache Hadoop has evolved, it has become a standard platform for this new world, and by the end of 2013 the “big” moniker will no longer be necessary… it is all just data, after all.  Big data, and all the predictions for this space, will collapse into data management for the analysts and all those following, including a lot of the “big” vendors.

2. Emergence of vertically aligned Apache Hadoop “solutions”

At the keynote of Hadoop Summit last year, Geoffrey Moore characterized Apache Hadoop as currently crossing the chasm, and said we would know it has landed on the other side and is enjoying mainstream adoption when vertical solutions arise.  As more and more companies find success, we will see patterns and solutions emerge that are custom-fit to the challenges of particular industries.  As systems integrators and consultants become more and more expert in Apache Hadoop, they will wrap solutions into packages, and we will see these vertical solutions emerge.

3. “Right-time” query of Apache Hadoop becomes reality

Much has been made of the batch nature of Apache Hadoop in the past few months.  This is understandable; it was, after all, architected that way.  In 2013, Apache Hadoop v2 will finally be deemed stable and reliable, and with this we will see advances in the surrounding Apache projects that make the platform more interactive.  The enterprise is asking for it, and the community will naturally answer.  Some will try to “fix” this with proprietary extensions to Hadoop, but ultimately the community will resolve this challenge.  We will see technology emerge that lets you apply the “right” time to the “right” business requirements.

4. More Hadoop startups

Apache Hadoop has generated a lot of hype, and for every new business idea it presents, new companies are popping up to pursue it. As the emergence of vertical solutions progresses, so too will the emergence of a new batch of startups ready to take advantage of the mainstream adoption of Apache Hadoop.

5. Apache Hadoop v2 (YARN and MR2) becomes the standard for Hadoop data management

Hadoop has already established itself as the next-generation data platform; however, with Apache Hadoop v2, the enterprise will adopt it for more than pilots and small projects.  It will become the data backbone for many, because the advances in Apache Hadoop v2 make it more reliable and more stable.  Our Hortonworkers are excited and proud of this new architecture; our team has been busy building and testing it and can’t wait to see it prove its value.
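For readers experimenting with the v2 stack, moving from classic MapReduce to MR2 on YARN is largely a configuration change. A minimal sketch is below; it assumes the Hadoop 2.x GA property names, and `rm.example.com` is a placeholder host, not a real endpoint:

```xml
<!-- mapred-site.xml: tell MapReduce clients to submit jobs to YARN (MR2) -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

<!-- yarn-site.xml: point NodeManagers at the ResourceManager and enable
     the MapReduce shuffle auxiliary service.
     rm.example.com is a placeholder host name. -->
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>rm.example.com</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
```

With this in place, existing MapReduce jobs run as YARN applications, which is what lets other processing frameworks share the same cluster resources.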

6. The big data ecosystem expands

Related to prediction number four, existing application vendors will all clamor to make their products Hadoop-compatible.  Led by Teradata, Microsoft and many others, application vendors are waking up to the reality that their applications must run on Hadoop.  Already, it seems everyone is building reference architectures that incorporate Hadoop and HDP to leverage all the goodness they already provide around data lifecycle management, data governance, security, etc. Meanwhile, the Hadoop community is doing everything it can to foster adoption by ISVs.  In 2013, nearly everyone will be speaking big data.

7. Apache Ambari sets the standard for Hadoop operations

This prediction is admittedly a little self-serving, as Hortonworks employs the founder and many of the contributors behind Apache Ambari, but we are believers.  We believe that Ambari will set the standard for operational services for Enterprise Hadoop, as it makes it easier for organizations to consume, deploy and manage a cluster.  It has already reached parity with the available proprietary solutions, and with the power of the community behind it, it is accelerating and adding new features at an astonishing rate.  Again, this fully dedicated open source approach not only provides the right tools but also showcases the extraordinary rate at which the democracy of the community can innovate.

Now, this is just our opinion…  What are YOUR predictions?  Please comment!

Happy new year and we look forward to seeing all of these come true!


Comments

January 20, 2013 at 5:25 am

“Big” data for predictive analytics is driving the advancement of interconnected clouds for research and more. Hearing technologists and Vint Cerf, a father of the Internet, talk about progress and prospects at NIST’s Cloud and Big Data Forum this week was staggering! The more we get Hadoop out there, the bigger questions we can ask and problems we can solve! Summary here: onforb.es/W4jVwh

January 6, 2013 at 9:33 am

These are excellent predictions. If I added two, they would be:
- Tiered storage will increase in popularity in Hadoop environments for the master servers.
- By the end of 2013, virtualization will start to gain traction in Hadoop environments. Virtualization adds advantages to Hadoop servers: for the data nodes, lots of commodity servers bring space, electricity and heat challenges that virtualization can address. Hadoop environments also have data servers that are underutilized; virtualization enables better utilization and adds a lot of management capability to the data nodes.

December 20, 2012 at 11:25 pm

I believe 2012 was the year when industry focus was set on technologies around cloud computing and big data analytics. These technologies shall be used to take enterprises to the next level. 2013, to me, feels like it shall be an era when SMEs target PaaS-ifying their enterprise. So, in short, a focus on PaaS (platform as a service) is what I can predict for 2013.

December 20, 2012 at 10:47 pm

These predictions seem to be spot on with where we are leaving off in 2012 and how things are shaping up for 2013.

Prediction no. 6 would be a key industry mover as ISVs yearn to make Apache Hadoop more enterprise-friendly. However, we will need to educate customers to distinguish an arbitrary Hadoop connector mentioned in charts from a real Hadoop implementation.

Another key trend that could gain momentum is the emergence of better data visualization products. The focus will shift to how business makes sense of the data churned out by MapReduce, and for that, better visual aids will emerge and offer insights.
