December 30, 2014

Top Ten Popular Hadoop Blog Posts of 2014

We take pride in producing valuable technical blog posts and sharing them with a wider audience. Of all the posts published on our website in 2014, the following were the most popular:

  1. Improving Spark for Data Pipelines with Native YARN Integration

    Gopal Vijayaraghavan and Oleg Zhurakousky demonstrate improvements to Apache Spark achieved with the help of the pluggable execution context.

  2. HDP 2.2 A Major Step Forward for Enterprise Hadoop

    Tim Hall outlines six months of innovation and new features across Apache Hadoop and its related projects.

  3. Evolving Apache Hadoop YARN to Provide Resource and Workload Management for Services

    Arun Murthy explains YARN’s extended capabilities for resource and workload management for long-running services.

  4. Data Science with Apache Hadoop: Predicting Airline Delays Series: Part I and Part II

    Ofer Mendelevitch and Beau Plath illustrate how to build predictive models using Apache Hadoop and machine learning algorithms from data science.

  5. Docker & Kubernetes on Apache Hadoop YARN

    Sidharta Seethana explains how to enable PaaS using Apache Hadoop YARN’s extensible capabilities and its resource management for multiple workloads.

  6. HBase and Hive—Better Together

    Devaraj Das et al. discuss an integrated architecture for closed-loop operational and analytical processing.

  7. Discardable Memory and Materialized Queries (DMMQ) in a Hadoop Cluster

    To put your memory into its right place in the storage hierarchy for efficient queries, Julian Hyde proposes a solution for a new kind of data set: Discardable, In-Memory, Materialized Query (DIMMQ).

  8. Heterogeneous Storages in HDFS

    Arpit Agarwal explores the scenarios that heterogeneous storage support in HDFS aims to address.

  9. Benchmarking Apache Hive 13 for Enterprise

    Carter Shanklin shares the initiative that delivers batch and interactive SQL query workloads in a single engine.

  10. How to Think about Partnerships in the Enterprise Ecosystem

    John Kreisa explains what it takes to build a thriving enterprise ecosystem with your partners and why key initiatives (partner, certify, engineer, and resell) are crucial for the ecosystem’s success.

Happy New Year!



Alexander Bath says:

Items 1 & 2 are the principal reasons why I recommend HDP to my clients. It is remarkable that you managed to include such a mature stack before the year was out. Excellent work, a fine end to 2014.

Suman says:

Hi, your article is very good for me. Thanks for giving me such information.

mohit patidar says:

I have a file.txt that contains the numbers from 1 to 10000, and I want to add those numbers up. As with the word count program, I have to write three programs, as follows:
driver program
mapper program
reducer program
So my first doubt is: what are the (key, value) input and (key, value) output pairs for the mapper and the reducer? Can I explicitly specify the number of input splits or not? Thanking you.

prasanna says:

In “HDP 2.2 A Major Step Forward for Enterprise Hadoop”, the highlights are very informative and useful.

In the benchmark configuration, the description of the software and hardware is very easy to understand.

sap hana training in Hyderabad says:

Keep sharing this type of good information.

suresh says:

Thank you for offering such nice content. A very unique blog, and one of the recommended blogs for students and professionals.

emergers says:

I just found your post by searching on Google. I am impressed and learned a lot of new things from it. I am new to blogging and always try to learn new skills, as I believe that blogging is a full-time job of learning new things day by day.
“Emergers Technologies”

Learn Hadoop says:

Good collection of information regarding Big Data and Hadoop. Thank you for sharing such valuable information.

Big Data Analytics Training In Hyderabad says:

Thanks for sharing such a wonderful article on Hadoop.
Such a useful blog on Hadoop. We are expecting more blogs from you on Hadoop/Big Data Analytics.

veergupta says:

Useful article you’ve shared. I really needed this post. Thank you.

sindhu says:

This is my first time writing a comment. This blog gives such valuable information. In the future, Hadoop and big data will offer more job opportunities in the top industries. This blog gave me strong knowledge of Hadoop concepts.

Harshali Patel says:

I have not read all the articles yet, only some, and they are really very informative, especially “Evolving Apache Hadoop YARN to Provide Resource and Workload Management for Services”. Well explained, Arun Sir! I would like to add one more on Hadoop limitations and solutions. Here is the link – … Hope it helps the readers.

prathyusha says:

Thanks for sharing this info. I found this one very helpful. There is one more site that I have been following:
business analytics course in Bangalore

shaikimam says:

Brilliant blog. I visit this blog and it’s incredibly awesome. The content in this blog is clearly and sensibly written. The substance of the information is helpful.

aman sarviya says:

Yes, these were the good Hadoop blogs but the above blogs were published before 2014 and they are getting outdated. You could visit DataFlair-Hadoop tutorials for more amazing articles on Hadoop for 2019.
