February 16, 2016

Top 3 articles for every Hadoop Developer

We started Hortonworks Community Connection (HCC) at the end of 2015, and it already holds some excellent content that any data developer or data administrator should read and bookmark. I will publish this blog weekly to highlight the top technical articles on HCC, based on community activity and votes.

Top 3 articles on the site: 

  1. Sample HDF/NiFi flow to Push Tweets into Solr/Banana, HDFS/Hive: This article gives an overview of how to create a simple event processing flow. The guide starts with installing the software, walks through all the necessary setup, and ends with setting up the event flow. A must-read for anyone interested in data ingestion and streaming.
  2. Unofficial Storm and Kafka Best Practices Guide: Are you using Storm or Kafka for data processing? Then learn best practices and implementation guidelines from the experts in the trenches. This should be required reading for novices and experts wondering about the best way to tune and monitor Kafka and Storm.
  3. Ambari Rolling & Express Upgrade: Are you tired of the risk and monotony of doing upgrades? This article covers how to use Ambari and set up the necessary automated steps and procedures for express and rolling upgrades.

Top 3 questions last week:

  1. HDFS replication and impact on concurrency: If I have a 100 GB data set and the same data set is hit concurrently, is one of the options to increase the replication factor to support high-concurrency hits? I have long understood this to be true but can’t clearly articulate why. Any details would be appreciated.
  2. Hive metastore issue in HDP 2.3.4.0: We have configured HDP with Ambari on CentOS 6.4. Post installation, we can see that the Hive Metastore service stops every time it is started through Ambari. We chose MySQL for the Hive metastore, but in the logs we can see it tries to connect to Derby. Looking for your help.
  3. Ambari server 2.1.2 setup – Error while creating database accessor com.mysql.jdbc.Communications (followed by a log dump).


