How To Refine and Visualize Server Log Data with Hadoop

When they’re not planning to overthrow their human overlords, most servers can be found spewing out vast amounts of data in the form of server logs. As we showed in our video - Deliver responsive IT from events in Server Logs - these logs contain a lot of value.

So if you fire up the Hortonworks Sandbox today, you’ll be delighted to find Tutorial 12: Refining and Visualizing Server Log Data as a step-by-step guide to the video. In this Hadoop tutorial, we will show you how to take the logs from your servers and visualize them in Excel 2013 or in your own favorite visualization tool.

This tutorial will cover some new ground, as it will walk you through how to install and use Apache Flume. Essentially, Flume is a service for collecting, aggregating, and moving large amounts of streaming data into HDFS, which makes it ideal for handling server logs. It has a simple and flexible architecture based on streaming data flows, and it is robust and fault tolerant, with tunable reliability mechanisms for failover and recovery. You can read more about Flume here.
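To make the idea of a streaming data flow a little more concrete, here is a toy sketch in Python of events moving from a source, through a buffering channel, to a sink. It is only an illustration of the general pattern: the function names, the `batch_size` parameter, and the in-memory list standing in for HDFS are all made up for this example and are not part of Flume's API or configuration.

```python
from collections import deque


def drain(channel, delivered):
    """Sink side: take every buffered event off the channel and deliver it."""
    while channel:
        delivered.append(channel.popleft())


def run_pipeline(lines, batch_size=2):
    """Move log lines from a source to a sink through a buffering channel.

    Hypothetical illustration only; 'delivered' stands in for data
    that a real deployment would write to HDFS.
    """
    channel = deque()   # buffer that sits between the source and the sink
    delivered = []      # placeholder for the final destination

    for line in lines:
        # Source side: turn each raw log line into an event on the channel.
        channel.append(line.rstrip("\n"))
        # Once enough events are buffered, let the sink drain them.
        if len(channel) >= batch_size:
            drain(channel, delivered)

    # Flush whatever is still buffered when the input ends.
    drain(channel, delivered)
    return delivered
```

The channel is what decouples the two sides: the source can keep accepting lines while the sink delivers them in batches, and nothing is dropped when the input ends mid-batch.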

In the tutorial, you’ll go through these steps:

  1. Install, configure, and start Flume
  2. Generate the server log data
  3. Import the server log data into Excel
  4. Visualize the server log data using Excel Power View
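If you want a feel for step 2 before opening the Sandbox, the sketch below generates a few synthetic log lines in Python. The tutorial ships its own generator, so treat this as a stand-in: the pipe-delimited layout, the field choice (timestamp, IP address, status), the status values, and the start date are all assumptions picked for illustration, not the tutorial's actual format.

```python
import random
from datetime import datetime, timedelta


def generate_log_lines(count, seed=0):
    """Return `count` synthetic log lines as 'timestamp|ip|status' strings.

    The layout and values are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)        # seeded so repeated runs give the same data
    start = datetime(2014, 1, 1)     # arbitrary start time for the fake log
    lines = []
    for i in range(count):
        timestamp = start + timedelta(seconds=i)   # one event per second
        ip = ".".join(str(rng.randint(1, 254)) for _ in range(4))
        status = rng.choice(["SUCCESS", "ERROR"])
        lines.append(f"{timestamp:%Y-%m-%d %H:%M:%S}|{ip}|{status}")
    return lines


if __name__ == "__main__":
    for line in generate_log_lines(5):
        print(line)
```

Writing lines like these to a file that a collection agent is watching is enough to exercise the rest of the steps end to end with data you control.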

Once you’ve completed the tutorial, continue to the Appendix, where we discuss Flume in more depth and give you instructions for creating and collecting your own dataset.

Don’t have the Sandbox? You can download it here. It’s our free, single node HDP environment that can run on your laptop.

Already have the Sandbox and want to play with this new tutorial? On startup, the Sandbox will pull the new tutorial into your version, or you can tell the Sandbox to “Update” the tutorials from the “About Hortonworks Hue” button.

Enjoy the new tutorial!

