The Hadoop network: social media and big data

One of the benefits of Apache Hadoop is that it can help users separate useful, salient information from a glut of useless, potentially confounding data. In few other arenas is this separation more necessary than in the processing of data gleaned from social media. The massive, densely connected and constantly updating web of social sites offers a great deal of significant and compelling information, but it creates problems at every step of analysis and use.

From a data accumulation perspective, there are many facets of social media that intelligent data analytics programs can help users parse in order to draw useful insights. Oracle Financial Services vice president and Finextra contributor Ambreesh Khanna recently outlined the new challenges that big data from high-volume sources creates for the structure of data management.

"Unlike traditional data management where the structure of the data is decided upon its arrival – 'schema on write' – big data mandates the realization of metadata at time of consumption – 'schema on read,'" he wrote. "This creates a new series of challenges in determining not only which data to persist and where, but also how to locate the persistent data when it is needed, all in real-time."

To confront the burgeoning quantities of social data and turn them into useful insights, organizations need algorithms in place that can digest and sort the data on the back end, so that the analytics team can stay ahead of the curve. A Hadoop cluster built on the Hadoop Distributed File System (HDFS) can be set up to perform this accumulation and synthesis when the data is read, so that analysts start out working with intelligently structured information.
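
The "schema on read" idea in the quote above can be illustrated with a minimal Python sketch. It is not Hadoop code and does not reflect any particular Hadoop API. The sample records, the READ_SCHEMA mapping and the read_with_schema function are all hypothetical names invented for this illustration. The point is only that raw records are stored exactly as they arrive, and a structure is imposed when a consumer reads them.

```python
import json

# Hypothetical raw social media records, kept exactly as they arrived.
# Nothing was validated or reshaped at write time.
RAW_RECORDS = [
    '{"user": "alice", "text": "Loving the new phone", "likes": "12"}',
    '{"user": "bob", "text": "Service was slow today"}',
    'not valid json',
]

# Hypothetical schema chosen by the consumer at read time:
# field name -> (type to cast to, default when missing or uncastable).
READ_SCHEMA = {
    "user": (str, ""),
    "text": (str, ""),
    "likes": (int, 0),
}


def read_with_schema(raw_lines, schema):
    """Parse raw lines and project each onto the schema.

    Lines that are not JSON objects are skipped rather than rejected
    at ingestion, which is the trade-off schema on read makes.
    """
    rows = []
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        row = {}
        for field, (cast, default) in schema.items():
            value = record.get(field, default)
            try:
                row[field] = cast(value)
            except (TypeError, ValueError):
                row[field] = default
        rows.append(row)
    return rows


rows = read_with_schema(RAW_RECORDS, READ_SCHEMA)
```

A different analyst could read the same raw records with a different schema, for example one that ignores "likes" entirely, without anything being rewritten in storage.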

Making insights actionable with Hadoop
The purpose of having such a large, complex system of social media analytics is to let organizations create actionable strategies that better target consumers. This approach must contend with another problem of social media: the rivers of data from different sources, and the connective tissue of disparate platforms and applications, can make it easy for promotional and business maneuvers to get lost in the shuffle. These challenges affect businesses in any industry that interacts directly with consumers. Insurance Networking News contributor Terry Golesworthy recently summed up the current landscape of socially mediated information.

"Social media is now a combination of highly complex micromarketing, multiplatform objectives," he wrote. 

Without an effective business analytics strategy that can respond to the challenges of social media data, companies will lack the sophisticated micro-strategies needed to win the battle for social media.
