A few weeks back we posted a definition of “big data”. There was definitely some internal conversation about the term and whether this definition had captured what it means. The summary finding: it is a loaded term. It means a lot of different things to a lot of different people.
When I first joined Hortonworks, I bought into the three V’s (volume, velocity, and variety) definition of big data. It works for the most part, but it is more a descriptor of the data: it explains the data's characteristics. The definition is cold and lacks soul. After all, “big data” represents the promise of “big” business value.
I gravitate to Shaun’s definition, which describes big data as transactions plus interactions plus observations, because it outlines WHAT the data is, not just its characteristics. It points to areas that we should focus on as businesses, and it speaks to the value a bit more. Each of the three components is important.
This “value” definition of big data gets interesting when you substitute the plus signs in Shaun’s definition with intersections…
Big Data = Transactions ∩ Interactions ∩ Observations.
With big data technology (one of these being Apache Hadoop) we can now efficiently store and process all of this data. We can refine observation data down to the salient details that may be interesting in the context of our EDW. But even more interesting, we can ask these big data systems new questions. We can combine data across all these types and come up with new value for organizations. There is a world of data in our organizations that is used for an explicit purpose. When we start to combine things, the big data world gets really interesting.
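To make the idea of combining the three data types a little more concrete, here is a minimal Python sketch. It is only an illustration: the customer IDs, field names, and values are all invented, and a real system would do this at scale on a platform such as Hadoop rather than with in-memory dictionaries. It simply finds the customers that appear in all three sources (the intersection) and merges what each source knows about them.

```python
# Hypothetical sample data; every ID, field name, and value is invented for illustration.
transactions = {"c1": 120.0, "c2": 35.5, "c3": 80.0}  # customer id -> purchase total
interactions = {"c1": 4, "c3": 9, "c4": 2}            # customer id -> site visits
observations = {"c1": 17, "c2": 3, "c3": 42}          # customer id -> logged events


def combine(transactions, interactions, observations):
    """Return one merged record per customer present in all three sources."""
    # Set intersection of the keys: customers seen in every data type.
    shared = transactions.keys() & interactions.keys() & observations.keys()
    return {
        cid: {
            "purchase_total": transactions[cid],
            "site_visits": interactions[cid],
            "logged_events": observations[cid],
        }
        for cid in sorted(shared)
    }


combined = combine(transactions, interactions, observations)
for cid, record in combined.items():
    print(cid, record)
```

Only the customers found in every source survive the merge, and each merged record supports questions that none of the individual sources could answer alone.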
If you’re using Hadoop to create value from your big data, why not check out our Hadoop Patterns of Use whitepaper and see how it can work for you?