‘The world is being digitized,’ proclaimed Geoffrey Moore in his keynote at Hadoop Summit 2012 over a year ago. His belief is that we are moving away from an analog society, where we collect only casual recordings of events, to a digital one, where everything is captured. It is our belief that Hadoop is one of the key technologies powering this shift to a digital society.
There is almost an expectation that we capture the pics, vids and conversations that run before us. Have you been to a third-grade play or concert lately? The glow of our mobile displays is ubiquitous as we compete to get the best shot of our kids. We capture it all. We have also grown accustomed to capturing our continuous business data such as clickstreams and transactions … but there is more.
What impact and opportunity do we find here?
Today, we not only capture these life events but we store them for a longer period of time. Wouldn’t it be great to analyze years of Black Friday clickstream data? And not only are we capturing data for longer periods of time, we are also capturing all of it. A year ago, my friend John Kreisa used the term “exhaust data” when explaining these concepts and it has stuck with me. How much data have we thrown on the floor of our data centers over the past 30 or 40 years? It is possible that a nugget of value was hidden in it. Sensor data has been exhaust for years.
All of this data is interesting, but there is a world of microprocessors out there creating data today that might prove valuable. We can walk through a “day in the life” and see sensors everywhere: in the refrigerator that stores our half & half for the morning coffee, in the car we drive to work, in the supply chain we may work on and in the phones we have in our back pockets. Nearly every process and tool around us is creating data… massive amounts of it.
Making sense of the mountain
We now think of this data in terms of petabytes and zettabytes of captured information. The challenge we face is to formulate the “right” questions to ask of that information. Every set of data is different, but we can look at what others are doing to help find our own north star. Here are some examples of how others are using sensor data today:
There are hundreds, maybe thousands, of uses of sensor data. My favorite example is the rail operator who has equipped rails with sensors that collect the sound frequencies as a train travels over a section of rail, looking for discrepancies that point to potential issues within the system. Awesome.
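To make the rail example a little more concrete, here is a minimal sketch of what "looking for discrepancies" could mean in practice. It is not the rail operator's actual method, which the example does not describe. It assumes each rail section has already been reduced to a single dominant frequency in Hz, and it flags sections that sit far from the median using a modified z-score. The function name `flag_discrepancies`, the section ids, the readings and the 3.5 threshold are all hypothetical, chosen only for illustration.

```python
from statistics import median


def flag_discrepancies(readings, threshold=3.5):
    """Return the ids of rail sections whose frequency looks like an outlier.

    readings: dict mapping a section id to a dominant frequency in Hz
    (an assumed, simplified representation of the sensor data).
    Uses the modified z-score built on the median absolute deviation (MAD).
    """
    values = list(readings.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # At least half the readings equal the median; flag any that differ.
        return sorted(s for s, v in readings.items() if v != center)
    return sorted(
        s for s, v in readings.items()
        if 0.6745 * abs(v - center) / mad > threshold
    )


# Hypothetical readings, invented for illustration only.
sample = {
    "section-01": 440.2,
    "section-02": 441.0,
    "section-03": 439.5,
    "section-04": 512.8,
    "section-05": 440.7,
}
print(flag_discrepancies(sample))  # ['section-04']
```

The median-based score is used here because a single bad section would drag a plain average toward itself and could mask the very discrepancy we are looking for. At real scale the same per-section comparison would run as a distributed job over the raw sensor history rather than over a small in-memory dictionary.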
How to get started?
With Hadoop, it is much easier to refine the data and explore it to find the meaningful patterns and exclude the trivial ones. As Hadoop extends the storage and analysis of big data to processes beyond commercial Internet use cases, it can augment other efforts, and even help save lives, through prediction, identification and prevention.
The bigger question is: what sensor data do you have, and what questions can you ask of it? The technology is ready.