I started my journey at Hortonworks a little over five years ago and have been to many Hadoop Summits in the US, Europe and now Asia. I just kicked off a two-day, sold-out show in Tokyo and shared my thoughts and stories on architectures and business transformation.
John Kreisa’s post here has more details on the event itself and customer comments on their own transformations.
To provide context: over the past five years, the rise of the hyperconnected world means the amount of data is now effectively doubling every two years. The internet of things is spawning new forms of data everywhere, well beyond just mobile phones. There are sensors on nearly everything, generating around 1.1B data points, or more than two exabytes, every day.
There has been much written about this growth of data. Analyst research shows that around 64% of enterprises are investing in Big Data in some form or another, with 31% expecting to manage a petabyte or more.
That said, the same research finds that 88% of potential streaming or real-time data is still not under management. By "under management" I mean not just captured for data's sake, but able to drive value for the enterprise.
This was one of the hot themes of the conference: how can organizations in Asia Pacific bring all of this data together in a way that helps them drive transformative business value? This question is impacting almost every industry globally, including Asia. Many companies are already realizing they must be able to harness both their data at rest and their data in motion and drive value out of it.
But how? For the most part, existing systems were created for a structured world. Data lived in mainframes or in relational databases, in rows and columns. It was very difficult to capture these new forms of data, and data tended to be siloed by application, with no technically and economically viable way to derive insights across applications.
The result was limited in its usefulness. The value a company could derive from traditional data systems was mostly rear-view-mirror insight. A week old. A month old. These systems weren't really driving the business; they were reporting on what had already happened. You couldn't get a single view of your customers, your products, your supply chains, or, in healthcare, a single view of a patient.
So the question now is: how do you change your architecture from this siloed, application-centric view to bring data across all applications into one place? And how can you do it in a way that accounts for today's top megatrends of cloud computing, the internet of things, and of course big data and analytics?
The answer lies in what we call a ‘connected data architecture’ that can embrace all three megatrends while also leveraging the data within your data centers.
Before I give a couple of examples, let me say that fundamentally, we believe that data is becoming the new currency for business. It’s now all about finding the value via a new generation of modern data apps that span the cloud and the data center, and can analyze data in motion as it is streaming in real time before you have even stored it anywhere, connecting it to your historical data at rest in data lakes and data centers so you have access to the full fidelity of all available data.
I’ll use connected car and the energy sector as just two examples.
All of the major automakers, companies like Mitsubishi at the Summit, have connected car initiatives, and it is transforming how they do business. But this is transforming associated industries too.
Insurance, for example. Progressive Insurance, an automotive insurer in the US, puts devices in cars to offer usage-based policies, optimizing premiums based on the data it collects from the vehicle. So far it has gathered over 10B miles of driving data to power that analysis and augment its existing data.
At the same time, government agencies are also rolling out sensors in connected cities, creating a connected infrastructure that automobiles can interact with.
This is an example of the kinds of applications a connected data architecture can unlock: acting on data while it is in motion, then bringing it all together via the cloud or in our data centers so we can do historical analysis too.
In the upper left here, for example, we show the sensor and control system apps in the car, along with the infotainment systems. We want to take that data and blend it with connected city and connected infrastructure data via the cloud. Then we may run real-time analytics and land the results back in the cloud for machine learning, enabling time-based route optimization and improved traffic patterns. We also want to watch for operational issues based on exception use cases as they happen, and bring those down into the data center so we have a 360-degree view of operational data from the cars and manufacturing-line data, supporting root cause analysis of the issues we detect.
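The real-time "exception use case" filtering described above can be illustrated with a minimal sketch. This is a hypothetical example, not part of any actual product: the `SensorReading` fields, thresholds, and the in-memory list standing in for a streaming feed (such as Kafka or NiFi) are all assumptions for illustration.

```python
# Hypothetical sketch: flag "exception" events in a stream of connected-car
# sensor readings, the kind of real-time filtering a pipeline might apply
# before landing data in the cloud or data center for deeper analysis.
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class SensorReading:
    vehicle_id: str
    engine_temp_c: float  # engine temperature in Celsius
    speed_kph: float


def detect_exceptions(readings: Iterable[SensorReading],
                      max_temp_c: float = 110.0,
                      max_speed_kph: float = 180.0) -> Iterator[SensorReading]:
    """Yield only the readings that breach operational thresholds."""
    for r in readings:
        if r.engine_temp_c > max_temp_c or r.speed_kph > max_speed_kph:
            yield r


# A small in-memory list standing in for a real streaming feed.
stream = [
    SensorReading("car-1", 92.0, 60.0),
    SensorReading("car-2", 118.5, 55.0),   # overheating -> exception
    SensorReading("car-3", 95.0, 190.0),   # excessive speed -> exception
]
alerts = list(detect_exceptions(stream))
print([a.vehicle_id for a in alerts])  # -> ['car-2', 'car-3']
```

In a production pipeline the generator would consume from a message broker and the flagged events would be routed both to alerting and to long-term storage for the historical, 360-degree analysis described above.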
Another example is the energy sector, which is undergoing its own transformation based on millions of smart grid sensors and smart meter technologies in homes that enable the optimization of energy delivery.
We are seeing centuries-old utility companies transforming their relationships with their customers based on these smart technologies that enable them to deliver energy more efficiently and enable their customers to use that energy and spend their money more wisely.
Here is a similar connected data architecture for the energy sector. On the top, we have the sensors, control systems and other connected devices performing time series analysis and creating data streams.
On the bottom, there are exception use cases around equipment failure and proactive management and engagement of the devices, so faulty equipment can be replaced before it fails. As in the connected car example, the impact of connecting data goes far beyond the energy companies themselves to the entire value chain around the industry.
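The proactive-failure idea above can be sketched with a simple drift check on a meter's time series. This is a hedged, illustrative example only: the window size, tolerance, and sample readings are assumptions, and a real deployment would run this kind of logic at scale over streams with tools like Spark or Storm rather than on Python lists.

```python
# Hypothetical sketch: spot a smart meter whose latest reading drifts far
# from its recent baseline, a crude stand-in for the proactive equipment
# failure detection described above.
from statistics import mean


def drifting(readings, window=5, tolerance=0.5):
    """Return True if the latest reading deviates from the trailing
    `window`-reading average by more than `tolerance` (as a fraction)."""
    if len(readings) <= window:
        return False  # not enough history to establish a baseline
    baseline = mean(readings[-window - 1:-1])
    latest = readings[-1]
    return abs(latest - baseline) > tolerance * baseline


healthy = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0]
faulty = [10.1, 9.9, 10.0, 10.2, 9.8, 2.1]   # sudden drop -> likely fault
print(drifting(healthy), drifting(faulty))   # -> False True
```

A meter flagged this way would feed the "exception use case" path in the architecture, triggering a service visit before the device fails outright.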
We see a shift to this connected data architecture happening today across many industries, based on apps like these running both within the data center and outside it in the public cloud, leveraging all the data in motion and data at rest.
Apache Hadoop plays a key role in managing, storing and analyzing this data, along with other technologies like Apache Spark, Apache NiFi, Apache Kafka, Apache Storm and more that enable you to get the value out of that data.
We live in an exciting time because of this open source innovation. The challenge for enterprises is how to apply these new technologies in a way that actually solves business needs and unlocks new revenue streams based on actionable intelligence.
The future of data as we see it is about harnessing all of this data and bringing it in under management so we can derive value out of it. Events like Hadoop Summit help you to do that.
You can find out much more about content at the Summit here.
See you next year!