We were really excited to welcome a sold-out crowd at the first Hadoop Summit in Tokyo last week. This was a fantastic response, reflecting the huge interest in a technology that is transforming industries across Asia-Pacific.
We could not put this kind of conference on without the help of our sponsors, who help us both present the event and provide content. A warm thank you to our co-host, Recruit Technologies, and to our wider sponsors.
For those of you who attended, we want to encourage you to share your experiences. Hadoop is an open community and we actively promote sharing using Facebook or Twitter. Use the hashtag for the conference: #hs16tokyo.
This was the fourth Hadoop Summit that we’ve run globally this year and the first time in Tokyo. Attendees joined a growing global community of more than 6,600 people who have attended a Hadoop Summit worldwide this year.
That’s a tremendous community coming together, sharing information and ideas, and learning how data is transforming their businesses and industries and how they can progress on their journey with Hadoop. If you were there, thank you very much for participating.
Hadoop Summit is an event for the community, by the community. Many of the sessions are given by community members who are actually working on the software, by partners with solutions that plug in and enhance it, and by real end users of the technology.
As we like to say, it’s also a conference that appeals to both the hoodies – the technology people – and the suits. What they have in common is that they want to learn how Hadoop and Connected Data Architectures are transforming their business.
For a first event, there was a huge response to our call for papers: 205 sessions submitted from 87 companies. In the end, 48 sessions across five tracks were scheduled over two days, 11 of them business-focused. So regardless of the kind of information attendees wanted, they had a lot of choices in both Japanese and English with translation.
On the main stage, Shaun Connolly, VP of Strategy here at Hortonworks kicked the event off with a talk about how entire industry ecosystems are being transformed by connected data. You can read about his talk [here] in his blog post.
Technologists from Recruit Technologies, Coca-Cola East Japan and Mitsubishi Fuso also shared their own transformation stories, along with insights for others doing the same.
Nobuyuki Ishikawa from the Big Data Product Development Group at Recruit Technologies talked about their six-year journey with Hadoop, how it has fundamentally changed their business and how they help their clients and customers. A pretty fantastic journey. He said: ‘Our big data storage division operates cross-functionally. We started in 2009 simply by examining the utility of data applications. A year later, Hadoop and HBase were chosen as the data infrastructure and ecosystem for our data. Now we also use Spark. In 2015, data usage was more than one petabyte across more than 200 use cases, and it’s still growing rapidly. Now we collect all kinds of data we can effectively make use of, and we can proudly say that Hadoop is a key technology in Recruit. We see a great future for data solutions, with a vision of creating an environment where people focus on creative work while the simpler work is done by machines.’
Damien Contreras, Enterprise Architect & Innovation Project Manager at Coca-Cola East Japan (CCEJ) commented: ‘We produce and distribute more than 50 different brands that people love, from Coke to Fanta to coffee. We’re a young company: three years ago we inherited five different information systems built on top of mainframes, and we had a lot of challenges around data and integration. My job is to harmonize and standardize the way we operate.’
‘One important operation is the replenishment of about 550,000 vending machines. That means vending machines on every corner in Japan, in a wide variety of environments – on top of Mount Fuji or in an office. We leverage external as well as internal data, and I’m proud to say that right now all the CCEJ vending machines are managed through Spark, with a program running every night to forecast exactly the consumption in each machine for the next two days. This means trucks can go out with exactly the right amount of product and replenish only when they need to. It’s definitely transforming the operation and the business.’
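The core of the nightly job Contreras describes – forecast each machine’s consumption over the next two days, then load the truck with only the shortfall – can be sketched in a few lines. This is a hypothetical, simplified illustration in plain Python (CCEJ’s actual job runs on Spark and its data model is not public; all function names and numbers here are our own assumptions):

```python
# Hypothetical sketch of a per-machine replenishment forecast.
# Real deployments would run this logic at scale in Spark over
# nightly telemetry; the names and figures below are illustrative.

def forecast_demand(daily_sales, horizon_days=2):
    """Estimate units sold over the next `horizon_days`
    from a list of recent daily sales counts."""
    if not daily_sales:
        return 0
    avg_per_day = sum(daily_sales) / len(daily_sales)
    return round(avg_per_day * horizon_days)

def restock_quantity(current_stock, daily_sales, capacity, horizon_days=2):
    """Units the truck should load so the machine
    covers the forecast horizon without overfilling."""
    needed = forecast_demand(daily_sales, horizon_days)
    shortfall = max(0, needed - current_stock)
    return min(shortfall, capacity - current_stock)

# Example: a machine holding 10 units, selling ~12 a day,
# needs 14 more units to cover the next two days.
print(restock_quantity(current_stock=10, daily_sales=[11, 13, 12], capacity=40))  # → 14
```

Running logic like this for every machine each night is what lets the trucks leave with only the forecast shortfall, rather than restocking on a fixed schedule.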
‘We also have many data sources and many silos of data, and frankly prior solutions had been left to vendors, who had built something that is very nice but doesn’t really play well with the rest of the environment. They have their own sets of data and are very siloed. Using Hadoop we are now migrating all of that data into a single instance that makes us aware of where we have data, what kind of data it is, and what it means for the business. Hadoop in Azure, Hortonworks HDP and the BI tools we have put behind it are enabling one system showing different dimensions of the data. We also started using NiFi last year for big data integration.’
‘Right now we stream around 20GB a day to Hadoop across more than 1,000 tables and containers, and we’re at about 20TB of data in total. My insight is that there is a lot of discussion around predictive and prescriptive analytics; the problem is those technologies can’t tell you everything, as some vendors claim, so you have to manage expectations.’
Erik Spitzer is Manager IT Process Design and Innovations / IoT and Cloud for Mitsubishi Fuso Truck and Bus Corporation. This is part of Daimler Trucks Asia and Daimler Trucks, the biggest commercial vehicle producer in the world.
He commented: ‘I am responsible for building a common data platform. Our use case was initially simple: our trucks had unnecessary downtime. Commercial vehicle owners are utilizing their trucks to make money, so every time they can’t use the truck, they are losing money.’
‘We started by collecting live telematics data from our trucks, along with all kinds of historical data, such as engineering data from our assembly line, warranty data, sales data and so on. We brought all of this into the same data lake, the same source of truth, and were able to go from reactive to proactive. Now we want to actually predict a fault before it happens and try to avoid it.’
‘I see three topics we can solve with this. First, data-driven engineering: we do a lot of testing for reliability, previously based just on test driving. Now our engineers get all the information they need from all trucks on the road and can use it directly to improve our products. Second, what we call indirect feedback on how our customers are actually using our trucks. Not every truck driver is the same and, for example, a truck driving in Hokkaido or Okinawa faces completely different weather, which impacts maintenance. The third is digital services. We are building an interactive model to give our customers new value-based services to grow their business and increase the utilization rate of their trucks.’
‘What Hadoop allows us to do is find the single source of truth, so that decision makers all have the information available when they need to make their decisions.’
‘My insight is that you have to have buy-in from the stakeholders. I have encountered people who have, say, been doing their job for 20 years and feel you are saying that software can do their job better than they can. You really have to say: it’s not replacing what you are doing, but supporting you.’
‘It’s also good to start very small. First, just go to the business and hear their pain. Take one of the “quick ones” and show them how data could support them. As soon as you build trust, they will come with other problems they want to solve.’
You can find out much more about content at the Summit at this link. See you next year!