Yahoo! Japan is the largest Internet portal site in Japan. The company was an early adopter of Apache™ Hadoop®, deploying it in 2008 to meet its unique data storage and analysis challenges, including capturing detailed user activity history.
Data proliferated rapidly in Yahoo! Japan’s initial Hadoop clusters, streaming in from sources such as access logs, search keywords, product information, purchase histories, and auction bidding information. By the time the combined clusters had grown to more than 3,000 nodes, they had become too valuable to the business for the team to support with internal Hadoop expertise alone.
To stabilize this mission-critical data analysis resource, the company adopted Hortonworks Data Platform (HDP), which is supported by the original architects and ongoing innovators of Hadoop technology.
HDP has enabled Yahoo! Japan to keep pace with the proliferation of data while also improving performance. Additionally, HDP has kept operations stable throughout this influx, despite the extreme scale of data under management.
Today, Yahoo! Japan stores, analyzes, and gains value from over 75 PB of data.
By analyzing such a variety of data at this scale, Yahoo! Japan has developed the Yahoo! Data Management Platform and offers it as a service to its customers, who use it to precisely target their own customers.
Yahoo! Japan is now exploring real-time processing, deep learning, and other capabilities that the open source community has added to the Hadoop ecosystem since the company began its journey years ago.
The company is nine years into its adoption of Hadoop, but its Big Data journey is really just getting started.
The Yahoo! Japan case study was recently translated into English; you can read it here.
To learn more about Yahoo! Japan, check out their website.