With the release last week of Hortonworks Data Platform (HDP) 1.3 for Windows, the Big Data ecosystem takes a large step toward broad adoption in enterprise environments. As a systems integrator at West Monroe Partners, I work daily with medium and large enterprises as they address their technology challenges. Hadoop is increasingly becoming a part of that environment, but not always for the reasons one might guess.
While Hadoop garners much attention in the Internet startup community as a powerful and scalable analysis platform, it is actually the ultra-low-cost storage aspect of Hadoop that makes the platform appealing to the enterprise market. Most mid-size to large organizations will say they don’t have “Big Data,” and in the sense that Google, Facebook, or Yahoo does, they are correct; what they do have is an enormous amount of small data. These small data sets – web logs, chat logs, call center logs, and CRM records – are often scattered in small pockets throughout the organization. This is particularly true in the traditional enterprise market, which has long been dominated by Microsoft at the departmental level. This fragmentation causes several problems that Hadoop is helping to solve, which is why I spoke about this topic at the Hadoop Summit in Amsterdam in March.
First, although many of these data sources are individually small, in aggregate they end up being rather large. Most corporate storage these days lands on Storage Area Networks (SANs) or Network Attached Storage (NAS), both rather expensive options (SANs in particular) with inherent scalability limits. Hadoop as a pure storage platform delivers tremendous value just from reducing storage costs – often by an order of magnitude – and this alone can justify the investment. The ROI on this type of Hadoop deployment tends to arrive quickly: at the next increment in storage, an organization can deploy Hadoop instead of buying more SAN capacity. Ultimately this is a storage consolidation initiative as well, bringing disparate file shares and storage locations together on a single platform.
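The order-of-magnitude storage argument above can be sketched as a back-of-the-envelope calculation. All per-terabyte figures below are hypothetical placeholders, not vendor quotes, and HDFS’s default three-way replication is counted against Hadoop so the comparison is not unfairly favorable:

```python
# Back-of-the-envelope storage cost comparison. All prices are
# hypothetical placeholders -- substitute your own quotes.
SAN_COST_PER_TB = 5000.0     # assumed fully loaded SAN cost, $/TB
HADOOP_COST_PER_TB = 500.0   # assumed commodity-server storage cost, $/TB
REPLICATION = 3              # HDFS default replication factor

def raw_cost(usable_tb, cost_per_tb, copies=1):
    """Cost of storing `usable_tb` of data kept in `copies` physical copies."""
    return usable_tb * copies * cost_per_tb

if __name__ == "__main__":
    san = raw_cost(100, SAN_COST_PER_TB)                     # 100 TB on SAN
    hdfs = raw_cost(100, HADOOP_COST_PER_TB, REPLICATION)    # 100 TB on HDFS
    print("SAN: $%.0f  Hadoop: $%.0f  ratio: %.1fx" % (san, hdfs, san / hdfs))
```

Even after paying for three physical copies of every block, the commodity-hardware figure comes out well below the SAN figure under these assumptions; with real quotes the gap is often larger still.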
The second issue is that the data in these documents is stored in a wide variety of formats and is diverse in its content: CSV files, Excel workbooks, assorted log formats, and even Word, PowerPoint, and freeform text documents. There is often tremendous value hidden in these disparate data sources, but there is generally neither the budget nor the will to build a traditional data warehouse for analyzing this data at an organization-wide level. This is actually a more subtle issue, and I often see projects addressing this sort of analysis grow out of an existing storage consolidation initiative. The Hadoop platform allows a relatively small amount of effort to yield a dynamic data warehousing and analysis capability. This level of analysis, which can provide deeper customer insights than previously imagined, is really the cutting edge of the Big Data revolution in the enterprise market.
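As a concrete sketch of how such heterogeneous sources can feed one analysis pipeline, the Hadoop Streaming-style mapper below normalizes two assumed log layouts (a comma-separated web log and a tab-separated chat log) into common key-value records. The field positions are illustrative assumptions, not a real schema; Hadoop Streaming simply pipes raw text through stdin and stdout, so the same script can be tested locally on a sample file before running across the cluster.

```python
# Minimal Hadoop Streaming mapper sketch. Input layouts are assumptions
# for illustration: web logs are CSV with the customer id in column 2,
# chat logs are tab-separated with the customer id in column 0.
import sys

def normalize(line):
    """Map one raw record to a (customer_id, source) pair, or None to skip."""
    line = line.strip()
    if not line:
        return None
    if "\t" in line:                 # tab-separated chat-log record (assumed)
        fields = line.split("\t")
        return fields[0], "chat"
    fields = line.split(",")         # comma-separated web-log record (assumed)
    if len(fields) > 2:
        return fields[2], "web"
    return None                      # unrecognized line: drop it

def run_mapper(stream, out=sys.stdout):
    """Emit tab-separated key-value lines, the Streaming mapper contract."""
    for line in stream:
        pair = normalize(line)
        if pair:
            out.write("%s\t%s\n" % pair)

if __name__ == "__main__":
    run_mapper(sys.stdin)
```

Submitted through the Hadoop Streaming jar with a small reducer to aggregate per customer, a sketch like this becomes the first step of the lightweight, organization-wide analysis described above.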
Until the release of HDP on Windows there was no way for the Microsoft-centric enterprise to leverage Hadoop for these initiatives, but that has all changed thanks to Hortonworks and their strong partnership with Microsoft. The performance improvements in HDP 1.3 in particular make analysis of stored data faster and easier than ever. Now any enterprise running Windows or Linux can leverage the latest features of the Hadoop platform to realize operational cost savings that help the bottom line and insightful analytics that can add to the top line.
To learn more, visit West Monroe Partners.