Is a Lake Big Enough to House Your Ocean of Data?
Contrary to popular belief, Hadoop was not the elephant-in-the-china-shop that disrupted the data center. The real culprit is data itself and its explosive growth in volume. The past two or three years have seen a rise in the number of successful enterprise Hadoop projects aimed at tackling this explosion of big data. These large volumes of data, the emergence of Hadoop, and the need to store siloed data in one place have given rise to a phenomenon among enterprises: the Data Lake.
Is the Data Lake an effective catchment for all of the enterprise data?
Yes and no. Data lakes are well suited to housing current, interrelated data, but they do not address the need for an enterprise-wide data management system.
Big Data Virtualization to harness the power of the Data Lake
VHA is the largest member-owned healthcare company in the US, delivering industry-leading supply chain management and clinical improvement services to its members. The company had its product, supplier, member, and other data spread across multiple sources, residing in silos.
The value of consolidating this disparate data into a data lake was not lost on VHA. The company adopted the Hortonworks Data Platform to enable business users to discover related data and provide services to its members. Because of its previous success with data virtualization using the Denodo Platform, VHA decided to use data virtualization to let business users discover data with familiar SQL, abstracting away direct access to Hadoop. With the Denodo Platform, users can combine the several types of data that float in a data lake and offer them all as one integrated data set to the consuming application.
Related big data technologies such as Pig and MapReduce let users process data on a Hadoop cluster, and Impala exposes it through SQL, but they involve a steep learning curve and extensive training. The Denodo Platform offers a data abstraction layer over Hadoop, NoSQL, and traditional enterprise repositories, and allows the creation of virtual, canonical business views of data to address a broad spectrum of use cases, including big data analytics and agile BI solutions.
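To make the idea concrete, here is a minimal sketch of the virtual-view pattern described above: consumers query one integrated view while the underlying sources stay where they are. This is a toy illustration in Python, not the Denodo Platform's actual API; the source names and fields (suppliers in the lake, contracts in a relational store) are hypothetical.

```python
# Toy data-virtualization sketch. Each fetch_* function stands in for a
# connector to a real source; the hypothetical field names are for
# illustration only.

def fetch_supplier_records():
    """Stands in for a query against files in the Hadoop data lake."""
    return [
        {"supplier_id": 1, "name": "Acme Medical", "region": "Southwest"},
        {"supplier_id": 2, "name": "BetaCare", "region": "Northeast"},
    ]

def fetch_contract_records():
    """Stands in for a query against a traditional relational database."""
    return [
        {"supplier_id": 1, "contract_value": 250_000},
        {"supplier_id": 2, "contract_value": 120_000},
    ]

def virtual_supplier_view():
    """Join both sources into one integrated dataset, the way a
    virtual, canonical business view presents them to consumers."""
    contracts = {r["supplier_id"]: r for r in fetch_contract_records()}
    return [
        {**s, "contract_value": contracts[s["supplier_id"]]["contract_value"]}
        for s in fetch_supplier_records()
        if s["supplier_id"] in contracts
    ]

for row in virtual_supplier_view():
    print(row)
```

The consuming application only ever calls `virtual_supplier_view()`; it never needs to know that one source lives in Hadoop and the other in a relational database, which is the abstraction a data virtualization layer provides at enterprise scale.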
Are you still learning about the Data Lake? Wondering how it can help your organization manage and leverage massive amounts of data? Learn from VHA experts who detail their use of Hadoop and Data Virtualization in this webinar titled “Hadoop and Data Virtualization – A Case Study by VHA”. Watch it below: