Are you still learning about the Data Lake? Wondering how it can help your organization manage and leverage massive amounts of data? On September 8th, VHA, the largest member-owned health care company delivering supply chain management services and clinical services to its members, will share their experience and explain how they simplified data management and enabled faster data discovery with Hadoop and data virtualization.
At VHA, product, supplier, and member information, among other data, was siloed across multiple sources. VHA saw the value in consolidating this disparate data into a data lake, built on the Hortonworks Data Platform, to let business users discover related data and provide services to their members. Building on its earlier success with data virtualization on the Denodo Platform, VHA chose data virtualization to let business users explore the data through familiar SQL, abstracting away direct access to Hadoop.
The Denodo Platform offers a data abstraction layer over Hadoop, NoSQL and traditional enterprise repositories, and allows the creation of virtual, canonical business views of data to address a broad spectrum of use cases, including big data analytics and agile BI solutions.
Technologies such as Apache Pig, MapReduce, and Impala let the enterprise query a Hadoop cluster, but only engines like Impala expose a SQL interface, and the others involve a steep learning curve and extensive training. Data virtualization makes it possible for companies to access any type of data from any system, in any format, integrate it in real or near real time, and deliver it in the format consumers need. With the Denodo Platform, they can combine the different kinds of data residing in a data lake and offer them as one integrated data set to the consuming application.
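To make the idea of a virtual, integrated view concrete, here is a minimal sketch. It uses Python's built-in SQLite purely as a stand-in for two siloed sources (in VHA's case these might be a Hive table in the data lake and a relational product catalog); the table and column names are hypothetical, and a real deployment would reach those sources through the Denodo Platform's connectors rather than a single local database.

```python
import sqlite3

# Two stand-in "silos": hypothetical order data from the lake and a
# hypothetical product catalog from an operational system.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE lake_orders (product_id INTEGER, member TEXT, qty INTEGER)")
con.execute("CREATE TABLE erp_products (product_id INTEGER, name TEXT)")
con.executemany("INSERT INTO lake_orders VALUES (?, ?, ?)",
                [(1, "Member A", 10), (2, "Member B", 5)])
con.executemany("INSERT INTO erp_products VALUES (?, ?)",
                [(1, "Gloves"), (2, "Syringes")])

# A virtual, canonical business view: consumers query one integrated
# data set with plain SQL, without knowing which underlying system
# each column comes from.
con.execute("""
    CREATE VIEW member_orders AS
    SELECT o.member, p.name, o.qty
    FROM lake_orders o JOIN erp_products p USING (product_id)
""")

rows = con.execute(
    "SELECT member, name, qty FROM member_orders ORDER BY member").fetchall()
print(rows)
# → [('Member A', 'Gloves', 10), ('Member B', 'Syringes', 5)]
```

The consuming application only ever sees `member_orders`; the view is the abstraction layer, which is the role data virtualization plays over Hadoop, NoSQL, and relational stores at far larger scale.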
The data lake enables companies to store large volumes of data from multiple locations in a cost-effective Hadoop cluster, taking advantage of SQL-on-Hadoop technologies such as Apache Hive, Impala, or HAWQ. In a data lake, data does not undergo the extensive transformation or modeling it would in a data warehouse. Instead, it is usually replicated in the format of the original sources. Many companies are expanding their infrastructures to use the data lake as a way to combine internal and external data sources and derive new insights.
VHA will detail its use of Hadoop and data virtualization in the live webinar on September 8 at 10am PT, titled “Hadoop and Data Virtualization – A Case Study by VHA”. You can register for it here.