August 28, 2015

Big Data Virtualization to Harness the Power of the Data Lake

Are you still learning about the Data Lake? Wondering how it can help your organization manage and leverage massive amounts of data? On September 8th, VHA, the largest member-owned health care company delivering supply chain management services and clinical services to its members, will share their experience and explain how they simplified data management and enabled faster data discovery with Hadoop and data virtualization.

Register Now

At VHA, product, supplier and member information, among other data, was siloed across multiple sources. VHA saw value in consolidating this disparate data into a data lake built on the Hortonworks Data Platform, enabling business users to discover related data and provide services to members. Because of its previous success with data virtualization on the Denodo Platform, VHA decided to rely on data virtualization so that business users could discover data using familiar SQL, abstracting away direct access to Hadoop.

The Denodo Platform offers a data abstraction layer over Hadoop, NoSQL and traditional enterprise repositories, and allows the creation of virtual, canonical business views of data to address a broad spectrum of use cases, including big data analytics and agile BI solutions.

Newer technologies such as Apache Pig, MapReduce and Impala allow the enterprise to query and process data in a Hadoop cluster, but they involve a steep learning curve and extensive training. Data virtualization makes it possible for companies to access any type of data from any system, in any format, integrate it in real or near-real time, and deliver it in the format they need. With the Denodo Platform, they can combine several types of data floating in a data lake and offer them all as one integrated data set to the consuming application.
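The Denodo Platform itself is proprietary, but the core idea of a canonical business view stitched over separate silos can be sketched with nothing more than SQL views. The toy below uses Python's built-in sqlite3 as a stand-in for the virtualization layer; the table and column names are invented for illustration and are not VHA's or Denodo's.

```python
import sqlite3

# Two in-memory tables stand in for separate source silos; a SQL view
# plays the role of the canonical business view that consumers query,
# without knowing where the underlying data lives.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "Silo" 1: product catalog (hypothetical schema)
    CREATE TABLE products (product_id INTEGER, name TEXT);
    INSERT INTO products VALUES (1, 'Sutures'), (2, 'Gloves');

    -- "Silo" 2: supplier records (hypothetical schema)
    CREATE TABLE suppliers (product_id INTEGER, supplier TEXT);
    INSERT INTO suppliers VALUES (1, 'Acme Medical'), (2, 'MedSupply Co');

    -- Canonical view: one integrated data set exposed to applications
    CREATE VIEW product_catalog AS
    SELECT p.product_id, p.name, s.supplier
    FROM products p JOIN suppliers s USING (product_id);
""")

rows = conn.execute(
    "SELECT * FROM product_catalog ORDER BY product_id").fetchall()
print(rows)  # [(1, 'Sutures', 'Acme Medical'), (2, 'Gloves', 'MedSupply Co')]
```

In a real deployment the sources would be Hive tables, NoSQL stores and relational systems rather than two tables in one database, but the consumer's experience is the same: one familiar SQL interface over many systems.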

The data lake enables companies to store large volumes of data from multiple locations in a cost-effective Hadoop cluster, taking advantage of relational technologies such as Apache Hive, Impala or Hawk. In a data lake, data does not undergo heavy transformation or modeling, as it would in a data warehouse; instead, it is usually replicated preserving the format of the original sources. Many companies are expanding their infrastructures to use the data lake as a way to combine internal and external data sources and derive new insights.
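This "store raw, structure later" pattern is often called schema on read. A minimal sketch, with an in-memory dict standing in for HDFS and invented file paths and field names: the extract lands byte-for-byte as it arrived, and a schema is imposed only at query time.

```python
import csv
import io

# Hypothetical source extract, kept exactly as delivered (no transformation
# or modeling on the way in, unlike a warehouse load).
raw_extract = "member_id,region,spend\nM001,West,1200\nM002,East,950\n"

# "Landing" the data: the lake stores the original bytes unchanged.
# A dict stands in for HDFS here; the path is invented for the example.
lake = {"/lake/raw/members/2015-08-28.csv": raw_extract}

def read_members(path):
    """Schema on read: field names and types are applied at query time."""
    reader = csv.DictReader(io.StringIO(lake[path]))
    return [{"member_id": r["member_id"],
             "region": r["region"],
             "spend": int(r["spend"])}  # typing happens here, not at load
            for r in reader]

rows = read_members("/lake/raw/members/2015-08-28.csv")
print(rows[0])  # {'member_id': 'M001', 'region': 'West', 'spend': 1200}
```

Tools like Hive do the same thing at scale: an external table definition maps a schema onto raw files already sitting in HDFS, leaving the files themselves untouched.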

VHA will detail their use of Hadoop and Data Virtualization in the live webinar on September 8 at 10am PT titled “Hadoop and Data Virtualization – A Case Study by VHA”. You can register for it here.



Christian Tzolov says:

I guess in the text you’ve meant “HAWQ” instead of “Hawk”?

Ravi Shankar says:

Interesting use case – looking forward to the webinar. We at Denodo are beginning to see many use cases combining data virtualization with Hadoop. As data volumes and velocities explode, companies are looking to use data virtualization either to provide a simpler, familiar interface that increases business user adoption, as VHA did, or to combine the data in Hadoop with the rest of the enterprise data sitting in back-office or cloud systems, powering new analysis and business processes that were not possible before.
