Accelerate Your Data Lake Strategy with the Cutting Edge of Informatica on HDP 2.1

Informatica and HDP 2.1 facilitate the predictive power of your data lake

Informatica is a Hortonworks Certified Technology Partner. This partnership makes it possible for organizations to use all the data internal and external to an enterprise to achieve the full predictive power that drives the success of modern data-driven businesses. 

That is why we’re excited to have John Haddad, Senior Director at Informatica, as our guest blogger. In this post, John explores the benefits of certification on HDP 2.1.

When I was in high school, one of my best friends had a water ski boat we often took out on California lakes (what are friends for?). The key to acceleration and powerful, flowing turns is how well you transition from the cutting edge to the turning edge. Similarly, you can accelerate your Data Lake strategy with the cutting-edge technology of Informatica on HDP 2.1 to turn Big Data into awesomeness.

Last September, Informatica and Hortonworks delivered a webinar on the Modern Data Architecture for a Data Lake with Informatica and Hortonworks Data Platform. Since then, we’ve seen tremendous demand for building out a Hadoop-based data lake as a single place to manage the supply and demand of data. The data lake is the cornerstone of a big data strategy: it provides cost-effective and efficient storage, access, processing, and provisioning of data for predictive analytics and information-based products and services.


Our vision for the data lake adheres to the following principles:

  1. Data is useful to everyone and as such is managed as a collaborative, social community
  2. Data of all types is ‘on-boarded’ by anyone intuitively in minutes
  3. Data is immediately available to the broadest community while adhering to security and privacy regulations
  4. Data is easy to find, retrieve, and share – metadata is built-in
  5. Data curation, refinement, and usage are simple, transparent, and documented
  6. Data is managed as an accountable asset
  7. Data quality is fit for use
  8. Data management scales out with minimal human administration

With Informatica 9.6 running on HDP 2.1, together we have made significant progress in delivering on this vision. Today, customers can use Informatica to ingest all types of data, in batch and in real time, into HDP 2.1 to build out a data lake. Once the data is in the lake, Informatica developers can profile, parse, integrate, cleanse, and refine it using a visual development environment that makes them up to 5x more productive on Hadoop. Informatica also provides self-service capabilities that let data analysts find and prepare data for analysis. One healthcare customer has built many data marts over the years and is now moving to a data lake model to give its data science team a single location to manage the supply and demand of data. Instead of spending 80% of their time accessing and preparing data, the team can spend more of it on analysis.

Discover More

We invite you to learn how to get started with Informatica on HDP 2.1 by visiting our booth #P1 at the Hadoop Summit San Jose and checking out Informatica demos at the Hortonworks Hadoop Cafe in the Pavilion area, June 3 at 1:40 pm. I also encourage you to attend the Informatica-sponsored session by Kunal Jain, Big Data Solutions Architect at Informatica, “Dealing with changed data on Hadoop – an old data warehouse problem in a new world,” at 11:00 am on Thursday, June 5.

