Waterline Data is a Hortonworks Technology Partner and recently earned HDP Certification and YARN Ready certification for its solution, which automates the inventory of data assets in the data lake, enables data governance, and provides self-service for data engineers and data scientists to find and understand their data. Learn more by joining the upcoming webinar on May 6 or by downloading the Sandbox tutorial or joint whitepaper. Our guest blogger is Oliver Claude, CMO at Waterline Data.
Apache Hadoop promises to unlock new business value for enterprises. Hadoop provides a powerful platform for data science and analytics, where data engineers and data scientists can leverage myriad data from external and internal data sources to uncover new insight. Data stored in Hadoop is available via a centralized architecture allowing access from any application and for any user. This type of deployment is often called a data lake.
Such power also presents new challenges, particularly as data lakes grow. On the one hand, the business wants more and more self-service; on the other, IT is trying to keep up with the demand for data while maintaining architecture and data governance standards. In other words, there is a need to combine self-service with automation and governance.
A metaphor that illustrates such a solution is Amazon.com.
Amazon.com is supported by a complete and automated inventory and catalog of all the products. Amazon.com also makes it very easy for users to find, understand, and get the products they want. Lastly, there is end-to-end governance to ensure accurate product information and secure transactions.
At Waterline Data, Amazon.com inspired us, and we built a product that is like Amazon.com for data in Hadoop and the Hortonworks Data Platform (HDP).
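Waterline Data provides a unique combination of automation and machine learning in order to automate the inventory of data assets in the data lake, enable data governance, and give data engineers and data scientists self-service to find and understand their data.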
Waterline Data also invested in optimizing the product with the Hortonworks Data Platform, and is an HDP Certified Technology Partner. As a result, Waterline Data running on HDP helps turn the data lake into a business-ready data lake, and prevents a data swamp from forming.
You can get hands-on with Waterline Data Inventory on a Hortonworks cluster by downloading the Waterline on Hortonworks Sandbox and tutorial, which show how to find, understand, and govern data in Hadoop.
Waterline Data and Hortonworks will host a webinar on May 6 at 10 am PT, “Implementing a Data Lake with Enterprise Grade Data Governance.” Register here.