December 20, 2017

Using a Big Data Lake to Change Business for the Better

Consulting firm McKinsey suggests that a key component of digital success is identifying and protecting an organization’s digital crown jewels: the data that’s essential to business operations, customer service, and market competitiveness.

Your business might be aware of the importance of the information you collect, but if big data really is your organization’s crown jewel, you’ll want to keep that information safe and organized. A data lake could be the key to success because it gives you a centralized data management infrastructure that lets you consistently store, manage, classify, and analyze your data.

The Evolution of Raw Data Storage

In the simplest terms, a data lake is a storage repository that holds a huge amount of raw data in its native format until a business use is identified. Gartner Vice President Andrew White refers to it as a staging area for unmanaged data. However, what separates modern data lakes from the simplest form is the move from operational management to proactive business decision-making.
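To make "raw data in its native format" concrete, here is a minimal sketch of the idea in Python. It is an illustration only, not how any particular data lake product works: the in-memory `lake` dictionary and the `ingest` and `read_as_records` helpers are hypothetical names invented for this example. Payloads are stored untouched, and structure is applied only when someone reads them for a specific business use.

```python
import csv
import io
import json

# Hypothetical in-memory stand-in for a data lake: raw payloads are stored
# unchanged, keyed by name, with only a format tag recorded at write time.
lake = {}


def ingest(name, raw_bytes, fmt):
    """Store the payload exactly as received; no parsing or validation."""
    lake[name] = {"format": fmt, "raw": raw_bytes}


def read_as_records(name):
    """Apply structure only when a consumer asks for it."""
    entry = lake[name]
    text = entry["raw"].decode("utf-8")
    if entry["format"] == "json-lines":
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if entry["format"] == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    raise ValueError(f"no reader registered for format {entry['format']!r}")


# Two sources land in the lake in whatever format they arrived in.
ingest(
    "clicks",
    b'{"user": "a", "page": "/home"}\n{"user": "b", "page": "/pricing"}\n',
    "json-lines",
)
ingest("orders", b"order_id,amount\n1001,25.50\n1002,14.00\n", "csv")

# A business use is identified later, and structure is applied at that point.
clicks = read_as_records("clicks")
orders = read_as_records("orders")
total = sum(float(row["amount"]) for row in orders)
```

The design choice worth noticing is that `ingest` never inspects the payload, so new sources can be added without agreeing on a schema first; the cost of interpreting the data is deferred to whoever reads it.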

IT leaders must recognize this shift as data lakes continue to evolve alongside changing technology demands. The first version is best viewed as a single-use system for batch applications, while the second provides a multi-use platform for interactive, online, and streaming components.

The Modern Repository

Data Lake 3.0, the most modern form of the technology, gives pioneering IT leaders the chance to take things further by allowing businesses to deploy prepackaged applications with minor customizations. In this third iteration, the focus has shifted from platform management to the modern IT leadership challenge of solving business problems with technological solutions.

Nowadays, there’s an increasing range of modern applications that exploit big data. These applications require a large amount of processing power and are supported by some key infrastructure components, such as microservices and containers. Data lakes—particularly the third, most contemporary version—sit at the interface of this modern data application setup, providing the base for data-led transformation.

What Are the Business Benefits?

With this type of global data storage repository, your organization can do more than simply store volumes of information in its native format. By embracing Data Lake 3.0, your employees can create insight that helps change the business for the better.

For a start, keeping data organized and safe is vital to any business. With a consolidated platform, your organization can react quickly when new business challenges are identified, drawing on data to reduce the time to insight creation and service deployment.

By deploying prepackaged applications with minor customizations, some businesses have been able to use Data Lake 3.0 to reduce the time to insight and deployment from days to minutes. Significant reductions in the total cost of ownership (TCO) are another potential plus.

Storing data sets in a centrally managed Apache Hadoop–based data lake infrastructure allows you to cut the number of information silos that often reside around a business. A data plane that consolidates data management and access, in short, helps promote the effective use of information storage.

Using Your Repository the Right Way

The developments in Data Lake 3.0 highlight how information management has transformed into a proactive, real-time practice. Business insight can now be derived wherever data resides throughout its entire life cycle. However, if you’re not careful, these advances can create new complexity.

Organizations must focus on three key areas as they look to use the consolidated information in Data Lake 3.0. First, don’t introduce unwanted barriers—exploit the adaptability of open source technologies wherever possible. Second, create a data management platform that allows your business to deliver services on top of a series of shared capabilities. Finally, keep flexibility front and center; don’t change the way you architect data.

Choosing the Right Technology

Look for a next-generation approach that allows your business to manage, govern, and secure data and workloads across multiple sources, including the data lake. Advanced data management technology can give enterprises the opportunity to create quick value from data in an intuitive manner.

Your chosen data plane technology should include a catalog of available services, easy-to-use security controls, and tools to help integrate sources. If the data plane is delivered as a service, its provider should have strong partner organizations to help implement useful extensions as they become available. Combined with Data Lake 3.0, this capability lets your business deploy the future of analytics.

Gartner Analyst Doug Laney encourages IT leaders to treat business information like an asset. A smart combination of data lake technology and data plane services will give your organization the opportunity to exploit insight in new and potentially game-changing ways.


