Hadoop’s Part in a Modern Data Architecture
The shift to a data-oriented business is underway. The inherent value in established and emerging big data sets is becoming clear. Enterprises are building big data strategies to take advantage of these new opportunities, and Hadoop is the platform to realize those strategies.
Hadoop is enabling a modern data architecture in which it plays a central role: built to tackle big data sets efficiently while integrating with existing data systems. As champions of Hadoop, our aim is to ensure the success of every Hadoop implementation and to improve our own understanding of how and why enterprises tackle big data initiatives. It never ceases to amaze us when we hear of a new business case for Hadoop – the possibilities are endless, and as with all technology change, it’s an exciting time.
What does an enterprise want from Hadoop?
To aid this transformation to a modern data architecture, our customers have outlined three key requirements of Hadoop that we continue to focus on every day:
1) Integrate with the existing ecosystem
Hadoop must interoperate with the tools already present in the enterprise. Nobody wants – or needs – to rip and replace an entire ecosystem of trusted applications and tools from existing partners. More often than not, Hadoop is used to augment a data warehouse or to supply data to an analyst using an existing visualization tool. Further, existing data integration tools, storage and file systems are extremely valuable and dependable. For Hadoop to enjoy widespread adoption within an enterprise, it needs to work with these systems.
2) Use Existing Skills
The most valuable asset in most businesses is their people, and Hadoop affects a wide range of roles: the data worker, the developer and the operations specialist. To adopt Hadoop quickly and efficiently, each of these roles has certain needs:
- Developers should be able to build applications using familiar languages and frameworks such as Java, .NET, SQL, Spring and many others.
- Data workers should be able to translate their existing SQL and analytics skill sets and working environments into Hadoop.
- Operations specialists should be able to use a ‘single pane of glass’ to manage an integrated, cross-system architecture.
Hadoop needs to allow teams and businesses to use these skill sets while also leveraging their current investment in surrounding tools.
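To make the developer point above concrete, the programming model at Hadoop’s core – MapReduce – is itself simple and familiar. The sketch below illustrates that model in plain, self-contained Python; it uses no Hadoop APIs, and the function names and sample records are purely illustrative:

```python
from collections import defaultdict

# Illustrative sketch of the map/reduce model Hadoop exposes to developers.
# No Hadoop APIs are used; function names and sample data are hypothetical.

def map_phase(records):
    # Map: emit a (word, 1) pair for every word in every input record.
    for record in records:
        for word in record.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    # Shuffle + reduce: group pairs by key and sum the counts per word.
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

records = ["Hadoop stores data", "Hadoop processes data"]
word_counts = reduce_phase(map_phase(records))
```

In a real cluster, Hadoop distributes the map and reduce phases across many machines, but the logic a Java (or .NET, or Spring) developer writes follows this same two-step shape.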
3) Meet the Essential Requirements of the Enterprise
As part of a modern data architecture, Hadoop needs to be a good citizen, trusted at the heart of the business. This means it must provide all the platform services and features expected of an enterprise data platform, ranging from security, reliability and availability, through data portability and lifecycle management, to manageability, interoperability and usability.
How are we focused on delivering these requirements?
To begin with, we partner with some of the biggest names in the industry – Teradata, Microsoft, Rackspace and many others – to help ensure Hadoop is easy to consume and use as part of broader big data solution architectures. Moreover, our commitment to a community-driven development model enables our partners and customers to participate directly and transparently with Hortonworks in the process of enabling and enhancing Hadoop within specific solution architectures.
With that process in place, together we’re able to deliver the capabilities needed for Hadoop integration in the enterprise, along with the tools and features that developers, data workers and operations specialists need to quickly and efficiently adopt and realize value from Hadoop.
We’re focused on making Apache Hadoop the foundation for a modern data architecture that integrates and interoperates with key technologies and tools within the data center.
Find out more about Hadoop as part of a modern data architecture, and join us for a series of webinars to discuss the realities of integrating Hadoop with existing tools and solutions.
Get Started using Hadoop to Analyze Data. This guide includes tutorials, videos and advice on integrating Hadoop with popular analytics packages.
The Stinger Initiative is a broad, community-based effort to drive the future of Apache Hive, delivering 100x performance improvements at petabyte scale with familiar SQL semantics.
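Those familiar SQL semantics are the point: the aggregate queries a data worker already writes carry over to Hive largely unchanged. The example below runs such a query against an in-memory SQLite database purely for illustration – the table name and sample rows are hypothetical, and the same SELECT/GROUP BY shape is what a Hive user would write against data in Hadoop:

```python
import sqlite3

# Stand-in for a Hive table, built in SQLite purely for illustration.
# The table name and rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (page TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?)",
    [("home", 120), ("docs", 45), ("home", 30)],
)

# An aggregate query of the kind a data worker would also write in Hive.
rows = conn.execute(
    "SELECT page, SUM(views) FROM page_views GROUP BY page ORDER BY page"
).fetchall()
```

The difference is scale, not syntax: Hive compiles such a query into jobs that run across the cluster rather than against a local database.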