There’s an old proverb you’ve likely heard about blind men trying to identify an elephant. Depending on the version you’ve heard, the elephant is misidentified variously as rope, walls, pillars, baskets, brushes and more. Oddly, no one identified it as a next-generation enterprise data platform, but I guess it is an old proverb.
The Hadoop elephant is a platform, though, and so the proverb holds true. Depending on your perspective, it has different capabilities, components and integration points to meet your requirements. To that end, we’ve reorganized some of our technical content around three groups of activities, each of which has its own specific needs. Of course, you’ll recognize these needs and roles from your existing teams:
DEVELOP. Developers can use Hadoop in multiple ways. It may be that you are building alongside Hadoop to collect data, moving it from its point of creation to the cluster for analysis. Or perhaps you’re processing and refining that data for improved analysis. Or maybe you’re considering building next-generation apps atop Hadoop, or taking advantage of the insights those teams derive.
ANALYZE. Data in Hadoop can be vast and varied, so data analysts may be exploring it as data scientists using analytics techniques, or operationalizing queries over known data for repeated use. Either way, they then deliver those insights in ways that a business or end user can act upon.
OPERATE. Crucially, Hadoop operates within a modern data architecture, so administrators need to be able to provision, manage and monitor clusters that integrate and interoperate with the existing components of that architecture.
Sound familiar? Of course it does. Getting started with Hadoop is about taking your existing skills and tools and applying them to data at a new scale. Enjoy.