APACHE PROJECTS

Open Source Big Data: An Ecosystem of Projects

Numerous Apache Software Foundation projects make up the services an enterprise needs to deploy, integrate, and work with massive amounts of structured and unstructured data. Each project has been developed to deliver an explicit function, and each has its own community of developers and its own release cycle.

HDP PROJECTS

Apache Hadoop® is an open source framework for distributed storage and processing of large sets of data on commodity hardware. Hadoop enables businesses to quickly gain insight from massive amounts of structured and unstructured data.

GOVERNANCE INTEGRATION

Data Lifecycle & Governance

Data workflow

OPERATIONS

Provisioning, Managing, & Monitoring

Scheduling

SECURITY

Administration, Authentication, Authorization, Auditing, Data Protection

DATA ACCESS

HDFS (Hadoop Distributed File System)

DATA MANAGEMENT

Apache Accumulo

A sorted, distributed key-value store with cell-based access...

Apache Ambari

A completely open source management platform for provisioning,...

Apache Atlas

Agile enterprise compliance through metadata Atlas is designed...

Apache Falcon

A framework for managing data life cycle in...

Apache Flume

A service for streaming logs into Hadoop Apache...

Apache Hadoop

Apache Hadoop is an open source software platform...

Apache Hadoop HDFS

A distributed Java-based file system for storing large...

Apache Hadoop MapReduce

A framework for writing applications that process large...

Apache Hadoop YARN

The Architectural Center of Enterprise Hadoop Part of...

Apache HAWQ

Apache HAWQ (incubating) provides native SQL on Apache...

Apache HBase

A non-relational (NoSQL) database that runs on top...

Apache Hive

The de facto standard for SQL queries in...

Apache Kafka

A fast, scalable, fault-tolerant messaging system Apache™ Kafka...

Apache Knox Gateway

Secure entry point for Hadoop clusters The Apache...

Apache Oozie

The blueprint for Enterprise Hadoop includes Apache™ Hadoop’s...

Apache Phoenix

Apache Phoenix is an open source, massively parallel,...

Apache Pig

A scripting platform for processing and analyzing large...

Apache Ranger

Comprehensive security for Enterprise Hadoop Apache Ranger delivers...

Apache Slider

A Framework for YARN-based, Long-running Applications In Hadoop...

Apache Solr

Rapid indexing & search on Hadoop Apache Solr...

Apache Spark

Spark adds in-Memory Compute for ETL, Machine Learning...

Apache Sqoop

Efficiently transfers bulk data between Apache Hadoop and...

Apache Storm

A system for processing streaming data in real...

Apache Tez

A Framework for YARN-based, Data Processing Applications In...

Apache Zeppelin

A completely open web-based notebook that enables interactive...

Apache ZooKeeper

An open source server that reliably coordinates distributed...

Cloudbreak

A tool for provisioning and managing Apache Hadoop...

HDF PROJECTS

Apache NiFi, Kafka and Storm provide real-time dataflow management and streaming analytics. HDF enables real-time collection, curation, analysis and delivery of data to and from any device, source or system, whether on-premises or in the cloud.

Apache Kafka

A fast, scalable, fault-tolerant messaging system Apache™ Kafka...

Apache NiFi

A real-time integrated data logistics and simple event...

Apache Storm

A system for processing streaming data in real...

Modern Data Application Projects

Modern enterprises run on data. Open and Connected Data Platforms from Hortonworks enable an organization to manage all of its data, both data-in-motion and data-at-rest, and turn it into actionable intelligence.

Apache Metron

Real-Time Big Data Enabled Cybersecurity Analytics
