



Cloudbreak simplifies the deployment of Hortonworks platforms in cloud environments such as Amazon Web Services, Microsoft Azure and Google Cloud Platform. Cloudbreak enables the enterprise to quickly run big data workloads in the cloud while optimizing the use of cloud resources.

What Cloudbreak Does

With Cloudbreak, the Big Data Platform Owner will get the following core benefits:

  • Simplified Cluster Provisioning. Dynamically provision and configure clusters on the cloud. With Ambari Blueprints, build the clusters you need in a consistent, repeatable fashion.
  • Automated Cluster Scaling. Optimize cloud resource usage by adjusting the cluster size as workload and activity change, so you can respond faster to new business requirements.
  • Choice of Clouds. Supports Amazon Web Services, Microsoft Azure, Google Cloud Platform and OpenStack.
  • DevOps. Automate deployment using the integrated Command Line Interface (CLI) and REST API.
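
As a rough illustration of what automated scaling involves, the sketch below picks a target worker count from a load metric and clamps it between a minimum and a maximum cluster size. The function name, the `pending_containers` metric and all thresholds are hypothetical and only illustrate the idea; they are not Cloudbreak's actual scaling policy or API.

```python
def target_node_count(current_nodes, pending_containers,
                      containers_per_node=8, min_nodes=3, max_nodes=20):
    """Pick a worker count for the observed backlog (hypothetical policy)."""
    if pending_containers == 0:
        # No backlog: shrink by one node at a time.
        desired = current_nodes - 1
    else:
        # Add enough nodes to absorb the backlog, rounding up.
        extra = -(-pending_containers // containers_per_node)
        desired = current_nodes + extra
    # Never go below the minimum or above the maximum cluster size.
    return max(min_nodes, min(max_nodes, desired))
```

Clamping to a minimum and maximum keeps a noisy metric from shrinking the cluster to nothing or growing it without bound.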

How Cloudbreak Works

Cloudbreak launches clusters on the cloud in 3 easy steps:

  1. Pick a Blueprint: Cloudbreak uses Ambari Blueprints to define Hadoop clusters declaratively. Blueprints can be designed for specialized applications and workloads (such as Data Science or IoT Apps). Cloudbreak includes a few default Blueprints for common cluster configurations, but you can always upload your own Blueprint to build the cluster just the way you like it.
  2. Choose a Cloud: Cloudbreak is configured to work with cloud infrastructure resources (such as servers, network setup and security options). Choose the cloud infrastructure you want to use for the cluster.
  3. Launch Cluster: In this step, Cloudbreak provisions resources on the chosen cloud infrastructure platform, installs Apache Ambari and applies the desired Blueprint. The result: your cluster is launched and ready to go!
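
To make step 1 more concrete: an Ambari Blueprint is a JSON document that names a stack and lists host groups together with the components each group runs. The sketch below builds a deliberately tiny single-node blueprint in Python. The blueprint name, host group name, component selection and stack version are illustrative assumptions, not one of Cloudbreak's default Blueprints, and `build_blueprint` is a hypothetical helper.

```python
import json


def build_blueprint(name, stack_version, host_groups):
    """Return an Ambari-style blueprint as a plain dict.

    host_groups maps a host group name to (cardinality, [component names]).
    """
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": "HDP",
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": group,
                "cardinality": str(cardinality),
                "components": [{"name": c} for c in components],
            }
            for group, (cardinality, components) in host_groups.items()
        ],
    }


# Illustrative single-node layout (assumed, not a shipped default).
blueprint = build_blueprint(
    name="example-single-node",
    stack_version="2.6",
    host_groups={
        "master": (1, ["NAMENODE", "DATANODE", "RESOURCEMANAGER",
                       "NODEMANAGER", "ZOOKEEPER_SERVER"]),
    },
)

blueprint_json = json.dumps(blueprint, indent=2)
print(blueprint_json)
```

A blueprint intended for real use would likely need further components and configuration sections before Ambari accepts it; the point here is only the declarative shape of the definition.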

Internally, Cloudbreak is built on cloud provider APIs (Amazon Web Services, Microsoft Azure, Google Cloud Platform, OpenStack), Apache Ambari, Docker containers, Swarm and Consul.

Collaboration and Focus

Hortonworks is focused on going to market with a 100% open source solution. This focus allows us to provide product management guidance for enterprise-grade Hadoop to mainstream enterprises and our partner ecosystem, and to further innovate at the core of Hadoop.

  • Open. Deliver a complete set of features for Hadoop cloud deployment, in public and with the community, by defining the operational framework and lifecycle.
  • Flexible. Support a wider array of cloud providers with a common set of APIs to deploy Hadoop.
  • Integrated. Ensure that Hadoop cloud deployment can be integrated with existing IT tools, behind a single pane of glass, by providing Recipes, a CLI and a REST API.

Given our strong open source heritage, we believe Hortonworks is uniquely qualified to ensure that the Cloudbreak technologies continue to flourish in the open. Our strategy is squarely focused on a 100% open-source model with no proprietary extensions. With this approach, we are never conflicted about which capabilities, features, or components to incorporate. We listen to our customers’ and partners’ requirements and work together with them in the open to deliver the best the community has to offer.

Recent Improvements

Some recent improvements to Cloudbreak include:


First Class Cloud Provider Storage Configuration

Cloudbreak simplifies the configuration of your workloads to leverage cloud storage for processing. This prescriptive way to manage how cloud storage is enabled with your workloads greatly simplifies the DevOps experience for ephemeral workloads.

Protected Gateway

Protecting your cluster deployments is paramount for cloud workloads and Cloudbreak can help simplify perimeter security. By automatically “wrapping” your cluster with a secure gateway (powered by Apache Knox), you can minimize the network surface area. With Cloudbreak, you have the ability to select which cluster services are exposed through the gateway. This gives the Enterprise a prescriptive way to secure clusters at the perimeter.

Reusable External Sources

When provisioning workloads to the cloud, there are often common configuration options that need to be set up across clusters. Cloudbreak enables you to configure authentication and external database resources one time, and then reuse those options again and again across your workloads. This simplifies repeatability and helps power the DevOps experience.

Data Lake Shared Services

When running workloads in the cloud, it is important to maintain consistent security controls and metadata as workloads come and go. Cloudbreak helps you create a “Data Lake” around a set of shared services (Apache Ranger and Apache Hive) so that your ephemeral workloads inherit security policies and data schema. This powers the Modern Data Architecture.

Recent Cloudbreak Releases


Notable enhancements in recent Cloudbreak versions:
  • HDP 3.1 and HDF 3.3 Support
  • ADLS Gen 2 Support
  • Workspaces for sharing resources
  • Shared VPC support on GCP
  • Volume encryption on AWS and GCP
  • First Class Cloud Provider Storage Configuration
  • Protected Gateway
  • Reusable External Sources
  • Data Lake Shared Services (TP)
  • Custom Images
  • Kerberos
  • New Recipe Types
  • New and Simplified UI
  • Support for HDP 2.6
  • Azure support for Private IPs

