A completely open source management platform for provisioning, managing, monitoring and securing Apache Hadoop clusters. Apache Ambari takes the guesswork out of operating Hadoop.
Apache Ambari, as part of the Hortonworks Data Platform, allows enterprises to plan, install and securely configure HDP, making ongoing cluster maintenance and management easier regardless of the size of the cluster.
Ambari makes Hadoop management simpler by providing a consistent, secure platform for operational control. It provides an intuitive Web UI as well as a robust REST API, which is particularly useful for automating cluster operations.
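As a minimal sketch of the kind of automation the REST API enables, the Python snippet below builds (but does not send) a request asking Ambari to start a service. The `/api/v1/clusters/.../services/...` path, the `X-Requested-By` header and the `ServiceInfo` payload follow the publicly documented Ambari REST API as we understand it, so verify them against the API reference for your Ambari version. The host name, cluster name and credentials are hypothetical placeholders.

```python
import base64
import json
import urllib.request


def build_service_state_request(base_url, cluster, service, state, user, password):
    """Build (but do not send) a PUT request asking Ambari to change a service's state."""
    url = f"{base_url}/api/v1/clusters/{cluster}/services/{service}"
    payload = {
        # Free-text label for the operation; the wording here is arbitrary.
        "RequestInfo": {"context": f"Set {service} to {state} via REST"},
        "Body": {"ServiceInfo": {"state": state}},
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            # Ambari expects this header on requests that modify state.
            "X-Requested-By": "ambari",
        },
    )


# Hypothetical server, cluster and credentials, for illustration only.
request = build_service_state_request(
    "http://ambari.example.com:8080", "demo", "HDFS", "STARTED", "admin", "admin"
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would submit the operation. Keeping the build step separate makes the request easy to inspect or log before anything changes on the cluster.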
Hortonworks is focused on going to market with a 100% open source solution. This focus allows us to collectively provide the product management guidance for Enterprise Grade Hadoop to mainstream enterprises and our partner ecosystem, and further innovate the core of Hadoop.
The community will continue to innovate Ambari so that its operational capabilities keep pace with Hadoop’s ever-expanding functionality for data management, data access, governance and security.
It is exciting to see Ambari come together, and we are very interested in hearing feedback as these contributions mature. To make them easier to try out, we have made the Ambari Operations and User Views available within the Hortonworks Sandbox. For questions and feedback on Ambari operations, please post to the Ambari Forum. For questions or feedback on the User Views, please post to the Ambari User View Forum.
Apache Ambari 2.5, which is part of the Hortonworks Data Platform 2.6 release, serves as the management system for enterprises looking to easily and securely adopt Apache Hadoop. Ambari simplifies the experience of provisioning, managing, monitoring, securing and troubleshooting Hadoop deployments, and removes the manual, often error-prone tasks associated with operating Hadoop. It also provides the necessary customization “hooks” to fit seamlessly into the enterprise, and enables the IT Operator to focus on delivering world-class service and support for their consumers of the Hortonworks Data Platform. Apache Ambari 2.5 adds many new features, including the following:
Easily configure services to start on boot. In large clusters, hardware failures happen often. When a machine restarts, operators want the services on that host to start automatically. Ambari 2.5 introduces a centrally configurable service auto-start capability that lets operators set a cluster-wide policy specifying which services and components should start automatically.
Want to know who’s abusing HDFS? Ambari Metrics has you covered. HDFS health is critical to a healthy Data Lake, and understanding how the filesystem is being used by applications and tenants is required to efficiently operate and scale your cluster. Ambari 2.5 adds TopN metrics for HDFS operations, and for the users who performed those operations, over 1-, 5-, and 25-minute sliding windows to help you quickly understand how HDFS is being used.
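To make the idea of a TopN sliding window concrete, here is a small Python sketch that counts operations per user over a fixed window and reports the busiest users. It only illustrates the concept and is not how Ambari Metrics or HDFS implement it. The class name, the 60-second window and the sample users are all hypothetical, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import Counter, deque


class SlidingTopN:
    """Track the most active users over a fixed time window (illustrative only)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, user) pairs in arrival order
        self.counts = Counter()  # user -> operations currently inside the window

    def record(self, timestamp, user):
        """Register one operation by `user` at `timestamp` (seconds)."""
        self.events.append((timestamp, user))
        self.counts[user] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop events that have fallen out of the window ending at `now`.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, old_user = self.events.popleft()
            self.counts[old_user] -= 1
            if self.counts[old_user] == 0:
                del self.counts[old_user]

    def top(self, n, now):
        """Return up to `n` (user, count) pairs, busiest first."""
        self._expire(now)
        return self.counts.most_common(n)


window = SlidingTopN(window_seconds=60)
for ts, user in [(0, "etl"), (10, "etl"), (20, "analyst"), (70, "analyst")]:
    window.record(ts, user)
print(window.top(1, now=70))  # [('analyst', 2)]: both "etl" events have expired
```

Running several such windows side by side, for example with 60, 300 and 1500 seconds, would mirror the 1, 5 and 25 minute views described above.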
You’ve got 99 problems and log rotation is not one of them anymore. Hadoop components produce a lot of logs, and operators frequently need to tune log file retention. Ambari 2.5 makes it easy to configure log file retention for each service: every service now exposes both the number of backup files to keep and the maximum size of each backup file.
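Those two settings together bound how much disk a single log can consume. The Python sketch below works through that arithmetic, assuming size-based rotation in which the active file and each backup are capped at the same maximum size. The function name and the per-service values are hypothetical examples, not Ambari defaults.

```python
def max_log_footprint_mb(max_file_size_mb, backup_count):
    """Upper bound on disk used by one rotating log: the active file plus its backups."""
    return max_file_size_mb * (backup_count + 1)


# Hypothetical retention settings: (max file size in MB, number of backup files).
retention = {"hdfs": (256, 20), "yarn": (256, 10)}

for service, (size_mb, backups) in retention.items():
    print(service, max_log_footprint_mb(size_mb, backups), "MB")
```

Multiplying the result by the number of log-producing components on a host gives a rough ceiling to compare against the size of the log partition.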
We all want to get paged when something goes wrong; that’s why we have SNMP. Ambari provides alerting for key components and services so that, if processes go down or thresholds are reached, notifications are sent to the right people. In Ambari 2.5, those notifications can optionally be sent as SNMP traps, and an SNMP MIB is included to make integration with existing enterprise alerting infrastructure painless.
It’s time to put a new face on Hadoop using the Ambari Views framework. A “view” is a way of extending Ambari that allows third parties to plug in new resource types along with the APIs, providers and UI to support them. Ambari is the only open source and open community effort designed to provide a compelling user experience for Hadoop while delivering consistent lifecycle management and security.
Most notably, the Ambari User Views contributions are being actively developed in the community. Ambari User Views are designed to provide capabilities that assist with the operational aspects of data application development and workload management.
|View|Description|
|---|---|
|Tez|The Tez View helps you understand and optimize your cluster resource usage. Using the view, you can optimize and accelerate individual SQL queries or Pig jobs to get the best performance in a multi-tenant Hadoop environment.|
|Hive|The Hive View lets the user write and execute SQL queries on the cluster. It shows the history of all Hive queries executed on the cluster, whether run from the Hive View or from another source such as JDBC/ODBC or the CLI. It also provides a graphical view of the query execution plan, which helps the user debug a query for correctness and tune its performance. It integrates the Tez View, which lets the user debug any Tez job, including monitoring the progress of a job (whether from Hive or Pig) while it is running.|
|Pig|The Pig View is similar to the Hive View. It allows writing and running a Pig script, and supports saving scripts as well as loading and using existing UDFs in scripts.|
|Capacity Scheduler|The Capacity Scheduler View helps a Hadoop operator set up YARN workload management easily to enable multi-tenant and multi-workload processing. This view provisions cluster resources by creating and managing YARN queues.|
|Files|The Files View allows the user to manage, browse and upload files and folders in HDFS.|
Beyond these out-of-the-box User Views, there is a growing ecosystem of Ambari User Views being developed by the community. You can find community User Views in the Hortonworks Gallery.
For additional details about this release review the following resources:
|Ambari Version||Notable Enhancements|
Apache, Hadoop, Falcon, Atlas, Tez, Sqoop, Flume, Kafka, Pig, Hive, HBase, Accumulo, Storm, Solr, Spark, Ranger, Knox, Ambari, ZooKeeper, Oozie, Phoenix, NiFi, HAWQ, Zeppelin, Atlas, Slider, Mahout, MapReduce, HDFS, YARN, Metron and the Hadoop elephant and Apache project logos are either registered trademarks or trademarks of the Apache Software Foundation in the United States or other countries.