Fast Search and Analytics on Hadoop with Elasticsearch

Learn about the Elasticsearch and Hortonworks Partnership

Hortonworks customers can now enhance their Hadoop applications with Elasticsearch real-time data exploration, analytics, logging and search features, all designed to help businesses ask better questions, get clearer answers and better analyze their business metrics in real time.

Hortonworks Data Platform and Elasticsearch make for a powerful combination of technologies that are extremely useful to anyone handling large volumes of data on a day-to-day basis. With the ability of YARN to support multiple workloads, customers with current investments in flexible batch processing can also add real-time search applications from Elasticsearch.

Use Cases

Here are just some of the results Elasticsearch users have reported:

  • Perform real-time analysis of 200 million conversations across the social web each day, helping major brands make business decisions based on social data
  • Run marketing campaigns that quickly identify the right key influencers from a database of 400 million users
  • Provide real-time search results from an index of over 10 billion documents
  • Power intelligent search and better inform recommendations to millions of customers a month
  • Increase the speed of searches by 1000 times
  • Provide instant search across 100,000 source code repositories containing tens of billions of lines of code

YARN Certified

Elasticsearch became a Hortonworks Certified Technology Partner in June and is the first search tool to be certified on HDP 2 with YARN. Elasticsearch, like Hortonworks, is a leader in the open source space, and this partnership will benefit users of either product. Elasticsearch is a great fit for HDP because its scalable, distributed nature allows it to search, and store, vast amounts of information in near real time.

Elasticsearch: “We’re excited to partner with Hortonworks and to announce that Elasticsearch is now certified with Hortonworks Data Platform 2.0 to make real-time data exploration faster on Hadoop,” said Steven Schuurman, CEO of Elasticsearch. “Hadoop and Elasticsearch are among the most popular open source products currently being run in production within the Enterprise. Our advanced open source search and analytics engine combined with Hortonworks open source Hadoop makes a powerful big data solution for customers embarking on big data projects.”

Learn More

Using Elasticsearch with HDP is easy thanks to the Elasticsearch integrations. Developers can write MapReduce jobs that index existing data in HDFS, making it searchable through the Elasticsearch REST API and related ecosystem. MapReduce jobs can also read their input datasets from Elasticsearch and write their output datasets back to it. This deep integration extends to Hive, Pig and Cascading.
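
To give a rough feel for what indexing existing data involves at the REST level, here is a minimal sketch in Python. It is not the Hadoop integration itself, which handles this step for you inside a job. It only shows how records could be serialized into the newline-delimited JSON body that the Elasticsearch _bulk endpoint accepts. The tab-separated input layout, the field names (user, message) and the index name (conversations) are assumptions made up for illustration, and older Elasticsearch versions may also expect a _type in each action line.

```python
import json


def parse_tsv_line(line):
    """Turn one tab-separated line (hypothetical layout: user, message) into a document."""
    user, message = line.rstrip("\n").split("\t", 1)
    return {"user": user, "message": message}


def build_bulk_body(records, index_name):
    """Serialize documents into the newline-delimited JSON body used by _bulk.

    Each document is preceded by an action line naming the target index,
    and the body must end with a trailing newline.
    """
    lines = []
    for record in records:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(record))
    return "\n".join(lines) + "\n"


# Stand-in for lines read from a file in HDFS (sample data, not real).
sample_lines = ["alice\tfirst post\n", "bob\thello hadoop\n"]
records = [parse_tsv_line(line) for line in sample_lines]
body = build_bulk_body(records, "conversations")
print(body, end="")
```

The resulting text could then be sent in a POST request to the _bulk endpoint of a cluster; the host and port depend on your own setup and are deliberately left out here.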

Read more about the Elasticsearch-Hortonworks partnership on the Elasticsearch blog, or in the Elasticsearch technical guides for Apache Hive, Apache Pig, Cascading and Map/Reduce.


