May 23, 2014

Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop

On Wednesday, May 21, Himanshu Bari (Hortonworks’ senior product manager), Venkatesh Seetharam (committer to Apache Falcon), and Justin Sears (Hortonworks’ product marketing manager) hosted the third of our seven Discover HDP 2.1 webinars. Himanshu and Venkatesh discussed data governance in Hadoop with Apache Falcon, which is included in HDP 2.1. As most of you know, ingesting data into Hadoop is one thing; governing that data by defining and enforcing data-pipeline policies is another, and it is a necessity in the enterprise.

During the webinar, the speakers:

  • Explained why you need Apache Falcon
  • Described some key new Falcon features
  • Gave a demo showing how to:
    • define data pipelines with replication
    • declare policies for retention and late data arrival
    • manage the Falcon server with Ambari
  • Answered questions
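
To make the demo topics a little more concrete, here is a minimal sketch (not the code shown in the webinar) that uses Python to generate XML shaped like a Falcon feed entity. It assumes the feed layout as we understand it: replication is implied by listing a source and a target cluster, retention is set per cluster, and late data is handled with a cut-off. The feed name, cluster names, dates, and path are all hypothetical, and the namespace and element names should be checked against the entity specification for your Falcon release. A real feed definition likely needs further elements (for example, ACL and schema) that are omitted here.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the Falcon feed schema; verify for your release.
FEED_NS = "uri:falcon:feed:0.1"


def build_feed(name, source, target, retention, late_cutoff):
    """Return feed-like XML with replication, retention and late-arrival settings."""
    feed = ET.Element("feed", {"name": name, "xmlns": FEED_NS})
    ET.SubElement(feed, "frequency").text = "hours(1)"

    # Late data arrival: how long to wait for late data (hypothetical value).
    ET.SubElement(feed, "late-arrival", {"cut-off": late_cutoff})

    # Replication: one source cluster and one target cluster.
    clusters = ET.SubElement(feed, "clusters")
    for cluster_name, cluster_type in ((source, "source"), (target, "target")):
        cluster = ET.SubElement(
            clusters, "cluster", {"name": cluster_name, "type": cluster_type}
        )
        # Hypothetical validity window for the feed on this cluster.
        ET.SubElement(
            cluster,
            "validity",
            {"start": "2014-05-01T00:00Z", "end": "2015-05-01T00:00Z"},
        )
        # Retention: delete feed instances older than the given limit.
        ET.SubElement(cluster, "retention", {"limit": retention, "action": "delete"})

    # Hypothetical HDFS location of the feed data.
    locations = ET.SubElement(feed, "locations")
    ET.SubElement(
        locations,
        "location",
        {"type": "data", "path": "/data/raw/clicks/${YEAR}-${MONTH}-${DAY}-${HOUR}"},
    )
    return ET.tostring(feed, encoding="unicode")


if __name__ == "__main__":
    print(build_feed("rawClicks", "primaryCluster", "backupCluster",
                     "days(90)", "hours(6)"))
```

Generating the XML from a small function like this keeps the policy values (retention limit, late-arrival cut-off) in one place, which can help when many feeds share the same policies.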

If you missed the webinar, here is the complete recording.

And here is the presentation deck.

Webinar Q & A

Q: What version of HDP is Falcon supported in?
A: We recently shipped HDP 2.1, and Apache Falcon is part of that GA release.

Q: Can you use the Falcon UI to manage Falcon entities and pipelines?
A: Today, the Falcon UI is read-only, so you cannot edit entities through it. Editing is something we are working on, and it will be available soon.

Q: Ambari is not supported on Ubuntu yet (AFAIK). What about Falcon?
A: You are correct; Ambari support on Ubuntu is in the works. Falcon already comes with debs, so you can install it outside of Ambari. Note that HDP itself is supported on Ubuntu today.

Q: How do I manage a Falcon server without Ambari today? Should I use the Falcon UI?
A: Today you cannot manage or monitor Falcon through Ambari on Ubuntu. Falcon has a minimal dashboard for monitoring jobs, and you will soon be able to create and manage pipelines in the UI.

Q: Do we have a UI for all this configuration?
A: We are working on the UI to enhance configuration management.

Q: For this demo, are you using Ambari 1.5.1?
A: Yes. We showed Ambari 1.5.1.

Q: Does Apache Falcon run on earlier versions of HDP too, such as HDP 1.3?
A: That is not a supported configuration.

What’s Next?

Visit our Data Governance and Integration labs and Apache Falcon page to learn more.

Attend our next Discover HDP 2.1 webinar on Wednesday, May 28 at 9 am Pacific Time: Apache Hadoop 2.4.0, YARN, and HDFS.

If you have any further questions about Apache Falcon (documentation, code examples, tutorials), please post them on the Community forums under Falcon.

