January 07, 2015

Announcing Apache Falcon 0.6.0

With YARN as its architectural center, Apache Hadoop continues to attract new engines to run within the data platform, as organizations want to efficiently store their data in a single repository and interact with it for batch, interactive and real-time streaming use cases. As more data flows into and through a Hadoop cluster to feed these engines, Apache Falcon is a crucial framework for simplifying data management and pipeline processing.

Falcon enables data architects to automate the movement and processing of datasets for ingest, pipeline, disaster recovery and data retention use cases.

We recently released Apache Falcon 0.6.0. With this release, the community addressed more than 220 JIRA issues. Among these many bug fixes, improvements and new features, four stand out as particularly important:

  • Authorization with ACLs for entities
  • Enhancements to lineage metadata
  • Cloud archival
  • Falcon recipes

This blog gives an overview of these new features and how they integrate with other Hadoop services. We’ll also touch on additional innovation we plan for upcoming releases.

Authorization with ACLs for entities

Apache Falcon now supports access control lists (ACLs) that provide authorization for Feed, Cluster and Process entities. This allows Falcon to leverage existing security work to maintain consistent controls throughout the HDP stack. This security enhancement lays the foundation for broader enterprise adoption and for the new use cases that depend on controlled access to entities.

Enhancements to lineage metadata

This Falcon release provides better access to lineage metadata. This facilitates the quick and efficient search and retrieval of lineage information, which makes it easier to comply with data retention and discoverability regulations.

Cloud archival

This release lets Falcon use cloud infrastructure such as Amazon S3 and Microsoft Azure for archival. We’re excited about this change because it extends the archive use case for continuity and ad hoc analysis.

Falcon recipes

A Falcon recipe is a static process template with a parameterized workflow that realizes a specific use case. Recipes are defined in the user space. Every recipe can be modeled as a Process within Falcon, which then periodically executes the user workflow. Because the process and its associated workflow are parameterized, the user provides a properties file of name/value pairs, and Falcon substitutes those values into the process and workflow definitions before scheduling. The result is an ordinary process entity. Recipes enable non-programmers to capture and re-use very complex business logic.
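To make the substitution step concrete, here is a minimal Python sketch of the idea, not Falcon's actual implementation. The `##name##` placeholder syntax, the function names, the property keys and the template itself are all assumptions made for illustration; check the Falcon documentation for the real recipe format.

```python
import re

# Assumed placeholder syntax for this sketch: ##parameter.name##
PLACEHOLDER = re.compile(r"##([A-Za-z0-9_.]+)##")

# Hypothetical recipe template: a process definition with parameters.
TEMPLATE = """<process name="##recipe.name##">
  <frequency>##recipe.frequency##</frequency>
  <workflow path="##workflow.path##"/>
</process>"""

# Hypothetical user-supplied properties file (name/value pairs).
PROPERTIES = """
# values supplied by the user
recipe.name = nightly-archive
recipe.frequency = days(1)
workflow.path = /apps/recipes/archive/workflow.xml
"""


def parse_properties(text):
    """Parse simple name=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        props[name.strip()] = value.strip()
    return props


def render_template(template, props):
    """Replace every ##name## placeholder with its value from props.

    Raises KeyError if the template uses a parameter that was not supplied,
    so an incomplete properties file fails before anything is scheduled.
    """
    def substitute(match):
        name = match.group(1)
        if name not in props:
            raise KeyError(f"no value supplied for parameter {name!r}")
        return props[name]

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    print(render_template(TEMPLATE, parse_properties(PROPERTIES)))
```

Failing on a missing parameter, rather than leaving the placeholder in place, is a design choice of this sketch: it surfaces an incomplete properties file at submission time instead of at workflow run time.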

Plans for the Future of Falcon

We want to thank the Apache Falcon community for all of its hard work delivering this release. Looking forward to future releases, the Apache Falcon team plans:

  • Usability improvements, with a new UI, REST API additions and enhanced documentation
  • Further strengthening of HA capabilities, allowing Falcon to meet ever more stringent SLAs
  • Integration with Apache Knox for tighter perimeter security and proxy access

Download Apache Falcon and Learn More

Comments

  • Interesting developments. A key question is what we mean by “data lineage” as that is a term that is being applied liberally regarding the history of data itself. Are we talking how the data was used and by whom, the source of the data, its provenance? As an industry, we need to clarify what we mean and realistically manage expectations.

  • We have HDP 2.2 with Falcon 0.6.0 installed on our cluster. We are able to schedule and run Falcon processes, but I can’t find the Falcon dashboards. The Falcon UI on port 15000 shows nothing but three tabs with entity configurations. I’d be very grateful if someone could point me to the lineage and other Falcon dashboards. Thanks!
