The Hortonworks Blog

Posts categorized by : Falcon

We recently hosted the fourth of our seven Discover HDP 2.1 webinars, entitled Apache 2.4.0, HDFS and YARN. It was very well attended and a very informative discourse. The speakers outlined the new features in YARN and HDFS in HDP 2.1 including:

  • HDFS Extended ACLs
  • HTTPs support for WebHDFS and for the Hadoop web UIs
  • HDFS Coordinated DataNode Caching
  • YARN Resource Manager High Availability
  • Application Monitoring through the YARN Timeline Server
  • Capacity Scheduler Preemption

Many thanks to our presenters, Rohit Bakhshi (Hortonworks’ senior product manager), Vinod Kumar Vavilapalli (co-author of the YARN Book, PMC, Hadoop YARN Project Lead at Apache and Hortonworks), and Justin Sears (Hortonworks’ Product Marketing Manager).…

On Wednesday May 21, Himanshu Bari (Hortonworks’ senior product manager), Venkatesh Seetharam (committer to Apache Falcon), and Justin Sears ( Hortonworks’ Product Marketing Manager), hosted the third of our seven Discover HDP 2.1 webinars. Himanshu and Venkatesh discussed data governance in Hadoop through Apache Falcon that is included in HDP 2.1. As most of you know, ingesting data into Hadoop is one thing; having data governed, by dictating and defining data-pipeline policies, is another thing—a necessity in the enterprise.…

The first use of the term BoF session was used at the Digital Equipment Users’ Society (DECUS) conference in the 1960s. Its essence was to bring together like minds and thought leaders—just as birds of the feather flock together— to share and exchange computing ideas, in an informal yet spirited way. Since then, the organizers and sponsors of most computing conferences have been loyal to its essence and spirit.

For ideas and innovation happen in collaboration—not in isolation. …

If you’re excited to get started with the new features in Hortonworks Data Platform 2.1, then we’ve included 4 tutorials for you try out – Sandbox-style.

You can download the HDP 2.1 Technical Preview here, and then get stuck into these great tutorials.

Interactive Query with Apache Hive and Apache Tez

OK, so you’re not going to get huge performance out of a one-node VM, but you can try out Hive on Tez, and see the performance gains versus MapReduce, and also try out features such as Vectorized Query, and the host of new SQL features.…

The pace of innovation within the Apache Hadoop community is truly remarkable, enabling us to announce the availability of Hortonworks Data Platform 2.1, incorporating the very latest innovations from the Hadoop community in an integrated, tested, and completely open enterprise data platform.

Download HDP 2.1 Technical Preview Now

What’s In Hortonworks Data Platform 2.1?

The advancements in HDP 2.1 span every aspect of Enterprise Hadoop: from data management, data access, integration & governance, security and operations. …

Apache Falcon is a data governance engine that defines, schedules, and monitors data management policies. Falcon allows Hadoop administrators to centrally define their data pipelines, and then Falcon uses those definitions to auto-generate workflows in Apache Oozie.

InMobi is one of the largest Hadoop users in the world, and their team began the project 2 years ago. At the time, InMobi was processing billions of ad-server events in Hadoop every day.…

We believe the fastest path to innovation is the open community and we work hard to help deliver this innovation from the community to the enterprise.  However, this is a two way street. We are also hearing very distinct requirements being voiced by the broad enterprise as they integrate Hadoop into their data architecture.

Take a look at the Falcon Technical Preview and the Data Management Labs.

Open Source, Open Community & An Open Roadmap for Dataset Management

Over the past year, a set of enterprise requirements has emerged for dataset management.  …

Today we are excited to see another example of the power of community at work as we highlight the newly approved Apache Software Foundation incubator project named Falcon. This incubation project was initiated by the team at InMobi together with engineers from Hortonworks. Falcon is useful to anyone building apps on Hadoop as it simplifies data management through the introduction of a data lifecycle management framework.

All About Falcon and Data Lifecycle Management

Falcon is a data lifecycle management framework for Apache Hadoop that enables users to configure, manage and orchestrate data motion, disaster recovery, and data retention workflows in support of business continuity and data governance use cases.…