Recent Webinar Replays


Improving Hive and HBase Integration

Apache Hive provides SQL-like access to data stored in Apache Hadoop. Apache HBase stores tabular data in Hadoop and supports update operations. The combination of these two capabilities is often desired; however, the current integration shows limitations such as performance issues. In this talk, Hortonworks co-founder Owen O’Malley will present an overview of Hive and HBase and discuss recent updates and improvements from the community on the integration of these two projects. He will also cover various techniques used to reduce data exchange and improve efficiency.

Please Register or Login to View / Download Past Webinars

HDFS Futures: NameNode Federation for Improved Efficiency and Scalability

Scalability of the NameNode has been a key issue for HDFS clusters. Because the entire file system metadata is stored in memory on a single NameNode, and all metadata operations are processed on this single system, the NameNode both limits the growth of the cluster and makes the NameService a bottleneck for the MapReduce framework as demand increases. HDFS Federation horizontally scales the NameService using multiple federated NameNodes/namespaces. The federated NameNodes share the DataNodes in the cluster as a common storage layer. HDFS Federation also adds client-side namespaces to provide a unified view of the file system. In this talk, Hortonworks co-founder and key architect Suresh Srinivas will discuss the benefits, features, and best practices for implementing HDFS Federation.
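To make the idea of a client-side namespace concrete, here is a minimal sketch of how a client might map path prefixes to separate NameNodes using a longest-prefix match. It is only an illustration under stated assumptions: the `MountTable` class, its `resolve` method, and the `nn*.example.com` host names are hypothetical and are not part of the HDFS API.

```python
# Hypothetical sketch: MountTable and the host names are illustrative, not HDFS APIs.
from typing import Dict, Tuple


class MountTable:
    """Client-side table mapping path prefixes to independent namespaces."""

    def __init__(self, mounts: Dict[str, str]) -> None:
        # Normalize prefixes so "/user/" and "/user" are treated alike.
        self._mounts = {
            (prefix.rstrip("/") or "/"): namenode
            for prefix, namenode in mounts.items()
        }

    def resolve(self, path: str) -> Tuple[str, str]:
        """Return (namenode, path) for the longest mount prefix covering path."""
        best = None
        for prefix in self._mounts:
            # A prefix matches only on whole path components.
            matches = (
                prefix == "/"
                or path == prefix
                or path.startswith(prefix + "/")
            )
            if matches and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            raise KeyError(f"no mount point covers {path}")
        return self._mounts[best], path


# Three assumed namespaces presented to the client as one file system view.
table = MountTable({
    "/user": "nn1.example.com",
    "/data": "nn2.example.com",
    "/data/logs": "nn3.example.com",
})
```

With this table, a path under `/data/logs` resolves to the third NameNode rather than the second, because the longer prefix wins, while a path that no mount point covers raises an error instead of silently falling back to a default.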


Simplifying the Process of Uploading and Extracting Data from Hadoop

As the volume of data continues to grow, organizations worldwide are quickly adopting Apache Hadoop to store, manage and process Big Data. However, integrating multiple data sources is still one of the more time-consuming and challenging aspects of storing and analyzing data with Hadoop.

Join us for this free, informative webinar to learn how open source technologies address these data integration challenges. Hear from Rohit Bakhshi, Solution Architect at Hortonworks, and Jim Walker, Director of Product Marketing at Talend, on Apache Hadoop best practices that data enthusiasts of any skill level can leverage. Gain insight into the different approaches organizations can take to avoid the complexity of uploading or extracting data from Hadoop. Also, see a live demonstration of how to load HDFS in less than five minutes without writing a line of code, and how to create and run a Pig script.


Extending Hadoop beyond MapReduce

Hortonworks has been developing the next generation of Apache Hadoop MapReduce, which factors the framework into a generic resource management fabric to support MapReduce and other application paradigms such as graph processing and MPI. High availability is built in from the beginning, as are security and multi-tenancy to support multiple users and organizations on large, shared clusters. The new architecture will also increase innovation, agility, and hardware utilization. NextGen MapReduce is already available in Hadoop 0.23. Join us for this webcast as we discuss the main architectural highlights of MapReduce and its utility to users and administrators.


HCatalog, Table Management for Hadoop

HCatalog is a metadata and table management system for Hadoop. It allows users to share data and metadata across Hive, Pig, and MapReduce. It also allows users to write their applications without being concerned about how or where the data is stored, and insulates users from schema and storage format changes. In this talk, Hortonworks founder Alan Gates will introduce HCatalog, discuss its current features, and give an overview of the short-term roadmap for HCatalog. Alan is an original member of the engineering team that took Pig from a Yahoo! Labs research project to a successful Apache open source project. He also designed HCatalog and guided its adoption as an Apache Incubator project.


Understanding the Hortonworks Roadmap

Join Hortonworks founder Eric Baldeschwieler as he guides you through Hortonworks’ planned releases for the upcoming year. Eric has led the evolution of Apache Hadoop from a 20-node prototype to a 42,000-node service behind every click at Yahoo! In this webcast, Eric will guide you through the planned enhancements to the major Hadoop components in 2012.


Reference Architecture for Hadoop in Banking - Industry Perspective

UBS has been an early adopter of Hadoop and continues to test a number of data processing and analytics use cases. In this webcast, Dave Casper, Executive Director at UBS, will discuss how Hadoop fits the overall Data & Architecture strategy at UBS. Joining Dave in this discussion are Abhishek Mehta (Founder, Tresata) and Arun Murthy (Hortonworks), who will outline a template to design, build, and implement a Hadoop-powered data processing and analytics platform within the confines of an enterprise data stack. Get expert advice on how financial organizations can introduce Hadoop into a traditional banking data environment while integrating with existing tools and technologies.

You’ll learn:

  • How to integrate Hadoop within the enterprise data stack
  • What to build vs. buy
  • Which problems to use Hadoop for and which not to

Hadoop HDFS High Availability: HA NameNode

The HDFS NameNode is a robust and reliable service, as seen in production at Yahoo!, Facebook, and other enterprises. However, the NameNode does not have automatic failover. A hot failover solution called HA NameNode is under active development (HDFS-1623) and is making excellent progress. Join Hortonworks founder Sanjay Radia as he outlines the approach and current status. Sanjay is an Apache Hadoop Committer and original architect of the Hadoop HDFS project at Yahoo!

