We just added a video to the Hortonworks Executive Video library that features Alan Gates, Hortonworks co-founder and Apache PMC member. In this video, Alan discusses HCatalog, one of the most compelling projects in the Apache Hadoop ecosystem.
HCatalog is a metadata and table management system that provides a consistent data model and schema for users of tools such as MapReduce, Hive and Pig. Users often access the same Hadoop cluster with different tools, and those tools do not, on their own, agree on schema, data types, or how and where data is stored. Once you consider that, the value of a tool such as HCatalog is easy to understand.
Alan not only explains the role of HCatalog but also lays out the future direction of the project: tighter integration with HBase, better information lifecycle management and an expanded HCatalog data model that addresses the challenges of unstructured data.
If you would like to learn more about HCatalog or any of the Apache Hadoop projects, I strongly suggest that you attend Hadoop Summit next month. There will be a number of compelling sessions, including a presentation on HCatalog given by Alan Gates himself.
~ John Kreisa