As part of the DataWorks / Hadoop Summit, there are five Big Data Open Source Birds of a Feather (BoF) sessions that are free and open to the community. A BoF session is an informal meeting where attendees group together based on a shared interest and carry out discussions without any pre-planned agenda.
Come and network with Apache committers, PMC members and users to discuss use cases, requirements and the future of these projects.
BIRDS OF A FEATHER SESSIONS
APACHE SPARK, APACHE ZEPPELIN & DATA SCIENCE
Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs to allow data workers to efficiently execute streaming, machine learning or SQL workloads that require fast iterative access to datasets. Come learn and discuss Spark, Data Science, Deep Learning innovations and future directions.
APACHE HIVE, APACHE HBASE & APACHE PHOENIX
Apache Hive is the de facto standard for SQL queries in Hadoop. In the next phase of the Stinger.next initiative, the Apache community has greatly improved Hive’s speed, scale and SQL semantics. Come learn and discuss what is new in Hive 2.0.
Apache HBase is the NoSQL store that runs on Apache Hadoop. Apache Phoenix provides a SQL skin on top of HBase.
Come learn and discuss HBase 2.0 along with the latest developments in Phoenix.
IOT, STREAMING & DATA FLOW
Real-time data processing with Apache NiFi, Apache Kafka, Apache Storm and Apache Spark Streaming provides the foundation for the Internet of Anything (IoAT). Come learn and discuss the latest streaming & data flow innovations and future directions.
SECURITY, GOVERNANCE & CYBERSECURITY
Apache Knox and Apache Ranger provide Hadoop security, while Apache Atlas provides a Hadoop metadata store and enterprise compliance. Come learn and discuss security & governance innovations and future directions.
Apache Metron is a new top-level Apache project: an open source big data cybersecurity analytics platform that supports real-time ingest and analytics to discover information security threats and build out a high-value security data lake. Apache Metron helps security operations teams be more efficient by reducing the amount of “DIY” big data and data science tooling necessary to detect threats in real time.
Come learn and discuss the latest Metron innovations and future directions.
APACHE HADOOP – YARN, HDFS
Apache Hadoop keeps evolving to meet the community's demands for distributed computing and storage. Apache Hadoop 3.0 is in active development in the community, with key enhancements to YARN and HDFS.
Apache Hadoop YARN is the architectural center of Hadoop that allows multiple data processing engines to handle data stored in a single platform, unlocking an entirely new approach to analytics. Come learn and discuss the latest YARN innovations and future directions.
Apache Hadoop HDFS is a distributed Java-based file system for storing large volumes of data. Come learn and discuss the latest HDFS innovations and future directions.