January 23, 2017

DataWorks Summit/Hadoop Summit San Jose Call for Abstracts is now Open!

Interested in sharing your knowledge with the best and brightest in the data community? If so, be sure to submit an abstract for DataWorks Summit/Hadoop Summit San Jose, which will be held June 13-15 at the San Jose McEnery Convention Center.

DataWorks Summit/Hadoop Summit is the industry’s premier event focusing on next-generation big data solutions. You’ll want to attend so that you can learn from your peers and industry experts alike on how open source technologies enable you to leverage data to drive predictive analytics, distributed deep-learning, and artificial intelligence initiatives across global organizations!

Be sure to submit your abstracts now!

Tracks for this year include:

·         Applications – In this track you will hear from ISVs and architects who have created applications, frameworks, and solutions built to solve real business problems by leveraging data as an asset. These Modern Data Applications are augmenting traditional architectures and extending the reach for insights from the edge to the data center. Sessions in this track span both technical and business audiences, ranging from business justification and ROI to technical architecture.

·         Enterprise Adoption – In this track you will learn from enterprise business leaders and innovators about how they have used data to transform their business. Sessions cover architecture, business benefits, challenges, and secrets to success around these transformations. Speakers are from different companies across industries and geographies, but they have one thing in common: they are leveraging data and open source technology for amazing business outcomes. Sessions will cover ROI, business benefits, and success criteria, as well as hard-fought lessons learned in their journey.

·         Data Processing & Warehousing – Apache Hadoop YARN has transformed Hadoop into a multi-tenant data platform. It is the foundation for a wide range of processing engines that empower businesses to interact with the same data in multiple ways simultaneously. This means applications can interact with the data in the most appropriate way: from batch to interactive SQL or low-latency access with NoSQL, and the interaction of legacy data stores and big data. There is a vast ecosystem of SQL engines and tools that are enabling richer Data Warehousing on Hadoop with capabilities for ACID, interactive queries, OLAP and data transformation. You will have the opportunity to hear from the rock stars of the Apache community and learn how these innovators are building applications.

·         Apache Hadoop – Apache Hadoop continues to drive innovation at a rapid pace, and the next generation of Hadoop is being built today. This track showcases new developments in core Hadoop and closely related technologies. Attendees will hear about key projects, such as HDFS and YARN, projects in incubation, and the industry initiatives driving innovation in and around the Hadoop platform. Attendees will interact with technical leads, committers, and expert users who are actively driving the roadmaps, key features, and advanced technology research around what is coming next for Apache Hadoop.

·         Governance & Security – With the growing volumes of diverse data being stored in the Data Lake, any breach of this enterprise-wide data can be catastrophic, from privacy violations and regulatory infractions to damage to corporate image and long-term shareholder value. This track focuses on the key enterprise requirements for governance and security for the extended data plane. As Hadoop and streaming applications emerge as a critical foundation of modern data applications, enterprises have placed stringent requirements on them in these key areas. Speakers will present best practices with an emphasis on tips, tricks, and war stories on how to secure your big data infrastructure. Sessions will cover the full deployment lifecycle for on-premise and cloud deployments, including installation, configuration, initial production deployment, recovery, security, and data governance for Hadoop.

·         IoT & Streaming – The increase in the number of sensors and connected devices is fueling data growth and the opportunity to leverage streaming data for new insights and interactions. The speed with which enterprises can make decisions based on data is critical to their competitive advantage. This track covers the state of the art in obtaining perishable insights from streaming data sources, including managing devices at the “jagged edge”, strategies and practices for data ingestion and analysis, and best practices for deriving real-time actionable insights as the data flows from connected devices into Hadoop infrastructure. Attendees will hear from the technical leads, committers, and expert users who are actively driving the roadmaps and key features in IoT emerging technologies. Attendees will also learn how to use these technologies to develop IoT solutions.

·         Cloud & Operations – For a system to be “open for business”, it must be efficiently managed by system administrators. A critical component of a successful connected data architecture is a comprehensive dataflow and operations strategy. This track covers the core practices and patterns for planning, deploying, loading, moving, backing up and recovering, keeping highly available (HA), and managing data across edge, on-premise and cloud. The track is focused on deploying and operating Hadoop and the extended Apache Data ecosystem on-premise and in the cloud. Sessions will range from getting started and operating your cluster to cutting-edge best practices for large-scale deployments.

·         Apache Spark & Data Science – Insights from the data lake drive business innovation. Leveraging Spark and Apache Hadoop for predictive analytics and modern applications enables businesses to shift focus from reactive to proactive decision-making. The shift is a key driver across industries in adopting Spark and Hadoop as their go-to advanced analytics platform. This track covers introductory to advanced sessions on algorithms, tools, applications, and emerging research topics that extend the Hadoop ecosystem for data science. Sessions will include examples of innovative analytics applications and systems, data visualization, statistics and machine learning. You will hear from leading data scientists, analysts and practitioners who are driving innovation by extracting valuable insights from data at rest as well as data in motion.
