With the release of Spark 1.6, Hortonworks commits to helping customers accelerate data science, maintain seamless data access, and drive innovation at the core.
Spark, as part of open enterprise Hadoop, empowers organizations to scale Spark for enterprise value.
Hortonworks is improving data science productivity by enhancing Apache Zeppelin and by contributing additional Spark algorithms and packages that ease the development of key solutions.
One example is Project Magellan, an open source library for geospatial analytics in Apache Spark. It facilitates geospatial queries and builds upon Spark to solve hard problems dealing with geospatial data at scale.
Spark SQL provides SQL and DataFrame APIs for accessing structured data, while Spark Streaming enables developers to easily build scalable, high-throughput, fault-tolerant processing of live data streams.
Hortonworks has been improving Spark’s integration with YARN, HDFS, Hive, HBase and ORC. Specifically, we believe that we can further optimize data access via the new Data Source API.