OPEN SOURCE HADOOP NOW RUNS ON AN OPEN COMPUTE PLATFORM
The software market is undergoing a major transition, moving away from proprietary software that leads to customer lock-in. Open source software offers freedom, more flexibility, and faster innovation, all at a lower cost. With HDP 2.6 now available on IBM Power Systems, we offer a 100% open source Hadoop platform on an open server platform, delivering new levels of performance. IBM brings this approach from the software world to the hardware systems market by allowing member companies of the OpenPOWER Foundation to innovate on and optimize the OpenPOWER architecture for new workloads such as machine learning and artificial intelligence.
ARE YOU READY FOR ARTIFICIAL INTELLIGENCE?
A major achievement stemming from open collaboration is the IBM Power System S822LC for HPC, which enables blazing-fast deep learning analysis as part of IBM’s PowerAI platform. The PowerAI platform includes the most popular open-source deep learning frameworks and their dependencies, all pre-compiled, performance-optimized, and easy to install. With PowerAI on IBM Power Systems, enterprises can rapidly deploy a fully optimized and supported platform for machine learning with superior performance.
PowerAI takes advantage of the NVIDIA® NVLink™ CPU:GPU interconnect to support and load larger deep learning models than ever before. NVIDIA NVLink is a hardware bridge that can link CPUs to GPUs or GPUs to GPUs, at up to 80 GB/s of bandwidth.
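To put the 80 GB/s figure in perspective, here is a minimal back-of-the-envelope sketch in Python. It only divides a data size by the peak link bandwidth, so the result is a best-case lower bound rather than a measured transfer time. The function name and the 16 GB working-set size are hypothetical and chosen purely for illustration.

```python
def min_transfer_seconds(size_gb: float, bandwidth_gb_per_s: float = 80.0) -> float:
    """Best-case time to move `size_gb` gigabytes over a link.

    Assumes the link sustains its peak bandwidth for the whole transfer,
    which real workloads rarely achieve. The 80 GB/s default is the
    NVLink figure quoted above.
    """
    if size_gb < 0 or bandwidth_gb_per_s <= 0:
        raise ValueError("size must be non-negative and bandwidth positive")
    return size_gb / bandwidth_gb_per_s


# Hypothetical example: a 16 GB model and batch working set.
print(f"{min_transfer_seconds(16.0):.2f} s")  # 0.20 s
```

Passing a different `bandwidth_gb_per_s` gives the same lower bound for another interconnect, which makes it easy to compare links on paper before benchmarking.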
On IBM’s highest-end OpenPOWER LC server family model, the Power Systems S822LC for HPC, PowerAI can run some artificial intelligence workloads at up to twice the speed of a competing system, thanks to the POWER8’s NVLink connection to NVIDIA® Tesla® P100 GPUs and performance optimizations exclusive to the PowerAI software distribution.
PowerAI represents a powerful new tool to marry with your existing data plan and data-science toolchain.
HDP 2.6 MAKES DATA SCIENCE MORE ACCESSIBLE
With the availability of HDP 2.6 on IBM Power Systems, we took another step toward removing barriers to the adoption of new data science workloads. Data scientists using Spark with the R language can now deploy their favorite R packages with their Spark jobs, and gain superior performance by running these workloads on IBM Power Systems. In recent performance testing we saw a significant improvement: 70% more queries per hour, based on average response time, and a 40% reduction in average query response time.
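The two figures are related. As a rough sanity check, the Python sketch below converts a reduction in average response time into the implied gain in queries per hour. It assumes queries run back to back, so that throughput is simply the inverse of average response time; this is a simplification for illustration, not a description of how the test was actually run, and the function name is hypothetical.

```python
def throughput_gain(response_time_reduction: float) -> float:
    """Fractional gain in queries per hour implied by a drop in response time.

    Assumes throughput is inversely proportional to average response time
    (queries issued back to back, no concurrency effects).
    """
    if not 0.0 <= response_time_reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1.0 / (1.0 - response_time_reduction) - 1.0


# A 40% shorter average response time implies roughly two-thirds more queries per hour.
print(f"{throughput_gain(0.40):.0%}")  # 67%
```

Under that assumption, a 40% reduction in response time corresponds to about 67% more queries per hour, which is in line with the roughly 70% figure quoted above.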
HDP’s YARN-based architecture enables multiple applications to share a common cluster and dataset while ensuring consistent levels of service and response. Spark is now one of the many data access engines that work with YARN and are supported in an HDP enterprise data lake. Spark provides HDP subscribers yet another way to derive value from any data, any application, anywhere.
IBM SPECTRUM SCALE STORAGE FOR IN-PLACE ANALYTICS
With the adoption of data science, we see customers ingesting, transforming, and analyzing entirely new kinds of data. The proverbial “data lake” has made its way into the vernacular of the big data community. Building a data repository that breaks down data silos and allows for new insights is crucial for reaping the benefits of next-generation analytics. A wide range of data storage solutions has made inroads since the inception of Hadoop with HDFS. Hortonworks has been working with technology partners such as IBM to integrate HDP with enterprise-class storage platforms that allow you to address the most demanding big data business challenges.
IBM Spectrum Scale, formerly IBM General Parallel File System (IBM GPFS™), is one such software-defined storage solution for building big data platforms. Hortonworks and IBM are joining engineering efforts to make Spectrum Scale available as shared storage for HDP, which allows compute and storage to be decoupled for optimized configurations. Spectrum Scale provides file (NFS, SMB, and POSIX) and object (S3 and Swift) access, and supports HDFS application programming interface (API) access to the same data.
The advantages of using Spectrum Scale include enterprise-class functionality such as access control security; scalability and performance; built-in file system monitoring; pre-integrated backup and recovery support; pre-integrated information lifecycle management; file system quotas to restrict abuse; and immutability and AppendOnly features to protect against accidental destruction of critical data.
The availability of HDP 2.6 on IBM Power Systems, with leading performance for demanding use cases such as machine learning and AI, and its integration with IBM’s Spectrum Scale storage solution offer customers more choice as they select their future big data platform.
More details for HDP on Power Systems can be found at:
Join us for a webinar on May 4th: Master Real-Time Intelligence with Open Technology Innovation
To register click here