Developer Tools

Support & Enable a Vibrant Ecosystem of Hadoop Developers

Developers are responsible for building data-driven apps in, on and around Apache Hadoop, and they expect a vibrant and powerful set of tools, frameworks and interfaces to simplify this task. They are focused on delivering the value of an application and do not want to be mired in the mechanical details of integration with Hadoop.

Recently, purpose-built application development frameworks such as Cascading have been created, and existing frameworks such as Java and .NET have been extended to accommodate the Hadoop community. We work to support all of these frameworks, so that we can empower a world of Hadoop application development.

Cascading Development Framework

Cascading is an application development framework for building data applications. Acting as an abstraction layer, Cascading does the heavy lifting and converts applications built on it into MapReduce jobs that run effectively on top of Hadoop.
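To make the abstraction-layer idea concrete, here is a minimal sketch in plain Python. It is not the Cascading API: the `Pipeline` class and its `each`, `group_by` and `run` methods are hypothetical names invented for illustration. The application is described as a chain of steps, and the toy "planner" in `run` executes them as map, shuffle and reduce phases, which is roughly the kind of translation an abstraction layer over MapReduce takes off the developer's hands.

```python
from collections import defaultdict


class Pipeline:
    """Toy pipeline (hypothetical, not Cascading): records steps, then runs
    them as a map phase, a shuffle, and a reduce phase."""

    def __init__(self):
        self._mappers = []     # per-record functions; each returns an iterable
        self._key_fn = None    # computes the grouping key for a record
        self._reducer = None   # aggregates the list of records in one group

    def each(self, fn):
        # Register a per-record step; these run in the map phase.
        self._mappers.append(fn)
        return self

    def group_by(self, key_fn, reducer):
        # Register the grouping key and the per-group aggregation.
        self._key_fn = key_fn
        self._reducer = reducer
        return self

    def run(self, records):
        # Map phase: apply every per-record step in order.
        current = list(records)
        for fn in self._mappers:
            current = [out for rec in current for out in fn(rec)]
        # Shuffle: bring records with the same key together.
        groups = defaultdict(list)
        for rec in current:
            groups[self._key_fn(rec)].append(rec)
        # Reduce phase: aggregate each group independently.
        return {key: self._reducer(values) for key, values in groups.items()}


# The "application": a word count described as a chain of steps.
word_count = (
    Pipeline()
    .each(lambda line: line.lower().split())
    .group_by(key_fn=lambda word: word, reducer=len)
)

counts = word_count.run(["Hadoop runs jobs", "Cascading plans jobs"])
print(counts)
```

The point of the sketch is the separation: the word-count definition never mentions mapping, shuffling or reducing, and only `run` knows how the steps are executed.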

The Cascading SDK provides a collection of tools, documentation, libraries, tutorials and example projects from the greater Cascading community and enables the rapid development of batch and interactive data-driven applications.

  • Lingual. Simplifies systems integration through ANSI SQL compatibility and a JDBC driver
  • Pattern. Enables various machine learning scoring algorithms through PMML compatibility
  • Scalding. Enables development with Scala, a powerful language for solving functional problems
  • Cascalog. Enables development with Clojure, a Lisp dialect

Integration with HDP allows Cascading to take advantage of advances in interactive applications provided by YARN.  Cascading is certified and supported by Hortonworks and backed by Concurrent.

Microsoft .NET SDK for Hadoop

The Microsoft .NET SDK for Hadoop provides API access to HDP and Microsoft HDInsight, including HDFS, HCatalog, Oozie and Ambari, along with PowerShell scripts for cluster management. It also includes libraries for MapReduce and LINQ to Hive. LINQ to Hive is especially interesting because it builds on LINQ, the established technology .NET developers use to access most data sources, to deliver the capabilities of Hive, the de facto standard for Hadoop data query.

You can access the Microsoft .NET SDK for Hadoop here.

Java and the Spring XD Framework

Spring for Apache Hadoop (SHDP) provides a consistent configuration and API across a wide range of Hadoop ecosystem projects such as Pig, Hive, and Cascading, in addition to providing extensions to Spring Batch for orchestrating Hadoop-based workflows. It also integrates with other Spring ecosystem projects such as Spring Integration and Spring Batch, enabling you to develop solutions for big data ingest/export and Hadoop workflow orchestration.

SHDP, Spring Integration, Spring Batch and Spring Data are foundational libraries of the Spring IO Platform. Building on and extending this foundation, the Spring IO Platform provides a big data runtime named Spring XD (XD = eXtreme Data). Spring XD provides a single platform that addresses common use cases in big data solutions without the need to write code, just by using a domain-specific language (DSL). These use cases include data ingestion from external sources, data transformation and real-time analytics, data import/export to/from HDFS, and workflow orchestration.
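As a rough illustration of how a DSL can replace hand-written integration code, the sketch below parses a pipe-separated stream definition and wires a source, optional processors and a sink together. It is written in plain Python and is not Spring XD: the module names (`numbers`, `double`, `square`, `collect`) and the `run_stream` function are assumptions made up for this example, and the pipe syntax is only loosely modeled on the idea of a stream definition.

```python
# Hypothetical module registries; a real runtime would ship its own modules.
SOURCES = {"numbers": lambda: iter(range(1, 4))}          # emits 1, 2, 3
PROCESSORS = {"double": lambda x: x * 2, "square": lambda x: x * x}
SINKS = {"collect": list}                                  # gathers into a list


def run_stream(definition):
    """Wire up 'source | processor ... | sink' and run it once."""
    names = [part.strip() for part in definition.split("|")]
    if len(names) < 2:
        raise ValueError("a stream needs at least a source and a sink")
    source, *processors, sink = names

    data = SOURCES[source]()                 # start the flow at the source
    for name in processors:
        data = map(PROCESSORS[name], data)   # chain each processor lazily
    return SINKS[sink](data)                 # the sink drains the flow


result = run_stream("numbers | double | collect")
print(result)
```

Changing the data flow means editing the definition string, for example `"numbers | double | square | collect"`, rather than writing new code.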

These foundational parts of the Spring IO Platform make Hadoop development more accessible to a wider range of Java developers, including the massive Spring developer community, and make the process even faster for Hadoop experts.

Status

Hortonworks has been supporting application developers since the beginning of the company. We work with the community to produce a standard set of APIs for developers to use to create their applications. Our most recent release includes both those development APIs and integration into developer tools to speed creation of new applications.
