Develop with Hortonworks Data Platform


Six steps to get you started with Hadoop

  1. Get a Development Cluster

    To get started with Hortonworks Data Platform, you will need a cluster.  There are two options currently available. Choose which one is right for you…

    Get Started with a pre-installed HDP stack…
    Get Started with your own cluster…
    Option 1: Use the Hortonworks Sandbox

    The Hortonworks Sandbox is ideal for developers who want to learn the programming interface without having to set up a cluster. It is delivered as a virtual machine, so no internet connection is required once you have the Sandbox installed and the datasets ready. Click below to get started.

    Get Started with Hortonworks Sandbox

    Option 2: Install HDP

    The Hortonworks Data Platform download is ideal for more experienced Hadoop developers or when configuring larger multi-node installations. A simple step-by-step process will help you install and configure your cluster quickly.

    Install HDP

  2. Learn from Samples

    The Hortonworks Sandbox comes pre-populated with data and tutorials to make learning easy. The tutorials are updated regularly, so visit the Sandbox pages from time to time to learn what’s new.

  3. Load Your Cluster with Data & Deploy a Process with Minimal Coding

    Hortonworks Data Platform includes a powerful set of tools for integrating existing data and systems with your Hadoop cluster. Talend Open Studio for Big Data, an add-on to Hortonworks Data Platform, provides a rich graphical interface for quickly importing data into and exporting data from HDFS. It lets you use Pig, Hive, HBase and HCatalog functions to perform data transformations without having to write a single line of code.

    Talend Open Studio for Big Data is available on the downloads page. Check out the Talend website for useful documentation.

  4. Use Apache HCatalog to Easily Integrate with Other Data Systems

    HCatalog is a data dictionary, or metadata store, for your Hadoop cluster. It allows you to store metadata in your Hadoop cluster using SQL (or REST). As such, it allows you to treat Hadoop as if it were a traditional database. Hadoop data is accessible to developer and analytical tools via a programmatic interface, further simplifying integration.

    You can learn more about the value of HCatalog from this video, presented by Alan Gates of Hortonworks.
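
    As a rough illustration of the REST access mentioned above, the sketch below builds the URL that asks HCatalog's REST server (WebHCat, formerly Templeton) to describe a single table. It only constructs the URL and sends nothing. The default port 50111, the host name, the table name and the user name are assumptions for illustration and may differ on your cluster.

```python
from urllib.parse import quote, urlencode

# Assumption: WebHCat listens on its usual default port; check your cluster's configuration.
WEBHCAT_PORT = 50111


def table_metadata_url(host, database, table, user):
    """Build the WebHCat URL that describes one HCatalog table."""
    # Percent-encode each path segment so unusual names cannot break the URL.
    path = "/templeton/v1/ddl/database/{}/table/{}".format(
        quote(database, safe=""), quote(table, safe="")
    )
    # WebHCat identifies the caller through the user.name query parameter.
    query = urlencode({"user.name": user})
    return "http://{}:{}{}?{}".format(host, WEBHCAT_PORT, path, query)


if __name__ == "__main__":
    # Hypothetical host, table and user, shown only to demonstrate the call.
    print(table_metadata_url("sandbox.example.com", "default", "sample_07", "hue"))
```

    Sending an HTTP GET to the resulting URL against a running WebHCat server should return the table's metadata as JSON, which any HTTP-capable tool can then consume.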

  5. Get Involved!

    The Hortonworks User Community is where Hortonworks users help each other. It consists of help and discussion forums for exchanging tips and best practices for Hortonworks Data Platform and Apache Hadoop in general. We also recommend that you take part in the Apache Hadoop Community.

    It is important to get involved. Vibrant communities lead to innovative, high-quality technology.

  6. Explore Training Courses

    Hortonworks University offers a wide range of training courses that can help you learn more about Hadoop. Most notable is the Developing Solutions using Apache Hadoop course, which is designed for Java developers who want to better understand how to create Apache Hadoop solutions. The course combines hands-on lab exercises with presentation material.

    For a complete list of our training classes, please visit Hortonworks University.