How to get started coding?


This topic contains 2 replies, has 2 voices, and was last updated by  Clement Fleury 1 year, 4 months ago.

  • Creator
  • #50776

    Clement Fleury

    Hi there!
First of all, I must warn you: I am VERY new to Hadoop.

I have installed the Hortonworks sandbox and did the first 6 tutorials (I am a Linux user, so no Excel for me…), so I think I now understand Pig, Hive and HCatalog quite well. I have also read a lot about HDFS and MapReduce, and I really think I get all the concepts in Hadoop.

I also successfully installed HDP2 with Ambari on a remote virtual machine (Proxmox).

    BUT : what now?

I want to develop my own Java application that uses my HDP2 cluster.

    I’m developing on my workstation (Eclipse on Ubuntu).

How do I get started?
    What plugins / libraries do I have to install?
    How do I structure my code?
    How do I get my local program to execute on my HDP2 cluster?
    …?

I am so new to this that I don’t even know if I’m asking the right questions of the right people.
    Any tip, help, hint, step-by-step tutorial, link, … will be appreciated.


Viewing 2 replies - 1 through 2 (of 2 total)


  • Author
  • #51133

    Clement Fleury


Thanks @Sanjeev, very useful information indeed.

If I may make a suggestion to the Hortonworks (and Hadoop in general) community, as a real newbie: the HDP Sandbox tutorials are a really GREAT way to understand the main concepts of Hadoop (HDFS and MapReduce), as well as the other tools (HCatalog, Hive, Pig, …).
    BUT I think there is a huge gap between these tutorials and the moment when you are actually writing real code for Hadoop.

I found only one useful (to me, as a rookie: very detailed and up to date) resource:

    If you have any other such pointers, please share!




    Sanjeev
    Hi Clement,

Please find some answers below:

How do I get started?
    Once you have Eclipse set up, you start your Java application just as you would when developing a normal Java application.
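    If you use Maven for your project, a single dependency is usually enough to pull in the common Hadoop client libraries. This is a sketch, not official guidance; the version number below is illustrative and should match the Hadoop version your HDP2 cluster runs:

    ```xml
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <!-- illustrative version; match your cluster's Hadoop release -->
      <version>2.2.0</version>
    </dependency>
    ```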

What plugins / libraries do I have to install?
    There are some plugins available on the market to try out. To keep it simple, you can just add the required libraries to your project.
    For instance, if you are using the HDFS APIs, the jars are usually available at /usr/lib/hadoop/lib. For Hive the location is /usr/lib/hive/lib, and so on.
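    As a minimal sketch of using the HDFS API: with those client jars on the classpath, the `FileSystem` class is the usual entry point. This only runs against a reachable NameNode; the hostname, port and path below are placeholders for your own cluster:

    ```java
    // Sketch only: requires the Hadoop client jars on the classpath and a
    // reachable NameNode; hostname, port and path are placeholders.
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsCheck {
        public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();
            conf.set("fs.defaultFS", "hdfs://your-namenode:8020"); // placeholder
            FileSystem fs = FileSystem.get(conf);
            Path p = new Path("/user/clement/input");              // placeholder
            System.out.println(p + (fs.exists(p) ? " exists" : " does not exist"));
            fs.close();
        }
    }
    ```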

How do I structure my code?
    Nothing specific here.
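    That said, MapReduce jobs conventionally follow a Mapper / Reducer / Driver layout. A hedged word-count sketch of that shape (again, it needs the Hadoop client jars on the classpath and only runs as a submitted job):

    ```java
    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {
        // Mapper: emits (word, 1) for every token of each input line
        public static class TokenMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();
            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                for (String tok : value.toString().split("\\s+")) {
                    if (!tok.isEmpty()) { word.set(tok); ctx.write(word, ONE); }
                }
            }
        }

        // Reducer: sums the counts emitted for each word
        public static class SumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> vals, Context ctx)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : vals) sum += v.get();
                ctx.write(key, new IntWritable(sum));
            }
        }

        // Driver: wires the mapper, reducer and I/O paths together
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenMapper.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }
    ```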

How do I get my local program to execute on my HDP2 cluster?
    Normally, you will bundle your Java code into a jar and use the “hadoop jar <jar-name>” command to run it.
    Additionally, setting up remote debugging will make it easy to debug your code. Please see
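    The package-and-run steps above could look like this in practice. These commands are a sketch: the jar name, class name and paths are placeholders, and the debug port is illustrative:

    ```shell
    # Package your compiled classes into a jar (build tool of choice varies)
    jar cf myapp.jar -C bin/ .

    # Run on a cluster node; "hadoop jar" puts the Hadoop jars on the classpath
    hadoop jar myapp.jar WordCount /user/clement/input /user/clement/output

    # For remote debugging, make the JVM listen for your IDE before launching
    export HADOOP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000"
    ```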

    Hope this helps.

