HDFS Forum

How to configure a Hadoop Development Environment?

  • #25728
    suaroman
    Participant

    I have a 3-node cluster set up. All smoke tests run fine for all services, and all of the samples seem to work fine too (WordCount, etc.).

    I have a machine that I would like to use as a development machine (i.e., a gateway into my cluster).

    Can someone tell me how to set up Eclipse so I can compile and build Java MapReduce programs to run on my cluster?

    Thanks


  • #25891
    Seth Lyubich
    Moderator

    Hi Suaroman,

    I think you can try installing Eclipse on a spare machine and adding that machine to the cluster as a client node. Once you compile your code on that machine, you should be able to submit jobs to the cluster.
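
    For illustration, here is a minimal sketch of the kind of Java MapReduce program you would compile in Eclipse and submit from the client node. It is the classic WordCount example, assuming the Hadoop jars from the client node are on the Eclipse build path (class names and paths here are just placeholders):

        import java.io.IOException;
        import java.util.StringTokenizer;

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.io.IntWritable;
        import org.apache.hadoop.io.Text;
        import org.apache.hadoop.mapreduce.Job;
        import org.apache.hadoop.mapreduce.Mapper;
        import org.apache.hadoop.mapreduce.Reducer;
        import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
        import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

        public class WordCount {

            // Emits (word, 1) for every token in the input split.
            public static class TokenizerMapper
                    extends Mapper<Object, Text, Text, IntWritable> {
                private static final IntWritable ONE = new IntWritable(1);
                private final Text word = new Text();

                @Override
                public void map(Object key, Text value, Context context)
                        throws IOException, InterruptedException {
                    StringTokenizer itr = new StringTokenizer(value.toString());
                    while (itr.hasMoreTokens()) {
                        word.set(itr.nextToken());
                        context.write(word, ONE);
                    }
                }
            }

            // Sums the per-word counts emitted by the mappers.
            public static class IntSumReducer
                    extends Reducer<Text, IntWritable, Text, IntWritable> {
                private final IntWritable result = new IntWritable();

                @Override
                public void reduce(Text key, Iterable<IntWritable> values,
                        Context context) throws IOException, InterruptedException {
                    int sum = 0;
                    for (IntWritable val : values) {
                        sum += val.get();
                    }
                    result.set(sum);
                    context.write(key, result);
                }
            }

            public static void main(String[] args) throws Exception {
                // An empty Configuration picks up core-site.xml and
                // mapred-site.xml from the classpath, which is what sends the
                // job to the cluster rather than running it in local mode.
                Configuration conf = new Configuration();
                Job job = new Job(conf, "word count");
                job.setJarByClass(WordCount.class);
                job.setMapperClass(TokenizerMapper.class);
                job.setCombinerClass(IntSumReducer.class);
                job.setReducerClass(IntSumReducer.class);
                job.setOutputKeyClass(Text.class);
                job.setOutputValueClass(IntWritable.class);
                FileInputFormat.addInputPath(job, new Path(args[0]));
                FileOutputFormat.setOutputPath(job, new Path(args[1]));
                System.exit(job.waitForCompletion(true) ? 0 : 1);
            }
        }

    Once it builds, export it as a jar and run something like 'hadoop jar wordcount.jar WordCount /user/suaroman/input /user/suaroman/output' from the client node (the jar name and HDFS paths are hypothetical).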

    Hope this helps.

    Thanks,
    Seth

  • #25946
    suaroman
    Participant

    Thanks for the reply. I already have Eclipse and a spare machine set up.
    The machine connects to the HDP cluster, and I'm currently able to build and run Hive and Pig jobs without any problems.

    I'm now interested in learning to develop Java MR programs, but I'm uncertain how to configure Eclipse properly to work with my HDP cluster. I never knew finding information like this would be so difficult; I figured basic information like "how to build a dev environment" would be plentiful.

    Thanks. Anxiously awaiting further replies.

  • #26007
    tedr
    Moderator

    Hi Suaroman,

    There is an Eclipse plugin jar located in ‘/usr/lib/hadoop/contrib/eclipse-plugin’ that you can supposedly just copy into the plugins directory of your Eclipse installation. I am still looking for documentation on how to use it.
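
    In the meantime, a plain Eclipse Java project also works without the plugin: add the Hadoop jars from the client node to the project's build path and make sure the client configuration points at the cluster. Here is a minimal sketch of a connectivity check, assuming Hadoop 1.x-era property names (the hostnames and ports below are hypothetical placeholders; normally these values come from the cluster's core-site.xml and mapred-site.xml on the classpath):

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileStatus;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        public class ClusterSmokeTest {
            public static void main(String[] args) throws Exception {
                Configuration conf = new Configuration();
                // Hypothetical hostnames/ports; omit these two lines if the
                // cluster's *-site.xml files are already on the classpath.
                conf.set("fs.default.name", "hdfs://namenode.example.com:8020");
                conf.set("mapred.job.tracker", "jobtracker.example.com:50300");

                // List the HDFS root as a quick check that Eclipse can reach
                // the cluster before trying to submit a real MR job.
                FileSystem fs = FileSystem.get(conf);
                for (FileStatus status : fs.listStatus(new Path("/"))) {
                    System.out.println(status.getPath());
                }
            }
        }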

    Thanks,
    Ted.

The topic ‘How to configure a Hadoop Development Environment?’ is closed to new replies.
