
HDP on Linux – Installation Forum

HDP remote client/edgenode setup

  • #12093
    Shyam Kadari

    I have a five node VM cluster setup with HDP.

    1) What is the recommended way to build a client/edge node machine with HDP that can interact with the HDP cluster? The client/edge node is not part of the cluster.

    2) Where do I find config information for clients? I know the config files are in /etc/hadoop/conf on the cluster. Do I just take these config files and modify them so that the remote client is properly configured?

    Shyam Kadari

  • Author
  • #12097
    Sasha J

    The current HMC release does not support creating clients outside of the cluster.
    You can use the HDP RPMs to install Hadoop on the client machine and never start any services on it.
    Then copy the /etc/hadoop/conf files to this newly created client machine.
    After this you will be able to connect to HDFS, start MapReduce jobs, use HBase, etc. from your client machine.
    I have this type of setup on my Mac notebook pointing to a remote cluster, and it all works OK.
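    The steps above might look like the following on an RHEL/CentOS client (a sketch only; the exact package names and the `namenode-host` placeholder are assumptions that vary with your HDP version and cluster layout):

    ```shell
    # Install the Hadoop client packages from the HDP repo;
    # do NOT start any daemons on this machine.
    yum install -y hadoop hadoop-client hbase

    # Copy the cluster's client configuration from an existing cluster node
    # ("namenode-host" is a placeholder for one of your cluster machines).
    scp -r root@namenode-host:/etc/hadoop/conf/* /etc/hadoop/conf/

    # Verify the client can reach the cluster.
    hadoop fs -ls /
    ```
    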

    Leonids-MacBook-Pro:~ lfedotov$ hadoop fs -ls /
    Found 4 items
    drwxr-xr-x - hdfs hdfs 0 2012-11-07 17:34 /apps
    drwx------ - mapred hdfs 0 2012-11-15 09:49 /mapred
    drwxrwxrwx - hdfs hdfs 0 2012-11-15 09:14 /tmp
    drwxr-xr-x - hdfs hdfs 0 2012-11-13 04:00 /user
    Leonids-MacBook-Pro:~ lfedotov$
    Leonids-MacBook-Pro:~ lfedotov$ hbase shell
    HBase Shell; enter 'help<RETURN>' for list of supported commands.
    Type "exit<RETURN>" to leave the HBase Shell
    Version, rUnknown, Thu Aug 30 14:51:28 PDT 2012

    hbase(main):001:0> status 'simple'
    2012-11-15 15:34:37.398 java[56309:1903] Unable to load realm info from SCDynamicStore
    3 live servers
    rhel63-4:60020 1352999830203
    requestsPerSecond=0, numberOfOnlineRegions=1, usedHeapMB=97, maxHeapMB=1004
    rhel63-6:60020 1352999829904
    requestsPerSecond=0, numberOfOnlineRegions=2, usedHeapMB=136, maxHeapMB=1004
    rhel63-5:60020 1352999830880
    requestsPerSecond=0, numberOfOnlineRegions=1, usedHeapMB=85, maxHeapMB=1004
    0 dead servers
    Aggregate load: 0, regions: 4

    Leonids-MacBook-Pro:~ lfedotov$ hadoop jar /usr/lib/hadoop/hadoop-examples.jar pi 10 10
    Number of Maps = 10
    Samples per Map = 10
    Wrote input for Map #0
    Wrote input for Map #1
    Wrote input for Map #2
    Wrote input for Map #3
    Wrote input for Map #4
    Wrote input for Map #5
    Wrote input for Map #6
    Wrote input for Map #7
    Wrote input for Map #8
    Wrote input for Map #9
    Starting Job
    12/11/15 15:35:32 INFO mapred.FileInputFormat: Total input paths to process : 10
    12/11/15 15:35:33 INFO mapred.JobClient: Running job: job_201211150949_0004
    12/11/15 15:35:34 INFO mapred.JobClient: map 0% reduce 0%
    12/11/15 15:35:43 INFO mapred.JobClient: map 10% reduce 0%
    12/11/15 15:35:44 INFO mapred.JobClient: map 60% reduce 0%
    12/11/15 15:35:46 INFO mapred.JobClient: map 80% reduce 0%
    12/11/15 15:35:47 INFO mapred.JobClient: map 100% reduce 0%
    12/11/15 15:35:52 INFO mapred.JobClient: map 100% reduce 30%
    12/11/15 15:35:54 INFO mapred.JobClient: map 100% reduce 100%
    12/11/15 15:35:54 INFO mapred.JobClient: Job complete: job_201211150949_0004
    12/11/15 15:35:54 INFO mapred.JobClient: Counters: 31

    12/11/15 15:35:54 INFO mapred.JobClient: Map output records=20
    Job Finished in 22.629 seconds
    Estimated value of Pi is 3.20000000000000000000
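    For context, the pi example job estimates π by Monte-Carlo sampling: each map task throws points into a unit square and counts how many fall inside the inscribed quarter circle. A minimal local sketch of the same idea (plain pseudo-random sampling, not Hadoop's exact Halton-sequence implementation) shows why a small sample count like 10×10 gives a coarse estimate such as 3.2:

    ```python
    import random

    def estimate_pi(num_samples: int, seed: int = 42) -> float:
        """Estimate pi by sampling points in the unit square and
        counting those inside the quarter circle x^2 + y^2 <= 1."""
        rng = random.Random(seed)
        inside = 0
        for _ in range(num_samples):
            x, y = rng.random(), rng.random()
            if x * x + y * y <= 1.0:
                inside += 1
        # Ratio of quarter-circle area to square area is pi/4.
        return 4.0 * inside / num_samples

    if __name__ == "__main__":
        # Few samples -> coarse estimate; many samples -> closer to 3.14159.
        print(estimate_pi(100))
        print(estimate_pi(1_000_000))
    ```
    
    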

    Hope this helps.

    Thank you!

    Shyam Kadari

    Thank you very much for answering my question so quickly. This is going to be painful when we have to do client installs and configs for a large number of clients (developers, support, BAs, data scientists, etc.) who need to access the cluster for various reasons in an enterprise.

    Shyam Kadari

    Sasha J

    You can clone one of the “slave” nodes and use it as a “client”.
    I think such functionality will be added in future releases, but as of now this is the only way…


The forum ‘HDP on Linux – Installation’ is closed to new topics and replies.
