HDP on Linux – Installation Forum

HDP Cluster Installation

  • #9285

    Hi All –
    I am doing an HDP installation on a 3-node cluster and I am stuck at the ‘Add Nodes’ stage.
    At this stage we need to provide two text files as input: one for the private key and the second is Hostdetail.txt.
    For the private key file, which file should I give? There are 4 files under the .ssh folder: 1) id_rsa, 2) id_rsa.pub, 3) authorized_keys, and 4) known_hosts.
    The second issue is that when I provide both input files, it starts loading but has not completed for a long time.
    Please help me resolve the issue.

    Saurabh Deshpande.


  • Author
  • #9457
    Sasha J


    The key file you should give is ‘id_rsa’. What are the contents of hostdetail.txt, and can you send us the log files?
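    (For reference, hostdetail.txt is normally just a plain-text list of the fully qualified hostnames of the nodes to add, one per line. The hostnames below are placeholders, not taken from this thread.)

    m1.example.com
    m2.example.com
    m3.example.com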



    OK, I am done with that. But now I am getting a problem at the Deployment Progress – Cluster Install stage.
    It says “Failed to finish setting up the cluster.” I am pasting the deploy log below; please have a look and help me resolve it.

    Deploy Logs:-

    "2": {
      "nodeReport": {
        "nodeLogs": {
          "m2.example.com": {
            "reportfile": "/var/lib/puppet/reports/3-2-0/m2.example.com",
            "overall": "FAILED",
            "finishtime": "2012-09-10 09:28:50.180495 +05:30",
            "message": [
              "Loaded state in 0.00 seconds",
              "Not using expired catalog for m2.example.com from cache; expired at Sat Sep 08 14:54:58 +0530 2012",
              "Using cached catalog",
              "\"catalog supports formats: b64_zlib_yaml dot marshal pson raw yaml; using pson\"",
              "Caching catalog for m2.example.com",
              "Creating default schedules",
              "Loaded state in 0.00 seconds",
              "Applying configuration version '3-2-0'",

              "\"Mon Sep 10 09:29:36 +0530 2012 /Stage[2]/Hdp-hadoop::Snamenode/Hdp-hadoop::Snamenode::Create_name_dirs[/dev/mapper/VolGroup00-LogVol00/hadoop/hdfs/namesecondary]/Hdp::Directory_recursive_create[/dev/mapper/VolGroup00-LogVol00/hadoop/hdfs/namesecondary]/Hdp::Exec[mkdir -p /dev/mapper/VolGroup00-LogVol00/hadoop/hdfs/namesecondary]/Exec[mkdir -p /dev/mapper/VolGroup00-LogVol00/hadoop/hdfs/namesecondary]/returns (err): change from notrun to 0 failed: mkdir -p /dev/mapper/VolGroup00-LogVol00/hadoop/hdfs/namesecondary returned 1 instead of one of [0] at /etc/puppet/agent/modules/hdp/manifests/init.pp:253\"",
              "\"Mon Sep 10 09:29:36 +0530 2012 /Stage[2]/Hdp-hadoop::Snamenode/Hdp-hadoop::Snamenode::Create_name_dirs[/dev/mapper/VolGroup00-LogVol00/hadoop/hdfs/namesecondary]/Hdp::Directory_recursive_create[/dev/mapper/VolGroup00-LogVol00/hadoop/hdfs/namesecondary]/Hdp::Exec[mkdir -p /dev/mapper/VolGroup00-LogVol00/hadoop/hdfs/namesecondary]/Anchor[hdp::exec::mkdir -p /dev/mapper/VolGroup00-LogVol00/hadoop/hdfs/namesecondary::end] (notice): Dependency Exec[mkdir -p /dev/mapper/VolGroup00-LogVol00/hadoop/hdfs/namesecondary] has failures: true\"",

    Sasha J

    Here is the problem:
    LogVol00/hadoop/hdfs/namesecondary]/returns (err): change from notrun to 0 failed: mkdir -p /dev/mapper/VolGroup00-LogVol00/hadoop/hdfs/namesecondary returned 1 instead of one of [0] at /etc/puppet/agent/modules/hdp/manifests/init.pp:253\"",

    You have to uncheck /dev/mapper/ on the directory selection page and enter a valid directory path in the text field.
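    (A quick way to see what to enter instead, sketched here as a guess rather than an exact HMC step: /dev/mapper/VolGroup00-LogVol00 is a block-device path, not a directory, so mkdir under it fails. List the real mount points and give the installer a directory path under one of them; /hadoop below is just an example.)

    # Show where the LVM volume is actually mounted (e.g. on /)
    $ df -h
    # Then enter a plain directory path such as /hadoop in the text field;
    # the installer creates subdirectories like /hadoop/hdfs/namesecondary under it.
    $ mkdir -p /hadoop && ls -ld /hadoop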



    OK, thanks. The cluster is installed successfully.


    Hi Sasha –
    Now that the cluster is installed on our machines, I need to run a sample MapReduce wordcount program on a 20 GB text file. From where should I run wordcount? Does HDP provide any interface for that, or do I need to run it from the terminal the same way we used to run it for Apache Hadoop, like:
    $ hadoop jar hadoop-*examples*.jar wordcount sample20gb.txt output.txt
    I also want to run Pig scripts. Where should I write and run those? When I type pig at the terminal, instead of opening the Pig Grunt shell it gives an error.


    Hi Sasha –
    The two-node cluster (1 master / 1 slave) is working fine. I want to add a 3rd slave node. When I run Add Nodes with a single entry for the new node in the hostdetail.txt file, it fails. Here is the error:

    Failed. Reason: Puppet agent ping failed: , error=111, outputLogs=Puppet agent ping failed: [Connection refused]

    Sasha J

    To run jobs you have to SSH to one of the cluster nodes; that is correct.
    As for adding nodes, make sure all the prerequisites are met on that node.
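    (A sketch of running the job and Pig after SSHing to a cluster node; the example jar location, HDFS paths, and hostname below are assumptions for a typical HDP 1.x install, not exact values from this thread.)

    # Log in to one of the cluster nodes (placeholder hostname)
    $ ssh root@m1.example.com

    # Put the input file into HDFS and run the bundled wordcount example;
    # the output directory must not already exist in HDFS
    $ hadoop fs -put sample20gb.txt /user/root/sample20gb.txt
    $ hadoop jar /usr/lib/hadoop/hadoop-examples*.jar wordcount /user/root/sample20gb.txt /user/root/wordcount-out

    # Check the results
    $ hadoop fs -cat /user/root/wordcount-out/part-* | head

    # Start the Pig Grunt shell from the same node; if it errors, check that the
    # Pig client is installed on that node and JAVA_HOME is set
    $ pig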


    I am trying a 2-node cluster (these are different systems than the ones we discussed above).
    I have SSH running and the mount point is /root/.
    At Deployment Progress I am getting a failure at the HDFS test.

    Sasha J

    As usual,
    please run the check script mentioned in the following post:
    And please do not post whole logs here in the forums…

    Thank you!


    OK. I am running the wordcount MapReduce program on a 20 GB text file, and it is failing with a Java heap size error. I tried increasing the heap memory size, but it still gives the same error.

    Sasha J

    How is this heap size problem related to cluster installation?
    Please post heap questions to the corresponding forum thread.
    This thread is for installation problems only.

    Thank you!


    Is it possible to remove a node from the cluster?

    Sasha J

    Take a look here:

    Or Google for “hadoop decommissioning node”.

    This node-removal functionality is not integrated into HMC yet.
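    (For reference, the usual Hadoop 1.x decommissioning flow outside of HMC looks roughly like this; the excludes-file location and hostname are assumptions.)

    # On the NameNode, hdfs-site.xml should point dfs.hosts.exclude at an excludes file,
    # e.g. /etc/hadoop/conf/dfs.exclude. Add the node to be removed to that file:
    $ echo "m3.example.com" >> /etc/hadoop/conf/dfs.exclude

    # Ask the NameNode to re-read the excludes file and begin decommissioning
    $ hadoop dfsadmin -refreshNodes

    # Watch the node's status until it reports "Decommissioned", then stop its services
    $ hadoop dfsadmin -report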


    ok.. thanks..


    While installing a 3-node cluster, the “preparing discovered nodes” step at the Add Nodes stage is taking a long time. Out of the 3 nodes, 2 have succeeded, but 1/3 has been showing as still in progress for a long time.

    Sasha J

    Did it complete successfully or with an error?
    What does “a long time” mean here: 5 minutes, 30 minutes, 1 hour, more?
    Please be more specific.
    Please run the check script mentioned in the following post:


