HDP on Linux – Installation Forum

Ambari agent socket time out

  • #43461
    Ardavan Moinzadeh

    After installing HDP 1.3 on my 3-node cluster, I have all services running, but with a few alerts.
    Alert 1: Ambari-agent socket timeout alert for 2 of my slave nodes. There is nothing specific in the ambari-agent logs; when I check, every service is getting queued.
    Alert 2: JobTracker CPU utilization error. Although the MapReduce smoke test finishes successfully, I have no job history on my Jobs tab.

    Any solution/suggestion is appreciated

    Thank you


  • #43886
    Koelli Mungee

    Hi Ardavan

    Did you try restarting the ambari-agent that is getting the alert? Do you have more details on the job tracker CPU utilization error?


    Ardavan Moinzadeh

    Hello Koelli,
    - Yes, I did try restarting ambari-agent and it didn’t change anything. I even tried killing the process and starting it again.

    - No error or other information about this issue is shown in the JobTracker logs.


    Hi Ardavan,

    Can you telnet to the ambari-server on port 8440 from the machines that are getting the timeout?
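
If telnet is not installed on the slave nodes, the same reachability test can be sketched in Python. This is a minimal sketch, not part of Ambari: the function name `can_connect` and the 5-second default timeout are assumptions chosen for illustration.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Any failure (refused connection, timeout, unresolvable hostname) is
    reported as False, since all of these raise a subclass of OSError.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run on each affected slave, `can_connect("<ambari-server-host>", 8440)` should return True if the server port is reachable, where the hostname is whatever your Ambari server is called.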

    For the CPU utilization error:

    CPU utilization alert

    This alert is triggered if the percent per CPU utilization on the master host exceeds the configured critical threshold. This alert uses the Nagios check_snmp_load plug-in.

    Possible causes

    • Unusually high CPU utilization
    This can be caused by a very unusual job/query workload, but is generally the sign of an issue in the daemon
    • The SNMP daemon running on the master node is down, producing an unknown status

    Potential remedies

    • Use the “top” command to determine which processes are consuming excess CPU
    • Reset the offending process
    • Check the status of the SNMP daemon
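
As a rough cross-check of what the alert reports, the load on the master host can be compared against per-CPU thresholds by hand. The sketch below uses the system load average as a stand-in; it is not how check_snmp_load itself works, and the function name `load_status` and the thresholds 1.0 and 1.5 are hypothetical values for illustration only.

```python
import os


def load_status(load_avg, cpu_count, warn_per_cpu, crit_per_cpu):
    """Classify a load average, normalised per CPU, against two thresholds."""
    per_cpu = load_avg / cpu_count
    if per_cpu >= crit_per_cpu:
        return "CRITICAL"
    if per_cpu >= warn_per_cpu:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # os.getloadavg() is available on Unix-like systems only.
    one_minute_load = os.getloadavg()[0]
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    # Hypothetical thresholds; use the ones configured for your alert.
    print(load_status(one_minute_load, cpus, warn_per_cpu=1.0, crit_per_cpu=1.5))
```

If this reports OK while Nagios still shows the alert, the SNMP daemon or the alert configuration is the more likely place to look than the JobTracker itself.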

    Let me know how you get on,



    Ardavan Moinzadeh

    I was able to telnet to the ambari-server from both of the slaves that are getting the ambari-agent socket timeout error.
    When I run top, there are two MapReduce processes running. I killed them and restarted MapReduce, and I am still getting the errors.

    Ardavan Moinzadeh

    I also checked SNMP; it’s running on all 3 nodes.

    Ardavan Moinzadeh

    In Nagios, under the service configuration, I see the following for the Ambari agent process:
    $USER1$/check_tcp -H $HOSTADDRESS$ -p 8670 -w 1 -c 1
    Shouldn’t this be:
    $USER1$/check_tcp -H $HOSTADDRESS$ -p 8440 -w 1 -c 1
    ?
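
One way to see which port the check should target is to probe both ports on a slave host and time the connection. The sketch below approximates what `check_tcp -w 1 -c 1` does: it measures how long the TCP connect takes and maps the result to a Nagios-style status. It is an approximation under stated assumptions, not the plug-in's actual code; the function name `tcp_check` and the 10-second connect timeout are assumptions.

```python
import socket
import time


def tcp_check(host, port, warn=1.0, crit=1.0, timeout=10.0):
    """Time a TCP connect and return (status, elapsed_seconds).

    A refused or timed-out connection is reported as CRITICAL with no
    elapsed time; otherwise the connect time is compared against the
    warning and critical thresholds, both in seconds.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.monotonic() - start
    except OSError:
        return "CRITICAL", None
    if elapsed >= crit:
        return "CRITICAL", elapsed
    if elapsed >= warn:
        return "WARNING", elapsed
    return "OK", elapsed
```

Calling `tcp_check(slave_host, 8670)` and `tcp_check(slave_host, 8440)` from the Nagios host shows which port on the slave accepts connections and how close the connect time is to the 1-second thresholds.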
