Ambari agent socket time out (forum: HDP on Linux – Installation)

This topic contains 6 replies, has 3 voices, and was last updated by  Ardavan Moinzadeh 11 months ago.

  • Creator
    Topic
  • #43461

    Ardavan Moinzadeh
    Participant

    After installing HDP 1.3 on my 3-node cluster, I have all services running, with a few alerts.
    Alert 1: Ambari-agent socket timeout alert for 2 of my slave nodes. There is nothing specific in the Ambari agent logs; when I check, every service is getting queued.
    Alert 2: JobTracker CPU utilization error. Although the MapReduce smoke test finishes successfully, I have no job history on my job tab.

    Any solution or suggestion is appreciated.

    Thank you

Viewing 6 replies - 1 through 6 (of 6 total)


  • Author
    Replies
  • #43937

    Ardavan Moinzadeh
    Participant

    In Nagios, under the service configuration, I see the following for the Ambari agent process:
    $USER1$/check_tcp -H $HOSTADDRESS$ -p 8670 -w 1 -c 1
    Shouldn’t this be:
    $USER1$/check_tcp -H $HOSTADDRESS$ -p 8440 -w 1 -c 1 ?
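
    For anyone wanting to try the same kind of probe outside Nagios, here is a minimal Python sketch of a TCP connect test against both ports mentioned above. It is not the check_tcp plug-in itself; the function name check_tcp, the 1-second timeout (chosen to mirror the -w 1 -c 1 values), and the default host are assumptions for illustration only.

```python
import socket
import sys
import time


def check_tcp(host, port, timeout=1.0):
    """Attempt a TCP connection; return (reachable, seconds_elapsed).

    Hypothetical helper, loosely modelled on what a TCP port probe does.
    """
    start = time.monotonic()
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        reachable = False
    return reachable, time.monotonic() - start


if __name__ == "__main__":
    # Host defaults to localhost; pass a node name as the first argument.
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    for port in (8670, 8440):
        reachable, elapsed = check_tcp(host, port)
        status = "open" if reachable else "unreachable"
        print(f"{host}:{port} {status} ({elapsed:.3f}s)")
```

    Running it from the Nagios host against each slave node, and from each slave against the Ambari server, would show which of the two ports actually answers on which machine.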

    #43930

    Ardavan Moinzadeh
    Participant

    I also checked SNMP; it’s running on all 3 nodes.
    Thanks

    #43929

    Ardavan Moinzadeh
    Participant

    Dave,
    I was able to telnet to the Ambari server from both of the slaves that are getting the ambari-agent socket timeout error.
    When I run top, there are two MapReduce processes running. I killed them and restarted MapReduce, but I am still getting the errors.
    Thanks

    #43920

    Dave
    Moderator

    Hi Ardavan,

    Can you telnet to the ambari-server on port 8440 from the machines that are getting the timeout?

    For the CPU utilization error:

    CPU utilization alert

    This alert is triggered if the per-CPU utilization percentage on the master host exceeds the configured critical threshold. This alert uses the Nagios check_snmp_load plug-in.
    Possible causes

    • Unusually high CPU utilization
    This can be caused by a very unusual job/query workload, but is generally the sign of an issue in the daemon
    • The SNMP daemon running on the master node is down, producing an unknown status
    Potential remedies

    • Use the “top” command to determine which processes are consuming excess CPU
    • Reset the offending process
    • Check the status of the SNMP daemon
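
    To make the alert logic above concrete, here is a small Python sketch of how a threshold check of this kind can map a reading to a Nagios state. The exit codes 0 to 3 are the standard Nagios plug-in return codes; the function name cpu_alert_state and the 80/90 thresholds are hypothetical and are not taken from check_snmp_load or from the HDP configuration.

```python
# Standard Nagios plug-in return codes
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def cpu_alert_state(cpu_percent, warn, crit):
    """Map a CPU utilization reading to a Nagios state.

    cpu_percent is None when no reading could be obtained, for example
    because the SNMP daemon on the host did not answer.
    """
    if cpu_percent is None:
        return UNKNOWN
    if cpu_percent > crit:
        return CRITICAL
    if cpu_percent > warn:
        return WARNING
    return OK


if __name__ == "__main__":
    # Hypothetical thresholds, for illustration only
    for reading in (35.0, 85.0, 97.5, None):
        print(reading, cpu_alert_state(reading, warn=80.0, crit=90.0))
```

    This is why both a genuinely busy daemon and a stopped SNMP daemon show up as a problem on the same check: the first crosses the threshold, while the second yields no reading at all.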

    Let me know how you get on,

    Thanks

    Dave

    #43918

    Ardavan Moinzadeh
    Participant

    Hello Koelli,
    - Yes, I did try restarting ambari-agent, and it didn’t change anything. I even tried killing the process and starting it again.

    - No errors or other information about this issue appear in the JobTracker logs.

    #43886

    Koelli Mungee
    Moderator

    Hi Ardavan

    Did you try restarting the ambari-agent on the nodes that are getting the alert? Do you have more details on the JobTracker CPU utilization error?

    -koelli
