HDP on Linux – Installation Forum

Ambari agent socket time out

  • #43461
    Ardavan Moinzadeh
    Participant

    After installing HDP 1.3 on my 3-node cluster, all services are running, but I see a few alerts.
    Alert 1: Ambari-agent socket timeout alert for 2 of my slave nodes. There is nothing specific in the ambari-agent logs; when I check, every service is getting queued.
    Alert 2: JobTracker CPU utilization error. Although the MapReduce smoke test finishes successfully, I have no job history on my job tab.

    Any solution/suggestion is appreciated

    Thank you

  • #43886
    Koelli Mungee
    Moderator

    Hi Ardavan

    Did you try restarting the ambari-agent that is getting the alert? Do you have more details on the job tracker CPU utilization error?

    -koelli

    #43918
    Ardavan Moinzadeh
    Participant

    Hello Koelli,
    - Yes, I tried restarting ambari-agent and it didn’t change anything. I even tried killing the process and starting it again.

    - No error or other information about this issue is shown in the JobTracker logs.

    #43920
    Dave
    Moderator

    Hi Ardavan,

    Can you telnet to the ambari-server on port 8440 from the machines that are getting the timeout?
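
    Where telnet is not installed on the slave nodes, the same reachability test can be scripted. The following is a minimal sketch, not part of Ambari: the function name `can_connect`, the 3-second timeout, and the hostname in the usage comment are all assumptions made for illustration. Only port 8440 comes from this thread.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the name and performs the TCP handshake,
        # which is the part of a telnet test that matters here.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Usage, run on each slave node (the hostname is a placeholder):
#   can_connect("ambari-server.example.com", 8440)
```

    A result of True only shows that something accepts connections on that port. It does not show that the agent and server are exchanging heartbeats.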

    For the CPU utilization error:

    CPU utilization alert

    This alert is triggered if the percent per CPU utilization on the master host exceeds the configured critical threshold. This alert uses the Nagios check_snmp_load plug-in.

    Possible causes

    • Unusually high CPU utilization
    This can be caused by a very unusual job/query workload, but is generally the sign of an issue in the daemon
    • The SNMP daemon running on the master node is down, producing an unknown status

    Potential remedies

    • Use the “top” command to determine which processes are consuming excess CPU
    • Reset the offending process
    • Check the status of the SNMP daemon
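
    Because top is interactive, a non-interactive snapshot can be easier to paste into a thread like this one. The sketch below ranks processes from the output of `ps -eo pid,pcpu,comm` (standard procps options on Linux). The function names are hypothetical, and note that the %CPU column from ps is an average over each process's lifetime, so it will not match the instantaneous figures top shows.

```python
import subprocess


def parse_ps(output: str):
    """Parse `ps -eo pid,pcpu,comm` text into (pid, cpu_percent, command) tuples."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # first line is the column header
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # skip malformed or truncated lines
        pid, pcpu, comm = parts
        rows.append((int(pid), float(pcpu), comm))
    return rows


def busiest(output: str, limit: int = 5):
    """Return the `limit` rows with the highest CPU percentage, highest first."""
    return sorted(parse_ps(output), key=lambda row: row[1], reverse=True)[:limit]


def snapshot() -> str:
    """Run ps once and return its raw text (assumes a Linux host with procps)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Usage on the master host:
#   for pid, cpu, name in busiest(snapshot()):
#       print(pid, cpu, name)
```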

    Let me know how you get on,

    Thanks

    Dave

    #43929
    Ardavan Moinzadeh
    Participant

    Dave,
    I was able to telnet to the Ambari server from both of the slaves that are getting the ambari-agent socket timeout error.
    When I run top, there are two MapReduce processes running. I killed them and restarted MapReduce, but I am still getting the errors.
    Thanks

    #43930
    Ardavan Moinzadeh
    Participant

    I also checked SNMP; it’s running on all 3 nodes.
    Thanks

    #43937
    Ardavan Moinzadeh
    Participant

    In Nagios, under service configuration, I see the following for the Ambari agent process:
    $USER1$/check_tcp -H $HOSTADDRESS$ -p 8670 -w 1 -c 1
    Shouldn’t this be:
    $USER1$/check_tcp -H $HOSTADDRESS$ -p 8440 -w 1 -c 1
    instead?
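
    One way to see what each variant of that check would report is to time the connection to both ports on an affected host. The sketch below only approximates what check_tcp does with -w and -c (warning and critical response times in seconds). It is not the plug-in itself, and the exact threshold comparison and the 10-second default timeout are assumptions.

```python
import socket
import time

OK, WARNING, CRITICAL = 0, 1, 2  # conventional Nagios plug-in states


def check_tcp(host, port, warn=1.0, crit=1.0, timeout=10.0):
    """Time a TCP connect and map it to a Nagios-style state.

    Returns (state, elapsed_seconds); elapsed is None if the connect failed.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass  # connection opened; nothing is sent or read
    except OSError:
        # Refused, timed out or unresolvable: treated as critical here.
        return CRITICAL, None
    elapsed = time.monotonic() - start
    # Assumption: thresholds are compared with >=; the real plug-in may differ.
    if elapsed >= crit:
        return CRITICAL, elapsed
    if elapsed >= warn:
        return WARNING, elapsed
    return OK, elapsed


# Usage (the hostname is a placeholder for one of the alerting slave nodes):
#   for port in (8670, 8440):
#       print(port, check_tcp("slave1.example.com", port))
```

    As far as I know, 8670 is the default agent ping port (ping_port in ambari-agent.ini) and 8440 is a port the Ambari server listens on, so the two may answer on different hosts. Treat that as an assumption and verify it against your own configuration.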
