Services stuck in Ambari

This topic contains 4 replies, has 3 voices, and was last updated by  Seth Lyubich 1 year, 6 months ago.

  • #16073

    Francois BORIE
    Participant

    Hello,

    After some successful work with my Hadoop cluster, I’m having some trouble managing services.

    I tried to stop all the Hadoop services completely in order to modify some log4j log-rotation parameters, and since then some services have been stuck (they are in state STOP_FAILED, although they were shut down successfully).

    The symptoms are the following:

    I’m able to launch actions both from the Ambari server UI and directly through the API with curl. But none of the actions I launch is ever taken into account; they stay in state QUEUED or PENDING …

    Also, the puppet site files corresponding to those actions are no longer generated in /var/lib/ambari-agent/data, as they were before.

    (I succeeded in stopping and starting all those services manually, using custom puppet manifests, so the problem does not seem to be at the service level.)

    It looks like the Ambari agents are doing nothing and never pick up the Ambari server actions, which end in the TIMEDOUT state.

    I see nothing in particular in either the Ambari agent or the Ambari server logs that could explain this behavior. I have already tried restarting all of them, and even rebooting the servers that make up my HDP cluster, but the issue is still there.

    Below is an example, for the Nagios service, of what I am describing:

    {
      "href" : "http://obench20s:8080/api/v1/clusters/hadoop_poc/requests/90/tasks/361",
      "Tasks" : {
        "exit_code" : 999,
        "stdout" : "",
        "status" : "QUEUED",
        "stderr" : "",
        "host_name" : "obench20s****",
        "id" : 361,
        "cluster_name" : "hadoop_poc",
        "attempt_cnt" : 1,
        "request_id" : 90,
        "command" : "STOP",
        "role" : "NAGIOS_SERVER",
        "start_time" : 1361895724078,
        "stage_id" : 1
      }
    }

    A little later:

    {
      "href" : "http://obench20s:8080/api/v1/clusters/hadoop_poc/requests/90/tasks/361",
      "Tasks" : {
        "exit_code" : 999,
        "stdout" : "",
        "status" : "TIMEDOUT",
        "stderr" : "",
        "host_name" : "obench20s****",
        "id" : 361,
        "cluster_name" : "hadoop_poc",
        "attempt_cnt" : 2,
        "request_id" : 90,
        "command" : "STOP",
        "role" : "NAGIOS_SERVER",
        "start_time" : 1361895724078,
        "stage_id" : 1
      }
    }
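For readers comparing several task dumps like the two above, the following is a minimal Python sketch that condenses one dump into the fields that matter for this symptom. The function name `summarize_task`, the `UNFINISHED` set, and the "never ran" heuristic (an unfinished status with empty stdout and stderr) are illustrative assumptions, not part of Ambari. The sample is the second dump, retyped with straight quotes and a closing brace so that it parses, and `start_time` is assumed to be milliseconds since the Unix epoch.

```python
import json
from datetime import datetime, timezone

# Second task dump from the post, retyped so that it is valid JSON.
SAMPLE = """
{
  "href": "http://obench20s:8080/api/v1/clusters/hadoop_poc/requests/90/tasks/361",
  "Tasks": {
    "exit_code": 999,
    "stdout": "",
    "status": "TIMEDOUT",
    "stderr": "",
    "host_name": "obench20s****",
    "id": 361,
    "cluster_name": "hadoop_poc",
    "attempt_cnt": 2,
    "request_id": 90,
    "command": "STOP",
    "role": "NAGIOS_SERVER",
    "start_time": 1361895724078,
    "stage_id": 1
  }
}
"""

# Assumption: in these states the server has not received a result from the agent.
UNFINISHED = {"PENDING", "QUEUED", "TIMEDOUT"}


def summarize_task(payload: str) -> dict:
    """Condense one task dump into a few fields (hypothetical helper)."""
    task = json.loads(payload)["Tasks"]
    # Assumption: start_time is in milliseconds since the Unix epoch.
    started = datetime.fromtimestamp(task["start_time"] // 1000, tz=timezone.utc)
    no_output = task["stdout"] == "" and task["stderr"] == ""
    return {
        "task": f'{task["command"]} {task["role"]}',
        "status": task["status"],
        "attempts": task["attempt_cnt"],
        "started_utc": started.isoformat(),
        # Heuristic only: unfinished status plus no output at all.
        "never_ran": task["status"] in UNFINISHED and no_output,
    }


if __name__ == "__main__":
    print(summarize_task(SAMPLE))
```

On the sample, the summary shows a STOP of NAGIOS_SERVER that has been attempted twice and has produced no output, which matches the description of an action that the agent never executed.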

    In the ambari-server log, I get:

    17:28:37,403 DEBUG ResourceProviderImpl:271 - Setting property for resource, resourceType=HostComponent, propertyId=HostRoles/host_name, value=obench20s*****
    17:28:37,403 DEBUG ResourceProviderImpl:271 - Setting property for resource, resourceType=HostComponent, propertyId=HostRoles/state, value=STOPPING
    17:28:37,404 DEBUG ResourceProviderImpl:271 - Setting property for resource, resourceType=HostComponent, propertyId=HostRoles/desired_state, value=INSTALLED

    Ambari-agent logs (from the server where Nagios normally runs):

    INFO 2013-02-26 17:36:30,487 Heartbeat.py:68 - Heartbeat dump: {'componentStatus': [],
    'hostname': 'obench20s****',
    'nodeStatus': {'cause': 'NONE', 'status': 'HEALTHY'},
    'reports': [],
    'responseId': 260,
    'timestamp': 1361896590486}
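The heartbeat dump above can be read as "the agent is alive but reports no work". A minimal Python sketch of that check follows. The helper name `inspect_heartbeat` is hypothetical, the dump is retyped with straight quotes so that it parses as a Python literal, and `timestamp` is assumed to be milliseconds since the Unix epoch.

```python
import ast
from datetime import datetime, timezone

# Heartbeat dump from the agent log, retyped as a valid Python literal.
HEARTBEAT = """{'componentStatus': [],
 'hostname': 'obench20s****',
 'nodeStatus': {'cause': 'NONE', 'status': 'HEALTHY'},
 'reports': [],
 'responseId': 260,
 'timestamp': 1361896590486}"""


def inspect_heartbeat(dump: str) -> dict:
    """Extract the fields relevant to an idle agent (hypothetical helper)."""
    hb = ast.literal_eval(dump)  # safe parsing of a dict literal
    # Assumption: timestamp is in milliseconds since the Unix epoch.
    sent = datetime.fromtimestamp(hb["timestamp"] // 1000, tz=timezone.utc)
    return {
        "healthy": hb["nodeStatus"]["status"] == "HEALTHY",
        "report_count": len(hb["reports"]),
        "sent_utc": sent.isoformat(),
    }


if __name__ == "__main__":
    print(inspect_heartbeat(HEARTBEAT))
```

A healthy node with zero command reports, sent roughly a quarter of an hour after the STOP task's start time, is consistent with an agent that heartbeats normally but never received or executed the queued command.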

    Many thanks for your help.

Viewing 4 replies - 1 through 4 (of 4 total)

  • #17000

    Seth Lyubich
    Keymaster

    This issue is resolved. It is tracked in the following Apache Ambari Jira:

    https://issues.apache.org/jira/browse/AMBARI-1582

    Thanks,
    Seth

    #16208

    Larry Liu
    Moderator

    Hi, Francois

    I just sent an email to you. Let’s take this offline.

    Thanks

    Larry

    #16164

    Francois BORIE
    Participant

    Hi Larry,

    Thanks for your answer.

    I’m using the following versions of Ambari server and Ambari agent:

    ambari-server-1.2.1.2-1.noarch
    ambari-agent-1.2.1.2-1.x86_64

    Regards,

    François

    #16101

    Larry Liu
    Moderator

    Hi, Francois

    What version of Ambari are you using? The most recent Ambari release, 1.2.1, should fix the issue.

    Thanks

    Larry
