【Question title】: amazon-ecs-agent stops my application every 2~3 minutes
【Posted】: 2018-06-05 20:08:13
【Question】:

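Summary

Apparently, the ECS agent is ignoring my ECS_CONTAINER_STOP_TIMEOUT setting of 1 hour.

Description

I have a container that needs some time to finish its task, and because I am in a validation phase, I set the ECS_CONTAINER_STOP_TIMEOUT variable to 1h so that the agent would not interfere with my application. But every 2~3 minutes the agent still stops my application.

As I understand it, the agent should wait the configured 1 hour before trying to stop my container, right?

Here is the agent log (Luigi and Model are my applications):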

2018-04-09T11:44:06Z [INFO] Managed task [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f]: sending task change event [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f -> RUNNING, Known Sent: NONE, PullStartedAt: 2018-04-09 11:44:04.75112162 +0000 UTC m=+628.146626082, PullStoppedAt: 2018-04-09 11:44:05.740402013 +0000 UTC m=+629.135906466, ExecutionStoppedAt: 0001-01-01 00:00:00 +0000 UTC]
2018-04-09T11:44:06Z [INFO] TaskHandler: batching container event: arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f model -> RUNNING, Known Sent: NONE
2018-04-09T11:44:06Z [INFO] TaskHandler: Adding event: TaskChange: [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f -> RUNNING, Known Sent: NONE, PullStartedAt: 2018-04-09 11:44:04.75112162 +0000 UTC m=+628.146626082, PullStoppedAt: 2018-04-09 11:44:05.740402013 +0000 UTC m=+629.135906466, ExecutionStoppedAt: 0001-01-01 00:00:00 +0000 UTC, arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f luigi -> RUNNING, Ports [{8082 8080 0.0.0.0 0}], Known Sent: NONE, arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f model -> RUNNING, Known Sent: NONE] sent: false
2018-04-09T11:44:06Z [INFO] TaskHandler: Sending task change: TaskChange: [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f -> RUNNING, Known Sent: NONE, PullStartedAt: 2018-04-09 11:44:04.75112162 +0000 UTC m=+628.146626082, PullStoppedAt: 2018-04-09 11:44:05.740402013 +0000 UTC m=+629.135906466, ExecutionStoppedAt: 0001-01-01 00:00:00 +0000 UTC, arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f luigi -> RUNNING, Ports [{8082 8080 0.0.0.0 0}], Known Sent: NONE, arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f model -> RUNNING, Known Sent: NONE] sent: false
2018-04-09T11:44:06Z [INFO] Managed task [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f]: sent task change event [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f -> RUNNING, Known Sent: NONE, PullStartedAt: 2018-04-09 11:44:04.75112162 +0000 UTC m=+628.146626082, PullStoppedAt: 2018-04-09 11:44:05.740402013 +0000 UTC m=+629.135906466, ExecutionStoppedAt: 0001-01-01 00:00:00 +0000 UTC]
2018-04-09T11:44:06Z [INFO] Managed task [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f]: redundant container state change. model to RUNNING, but already RUNNING
2018-04-09T11:44:14Z [INFO] Saving state! module="statemanager"
2018-04-09T11:46:42Z [INFO] Saving state! module="statemanager"
2018-04-09T11:46:42Z [INFO] Managed task [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f]: Cgroup resource set up for task complete
2018-04-09T11:46:42Z [INFO] Task engine [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f]: stopping container [luigi]
2018-04-09T11:46:42Z [INFO] Task engine [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f]: stopping container [model]
2018-04-09T11:46:43Z [WARN] Error converting stats for container 823686bc5bcb5172a9f3d3fd6c0f4fd2a0fea924870f990f6a74475e6b840674: Invalid container statistics reported, no cpu core usage reported
2018-04-09T11:46:43Z [INFO] Task [arn:aws:ecs:sa-east-1:445147183740:task/22577c77-e2b5-4ca6-81bc-e9c214c1a23f]: recording execution stopped time. Essential container [luigi] stopped at: 2018-04-09 11:46:43.485611165 +0000 UTC m=+786.881116035

I have already manually changed ecs.config and restarted the ECS agent to apply the new configuration.
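
For reference, a minimal sketch of the configuration change described above. The helper name `set_config_option` is hypothetical. The path `/etc/ecs/ecs.config` and the restart commands in the comments are assumptions about a typical ECS container instance and are not taken from this question.

```shell
#!/bin/bash
# Hypothetical helper: set KEY=VALUE in an env-style config file,
# replacing any earlier line for the same key.
set_config_option() {
    local file="$1" key="$2" value="$3"
    touch "$file"
    # Keep every line except an existing assignment of this key.
    # grep exits non-zero when nothing is left, hence the "|| true".
    grep -v "^${key}=" "$file" > "${file}.tmp" || true
    echo "${key}=${value}" >> "${file}.tmp"
    mv "${file}.tmp" "$file"
}

# Example usage (as root on the container instance; path is an assumption):
#   set_config_option /etc/ecs/ecs.config ECS_CONTAINER_STOP_TIMEOUT 1h
#   systemctl restart ecs    # assumed; older init systems may use: stop ecs && start ecs
```

The agent reads this file only at startup, which is why a restart is needed after the change.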

    Tags: amazon-web-services amazon-ecs


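    【Solution 1】:

    The ECS agent sends SIGTERM to the container, and the application must handle SIGTERM if it wants to avoid being shut down right away. After sending SIGTERM, the agent waits for the time configured in ECS_CONTAINER_STOP_TIMEOUT and then sends SIGKILL to the container.

    SIGTERM handling example: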

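    #!/bin/bash
    
    # Runs when the script receives SIGINT or SIGTERM.
    exit_script() {
        echo "SIGTERM captured..."
        echo "Cleaning up..."
        trap - SIGINT SIGTERM # clear the trap so a second signal gets the default action
        # The handler returns here and the loop below keeps running;
        # add `exit 0` if the script should end once cleanup is done.
    }
    trap 'exit_script' SIGINT SIGTERM
    
    # Stand-in for the real work.
    while true
    do
      echo "Waiting for the SIGTERM..."
      sleep 3
    done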
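
To make the stop sequence in this answer concrete, here is a rough local emulation of it: send SIGTERM, wait up to a timeout, then send SIGKILL. This is only a sketch for experimenting on a plain process, not the agent's actual implementation. The function name `stop_with_timeout` and its return codes are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper emulating "SIGTERM, wait, then SIGKILL".
# Usage: stop_with_timeout PID TIMEOUT_SECONDS
# Returns 0 if the process exited on its own, 1 if it had to be killed.
stop_with_timeout() {
    local pid="$1" timeout="$2" waited=0
    # Ask the process to stop; if it is already gone there is nothing to do.
    kill -TERM "$pid" 2>/dev/null || return 0
    # Poll once per second while the process is still alive.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            # Grace period is over: force termination.
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A process that traps SIGTERM and finishes within the timeout gives return code 0. One that ignores SIGTERM is killed once the timeout has elapsed and gives 1. In this model a larger timeout only helps an application that keeps running after it receives SIGTERM.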
    

    Credits for the question to richardpen.
