【Question title】: How to view docker-compose healthcheck logs?
【Posted】: 2017-08-01 22:53:09
【Question】:

In my docker-compose.yml I have the following service with a healthcheck section. I want to know whether MariaDB is actually ready to handle queries. A service named cmd is configured to depend on it with condition: service_healthy.

  db:
    image: mariadb:10
    environment:
      MYSQL_RANDOM_ROOT_PASSWORD: 1
      MYSQL_USER: user
      MYSQL_PASSWORD: password
      MYSQL_DATABASE: database
    healthcheck:
      test: ["CMD", "mysql", "--user=user", "--password=password", "--execute='SELECT 1'", "--host=127.0.0.1", "--port=3306"]
      interval: 1s
      retries: 30

This healthcheck does not work and reports the service as unhealthy.

How can I check the output of the test CMD?

【Comments on the question】:

    Tags: logging docker docker-compose health-monitoring


    【Solution 1】:

    You can use:

    docker inspect --format "{{json .State.Health }}" <container name> | jq
    

    Output:

    {
        "Status": "unhealthy",
        "FailingStreak": 63,
        "Log": [
            {
                "Start": "2017-03-11T20:49:19.668895201+03:30",
                "End": "2017-03-11T20:49:19.735722044+03:30",
                "ExitCode": 1,
                "Output": "ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ''SELECT 1'' at line 1\n"
            }
        ]
    }
    

    Then look at the Output field of each log entry.
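The Output field above points at the cause. In the exec form (`["CMD", ...]`) no shell is involved, so every array element reaches mysql verbatim, and `--execute='SELECT 1'` arrives with its single quotes intact; the server then rejects the quoted string as a statement. A minimal sketch of the difference, using a hypothetical `show_args` helper that is not part of Docker:

```shell
# Hypothetical helper (not a Docker command): print each argument in brackets
# so that stray quote characters become visible.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Exec form: the JSON array element is handed over as-is, quotes included.
show_args "--execute='SELECT 1'"
# prints [--execute='SELECT 1']

# Shell form: the shell strips the quotes before the program sees the argument.
show_args --execute='SELECT 1'
# prints [--execute=SELECT 1]
```

Writing the array element as `"--execute=SELECT 1"` in the compose file should therefore pass the statement through unquoted. This is inferred from the error message, not tested against the asker's setup.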

    To get only the output:

    docker inspect --format "{{json .State.Health }}" mariadb_db_1 | jq '.Log[].Output'
    

    For swarm mode, use the following format (thanks to @shotgunner for pointing this out):

    {{json .Spec.TaskTemplate.ContainerSpec.Healthcheck}}

    Feel free to swap jq for whatever tool you use to pretty-print JSON.
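If jq is not installed, the Go template passed to `--format` can walk the log entries itself. A minimal sketch, assuming a hypothetical helper name `health_log`; the `ExitCode` and `Output` fields are the ones visible in the JSON above, and the container must define a healthcheck, otherwise `.State.Health` is empty and the template fails:

```shell
# Hypothetical helper: print the exit code and output of every recorded
# healthcheck probe for one container, one entry per line, without jq.
# $1 is a container name or ID.
health_log() {
    docker inspect \
        --format '{{range .State.Health.Log}}[exit {{.ExitCode}}] {{println .Output}}{{end}}' \
        "$1"
}

# Usage (container name taken from the answer above):
#   health_log mariadb_db_1
```

Docker keeps only the most recent probe results, so this shows the tail of the history, not a complete log.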

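    【Discussion】:

    • Hi, do you know of any way to append this log to the same JSON log file that the entrypoint writes to? I would like to see the healthcheck logs when running docker logs, all in one place.
    • In swarm mode on a manager, {{json .State.Health }} does not work. Use {{json .Spec.TaskTemplate.ContainerSpec.Healthcheck }} instead.
    • Since you are using jq anyway, you can do the query with it; it is shorter and reads better: docker inspect $CONTAINER | jq '.[].State.Health'.
    • I wonder whether the first command is formatted correctly. It contains a suspicious blank space.
    【Solution 2】:

    docker-compose ps shows the state of each service, including its health if a healthcheck is defined. This is useful for a basic overview.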

    % docker-compose ps
                    Name                                Command                       State                                       Ports                            
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------
    remix-theme-editor_analytics_1            /bin/sh -c /analytics/run. ...   Up                                                                                   
    remix-theme-editor_base_1                 /bin/bash                        Exit 0                                                                               
    remix-theme-editor_flower_1               /entrypoint --environment  ...   Exit 137                                                                             
    remix-theme-editor_frontend_1             /bin/sh -c perl -p -i -e ' ...   Exit 137                                                                             
    remix-theme-editor_js-app_1               npm run                          Exit 0                                                                               
    remix-theme-editor_mq_1                   docker-entrypoint.sh rabbi ...   Up (healthy)            15671/tcp, 15672/tcp, 25672/tcp, 4369/tcp, 5671/tcp, 5672/tcp
    remix-theme-editor_mysql-migration_1      /entrypoint_mysql-migratio ...   Exit 0                                                                               
    remix-theme-editor_mysql_1                /bin/sh -c /entrypoint_wra ...   Up (health: starting)   127.0.0.2:3308->3306/tcp                                     
    remix-theme-editor_page-renderer_1        npm run start:watch              Up                                                                                   
    remix-theme-editor_python-app_1           /entrypoint                      Exit 2                                                                               
    remix-theme-editor_redis_1                docker-entrypoint.sh /bin/ ...   Up (health: starting)   6379/tcp                                                     
    remix-theme-editor_scheduler_1            /entrypoint --environment  ...   Exit 137                                                                             
    remix-theme-editor_socket_1               /entrypoint --environment  ...   Exit 1                                                                               
    remix-theme-editor_static-builder_1       npm run watch                    Up                                                                                   
    remix-theme-editor_static-http_1          nginx -g daemon off;             Up                      127.0.0.2:6544->443/tcp, 80/tcp                              
    remix-theme-editor_web_1                  /entrypoint --environment  ...   Exit 1                                                                               
    remix-theme-editor_worker_1               /entrypoint --environment  ...   Exit 1                                                                               
    remix-theme-editor_worker_screenshots_1   /entrypoint --environment  ...   Exit 1     
    

    If you want more detail, combine docker inspect with docker-compose ps -q <service-name>.

    % docker inspect --format "{{json .State.Health }}" $(docker-compose ps -q mq) | jq
    {
      "Status": "starting",
      "FailingStreak": 48,
      "Log": [
        {
          "Start": "2018-10-03T00:40:18.671527745-05:00",
          "End": "2018-10-03T00:40:18.71729051-05:00",
          "ExitCode": -1,
          "Output": "OCI runtime exec failed: exec failed: container_linux.go:348: starting container process caused \"exec: \\\"nc\\\": executable file not found in $PATH\": unknown"
        },
    ...
    

    You can always debug a healthcheck by running its command by hand. For example:

    % docker exec -it $(docker-compose ps -q socket) nc -w2 127.0.0.1 5672
    (UNKNOWN) [127.0.0.1] 5672 (?) : Connection refused
    

    You can also do the same from a shell inside the container:

    % docker exec -it $(docker-compose ps -q socket) bash
    root@b5da5207d344:~/src# nc -w2 127.0.0.1 5672
    (UNKNOWN) [127.0.0.1] 5672 (?) : Connection refused
    root@b5da5207d344:~/src# echo $?
    1
    

    Finally, you can simply run docker-compose up in one terminal window and docker-compose logs -f in another. This shows all the logs from the containers managed by docker-compose.
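To reproduce a failing check by hand as described above, it helps to know exactly which command Docker runs and what state it currently reports. A small sketch with hypothetical function names (`healthcheck_cmd`, `health_status`) that reads both from docker inspect; it assumes the container defines a healthcheck:

```shell
# Hypothetical helper: print the configured healthcheck command as a JSON
# array, e.g. ["CMD","nc","-w2","127.0.0.1","5672"]. $1 is a container name or ID.
healthcheck_cmd() {
    docker inspect --format '{{json .Config.Healthcheck.Test}}' "$1"
}

# Hypothetical helper: print the current health state
# (starting, healthy or unhealthy). $1 is a container name or ID.
health_status() {
    docker inspect --format '{{.State.Health.Status}}' "$1"
}

# Usage with a compose service (service name taken from the answer above):
#   healthcheck_cmd "$(docker-compose ps -q mq)"
#   health_status "$(docker-compose ps -q mq)"
```

The printed command can then be pasted after docker exec -it <container> to see its output and exit code directly.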

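    【Discussion】:

    • Hi, do you know of any way to append this log to the same JSON log file that the entrypoint writes to? I would like to see the healthcheck logs when running docker logs, all in one place.
    【Solution 3】:

    Swarm mode

    1. First, on a manager, use docker service ps service_name to find the ID of the failed task and the node it ran on.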
    manager$ docker service ps service_name
    ID                  NAME                 IMAGE            NODE                DESIRED STATE       CURRENT STATE               ERROR                              PORTS
    liwww3qzg9dz        service_name.1       image_name:1.3   s-3                 Running             Running 27 seconds ago                                         
    hcgxmwk2efj0         \_ service_name.1   image_name:1.3   s-3                 Shutdown            Failed about a minute ago   "task: non-zero exit (137): do…"  
    

    In this example, hcgxmwk2efj0 is the task ID and s-3 is the node name.

    2. Then, on the manager, use docker inspect --format "{{json .Status.ContainerStatus.ContainerID }}" task_id to get the container ID.
    manager$ docker inspect --format "{{json .Status.ContainerStatus.ContainerID }}" hcgxmwk2efj0
    "412b09d5244047b31471248fd9a0807e5ea42406fb8f5b1701df2244933e30c8"
    
    3. Then ssh to that node and use the command docker inspect --format "{{json .State.Health }}" container_id | jq to get the healthcheck log. (The | jq part is optional in this command.)
    s-3$ docker inspect --format "{{json .State.Health }}" 412b09d5244047b31471248fd9a0807e5ea42406fb8f5b1701df2244933e30c8 | jq
    {
      "Status": "unhealthy",
      "FailingStreak": 3,
      "Log": [
        {
          "Start": "2021-09-07T06:10:05.233163051Z",
          "End": "2021-09-07T06:10:07.585487343Z",
          "ExitCode": 0,
          "Output": "... log 1 ..."
        },
        {
          "Start": "2021-09-07T06:10:37.644936244Z",
          "End": "2021-09-07T06:10:39.881196276Z",
          "ExitCode": 0,
          "Output": "... log 2 ..."
        },
        {
          "Start": "2021-09-07T06:11:10.16172012Z",
          "End": "2021-09-07T06:11:25.161912411Z",
          "ExitCode": -1,
          "Output": "Health check exceeded timeout (15s)"
        },
        {
          "Start": "2021-09-07T06:11:55.297395088Z",
          "End": "2021-09-07T06:12:10.302928565Z",
          "ExitCode": -1,
          "Output": "Health check exceeded timeout (15s)"
        },
        {
          "Start": "2021-09-07T06:12:40.371234778Z",
          "End": "2021-09-07T06:12:55.371393914Z",
          "ExitCode": -1,
          "Output": "Health check exceeded timeout (15s)"
        }
      ]
    }
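
The last two steps can be chained from the manager. A rough sketch under two assumptions the answer does not state: the node name printed by docker service ps (s-3 here) is also reachable as an SSH host, and the function names `task_container_id` and `task_health` are made up for illustration:

```shell
# Hypothetical helper: print the container ID behind a swarm task.
# Run on a manager. $1 is the task ID from `docker service ps`.
task_container_id() {
    docker inspect --format '{{.Status.ContainerStatus.ContainerID}}' "$1"
}

# Hypothetical helper: print the healthcheck record of a task's container by
# running docker inspect on the node that hosted it.
# $1 is the task ID, $2 the node name (assumed to be usable as an SSH host).
task_health() {
    task_id=$1
    node=$2
    container_id=$(task_container_id "$task_id") || return 1
    # The inner single quotes protect the template from the remote shell.
    ssh "$node" docker inspect --format "'{{json .State.Health}}'" "$container_id"
}

# Usage (IDs taken from the example above):
#   task_health hcgxmwk2efj0 s-3
```

The container has to still exist on the node; once it has been removed, docker inspect has nothing left to report.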
    

    【Discussion】: