【Question Title】: Airflow exiting after initialization
【Posted】: 2021-08-09 14:56:31
【Question Description】:

I am trying to set up an Airflow instance with docker-compose on an Ubuntu EC2 instance. After Airflow initializes, the process does not move forward, and I am struggling to fix it. I am using the docker-compose file from here: https://airflow.apache.org/docs/apache-airflow/stable/docker-compose.yaml. When I run the same docker-compose file on my Windows desktop it works fine; I cannot figure out why it does not work on my EC2 instance.

This is the log output I currently see:

$sudo docker-compose up
Creating network "airflow_default" with the default driver
Creating airflow_redis_1    ... done
Creating airflow_postgres_1 ... done
Creating airflow_flower_1            ... done
Creating airflow_airflow-worker_1    ... done
Creating airflow_airflow-init_1      ... done
Creating airflow_airflow-scheduler_1 ... done
Creating airflow_airflow-webserver_1 ... done
Attaching to airflow_postgres_1, airflow_redis_1, airflow_airflow-init_1, airflow_airflow-worker_1, airflow_flower_1, airflow_airflow-scheduler_1, airflow_airflow-webserver_1
postgres_1           |
postgres_1           | PostgreSQL Database directory appears to contain a database; Skipping initialization
postgres_1           |
postgres_1           | 2021-08-09 12:07:35.635 UTC [1] LOG:  starting PostgreSQL 13.3 (Debian 13.3-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
postgres_1           | 2021-08-09 12:07:35.635 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
postgres_1           | 2021-08-09 12:07:35.635 UTC [1] LOG:  listening on IPv6 address "::", port 5432
postgres_1           | 2021-08-09 12:07:35.641 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
postgres_1           | 2021-08-09 12:07:35.648 UTC [25] LOG:  database system was shut down at 2021-08-09 12:07:24 UTC
postgres_1           | 2021-08-09 12:07:35.654 UTC [1] LOG:  database system is ready to accept connections
redis_1              | 1:C 09 Aug 2021 12:07:35.620 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis_1              | 1:C 09 Aug 2021 12:07:35.620 # Redis version=6.2.5, bits=64, commit=00000000, modified=0, pid=1, just started
redis_1              | 1:C 09 Aug 2021 12:07:35.620 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
redis_1              | 1:M 09 Aug 2021 12:07:35.621 * Increased maximum number of open files to 10032 (it was originally set to 1024).
redis_1              | 1:M 09 Aug 2021 12:07:35.621 * monotonic clock: POSIX clock_gettime
redis_1              | 1:M 09 Aug 2021 12:07:35.621 * Running mode=standalone, port=6379.
redis_1              | 1:M 09 Aug 2021 12:07:35.621 # Server initialized
redis_1              | 1:M 09 Aug 2021 12:07:35.622 * Ready to accept connections
airflow-init_1       | BACKEND=postgresql+psycopg2
airflow-init_1       | DB_HOST=postgres
airflow-init_1       | DB_PORT=5432
airflow-init_1       |
airflow-worker_1     | BACKEND=postgresql+psycopg2
airflow-worker_1     | DB_HOST=postgres
airflow-worker_1     | DB_PORT=5432
airflow-worker_1     |
airflow-worker_1     | BACKEND=postgresql+psycopg2
airflow-worker_1     | DB_HOST=postgres
airflow-worker_1     | DB_PORT=5432
airflow-worker_1     |
airflow-scheduler_1  | BACKEND=postgresql+psycopg2
airflow-scheduler_1  | DB_HOST=postgres
airflow-scheduler_1  | DB_PORT=5432
airflow-scheduler_1  |
airflow-scheduler_1  | BACKEND=postgresql+psycopg2
airflow-scheduler_1  | DB_HOST=postgres
airflow-scheduler_1  | DB_PORT=5432
airflow-webserver_1  | BACKEND=postgresql+psycopg2
airflow-webserver_1  | DB_HOST=postgres
airflow-webserver_1  | DB_PORT=5432
airflow-scheduler_1  |
flower_1             | BACKEND=postgresql+psycopg2
flower_1             | DB_HOST=postgres
flower_1             | DB_PORT=5432
airflow-webserver_1  |
flower_1             |
flower_1             | BACKEND=postgresql+psycopg2
flower_1             | DB_HOST=postgres
flower_1             | DB_PORT=5432
flower_1             |
airflow-init_1       | DB: postgresql+psycopg2://airflow:***@postgres/airflow
airflow-init_1       | [2021-08-09 12:07:55,129] {db.py:692} INFO - Creating tables
airflow-init_1       | INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
airflow-init_1       | INFO  [alembic.runtime.migration] Will assume transactional DDL.
airflow-scheduler_1  |   ____________       _____________
airflow-scheduler_1  |  ____    |__( )_________  __/__  /________      __
airflow-scheduler_1  | ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
airflow-scheduler_1  | ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
airflow-scheduler_1  |  _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
airflow-scheduler_1  | [2021-08-09 12:08:00,630] {scheduler_job.py:1266} INFO - Starting the scheduler
airflow-scheduler_1  | [2021-08-09 12:08:00,630] {scheduler_job.py:1271} INFO - Processing each file at most -1 times
flower_1             | [2021-08-09 12:08:00,991] {command.py:137} INFO - Visit me at http://0.0.0.0:5555
flower_1             | [2021-08-09 12:08:01,111] {command.py:142} INFO - Broker: redis://redis:6379/0
flower_1             | [2021-08-09 12:08:01,125] {command.py:145} INFO - Registered tasks:
flower_1             | ['airflow.executors.celery_executor.execute_command',
flower_1             |  'celery.accumulate',
flower_1             |  'celery.backend_cleanup',
flower_1             |  'celery.chain',
flower_1             |  'celery.chord',
flower_1             |  'celery.chord_unlock',
flower_1             |  'celery.chunks',
flower_1             |  'celery.group',
flower_1             |  'celery.map',
flower_1             |  'celery.starmap']
flower_1             | [2021-08-09 12:08:01,218] {mixins.py:229} INFO - Connected to redis://redis:6379/0
airflow-scheduler_1  | [2021-08-09 12:08:02,350] {dag_processing.py:254} INFO - Launched DagFileProcessorManager with pid: 32
airflow-scheduler_1  | [2021-08-09 12:08:02,382] {scheduler_job.py:1835} INFO - Resetting orphaned tasks for active dag runs
airflow-scheduler_1  | [2021-08-09 12:08:02,411] {settings.py:51} INFO - Configured default timezone Timezone('UTC')
flower_1             | [2021-08-09 12:08:02,690] {inspector.py:42} WARNING - Inspect method reserved failed
flower_1             | [2021-08-09 12:08:02,691] {inspector.py:42} WARNING - Inspect method revoked failed
flower_1             | [2021-08-09 12:08:02,691] {inspector.py:42} WARNING - Inspect method active failed
flower_1             | [2021-08-09 12:08:02,692] {inspector.py:42} WARNING - Inspect method scheduled failed
flower_1             | [2021-08-09 12:08:02,692] {inspector.py:42} WARNING - Inspect method registered failed
flower_1             | [2021-08-09 12:08:02,693] {inspector.py:42} WARNING - Inspect method conf failed
flower_1             | [2021-08-09 12:08:02,697] {inspector.py:42} WARNING - Inspect method stats failed
flower_1             | [2021-08-09 12:08:02,698] {inspector.py:42} WARNING - Inspect method active_queues failed
airflow-worker_1     |  * Serving Flask app "airflow.utils.serve_logs" (lazy loading)
airflow-worker_1     |  * Environment: production
airflow-worker_1     |    WARNING: This is a development server. Do not use it in a production deployment.
airflow-worker_1     |    Use a production WSGI server instead.
airflow-worker_1     |  * Debug mode: off
airflow-worker_1     | [2021-08-09 12:08:02,859] {_internal.py:113} INFO -  * Running on http://0.0.0.0:8793/ (Press CTRL+C to quit)
airflow-worker_1     | /home/airflow/.local/lib/python3.6/site-packages/celery/platforms.py:801 RuntimeWarning: You're running the worker with superuser privileges: this is
airflow-worker_1     | absolutely not recommended!
airflow-worker_1     |
airflow-worker_1     | Please specify a different user using the --uid option.
airflow-worker_1     |
airflow-worker_1     | User information: uid=1000 euid=1000 gid=0 egid=0
airflow-worker_1     |
airflow-init_1       | Upgrades done
airflow-worker_1     | [2021-08-09 12:08:09,448: INFO/MainProcess] Connected to redis://redis:6379/0
airflow-worker_1     | [2021-08-09 12:08:09,459: INFO/MainProcess] mingle: searching for neighbors
airflow-webserver_1  |   ____________       _____________
airflow-webserver_1  |  ____    |__( )_________  __/__  /________      __
airflow-webserver_1  | ____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
airflow-webserver_1  | ___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
airflow-webserver_1  |  _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/
airflow-webserver_1  | [2021-08-09 12:08:09,518] {dagbag.py:496} INFO - Filling up the DagBag from /dev/null
airflow-worker_1     | [2021-08-09 12:08:10,482: INFO/MainProcess] mingle: all alone
airflow-worker_1     | [2021-08-09 12:08:10,508: INFO/MainProcess] celery@5fd1ec8b7299 ready.
airflow-worker_1     | [2021-08-09 12:08:11,145: INFO/MainProcess] Events of group {task} enabled by remote.
airflow-webserver_1  | [2021-08-09 12:08:13 +0000] [36] [INFO] Starting gunicorn 20.1.0
airflow-webserver_1  | [2021-08-09 12:08:13 +0000] [36] [INFO] Listening at: http://0.0.0.0:8080 (36)
airflow-webserver_1  | [2021-08-09 12:08:13 +0000] [36] [INFO] Using worker: sync
airflow-webserver_1  | [2021-08-09 12:08:13 +0000] [40] [INFO] Booting worker with pid: 40
airflow-webserver_1  | [2021-08-09 12:08:13 +0000] [41] [INFO] Booting worker with pid: 41
airflow-webserver_1  | [2021-08-09 12:08:13 +0000] [42] [INFO] Booting worker with pid: 42
airflow-webserver_1  | [2021-08-09 12:08:13 +0000] [43] [INFO] Booting worker with pid: 43
airflow-init_1       | airflow already exist in the db
airflow-init_1       | 2.1.2
airflow_airflow-init_1 exited with code 0

It just hangs there and never exits.

I have tried increasing the RAM and CPU, but that did not solve the problem. I am fairly new to Docker and Airflow.

【Question Discussion】:

    Tags: postgresql docker-compose airflow


    【Solution 1】:

    This is all working as expected; it is simply how docker-compose up behaves. When you run docker-compose up, Airflow keeps running in the foreground unless you pass the --detach flag: https://docs.docker.com/compose/reference/up/
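    As a sketch (assuming you are in the directory containing the docker-compose.yaml), running in detached mode and then tailing logs on demand would look like this:

    ```shell
    # Start the stack in the background; the shell returns immediately
    # instead of streaming every container's log output.
    docker-compose up --detach

    # Check container status and follow the logs when needed.
    docker-compose ps
    docker-compose logs -f
    ```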

    If you do not pass that flag, it runs in the foreground and keeps streaming the log output of all the containers it started, each line prefixed with the container name.

    The exiting you see is the "init" process finishing, which is entirely expected. The fact that it exits with exit code 0 is actually a good sign: it means the init process did its job and completed successfully.

    So it looks like you have a running Airflow installation, and it should work for you.

    If you followed the process described at https://airflow.apache.org/docs/apache-airflow/stable/start/docker.html, you should be able to connect to http://0.0.0.0:8080, log in with airflow/airflow, and run DAGs (at least there are no errors in the output you provided).

    【Discussion】:

    • Thanks for the help, Jarek. My docker containers are in a healthy state, but I cannot access 0.0.0.0:8080. I used the same process described at airflow.apache.org/docs/apache-airflow/stable/start/docker.html. This is where my docker containers are listening: docker-pr 337582 root 4u IPv6 1942503 0t0 TCP *:6379 (LISTEN) docker-pr 337913 root 4u IPv6 1945472 0t0 TCP *:5555 (LISTEN) docker-pr 337944 root 4u IPv6 1945513 0t0 TCP *:8080 (LISTEN)
    • Have you tried 127.0.0.1:8080? Perhaps you are actually running it on a different machine than the one your browser is on; in that case you need to use that machine's actual IP address (it may be a VM), or you have to map port 8080 from your host to that VM. If you see *:8080 listening, then Airflow is probably up and running.
    • Ah, yes: since you are using EC2, you need to tunnel the local port to that machine. You can do it via an SSH command (with the -L <localport>:localhost:8080 parameter) or, for example, via Session Manager.
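    A minimal sketch of the SSH tunnel mentioned in the comment above (the key path, username, and hostname are placeholders; substitute your own EC2 details):

    ```shell
    # Forward local port 8080 to port 8080 on the EC2 instance.
    # -i: your key pair; -L: local-port:remote-host:remote-port.
    ssh -i ~/.ssh/my-key.pem -L 8080:localhost:8080 ubuntu@ec2-public-dns

    # While the SSH session stays open, the Airflow UI is reachable
    # from your local browser at http://localhost:8080
    ```

    Alternatively, opening port 8080 in the instance's security group would let you browse to the EC2 public IP directly, at the cost of exposing the webserver.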