【发布时间】:2020-04-29 20:30:21
【问题描述】:
我正在尝试使用我的 fork airflow 使用 apache/airflow 的 v1.10.5 版本在 ECS 中运行 apache 气流。我正在使用环境变量将执行程序、Postgres 和 Redis 信息设置到网络服务器。
AIRFLOW__CORE__SQL_ALCHEMY_CONN="postgresql+psycopg2://airflow_user:airflow_password@postgres:5432/airflow_db"
AIRFLOW__CELERY__RESULT_BACKEND="db+postgresql://airflow_user:airflow_password@postgres:5432/airflow_db"
AIRFLOW__CELERY__BROKER_URL="redis://redis_queue:6379/1"
AIRFLOW__CORE__EXECUTOR=CeleryExecutor
FERNET_KEY=46BKJoQYlPPOexq0OhDZnIlNepKFf87WFwLbfzqDDho=
AIRFLOW__CORE__LOAD_EXAMPLES=False
我的任务随机失败并出现以下错误
[2020-01-12 20:06:28,308] {ssh_utils.py:130} WARNING - 20/01/13 01:36:28 INFO db.IntegerSplitter: Split size: 134574; Num splits: 5 from: 2 to: 672873
[2020-01-12 20:06:28,449] {ssh_utils.py:130} WARNING - 20/01/13 01:36:28 INFO mapreduce.JobSubmitter: number of splits:5
[2020-01-12 20:06:28,459] {ssh_utils.py:130} WARNING - 20/01/13 01:36:28 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
[2020-01-12 20:06:28,964] {ssh_utils.py:130} WARNING - 20/01/13 01:36:28 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1578859373494_0012
[2020-01-12 20:06:29,337] {ssh_utils.py:130} WARNING - 20/01/13 01:36:29 INFO impl.YarnClientImpl: Submitted application application_1578859373494_0012
[2020-01-12 20:06:29,371] {ssh_utils.py:130} WARNING - 20/01/13 01:36:29 INFO mapreduce.Job: The url to track the job: http://ip-XX-XX-XX-XX.ap-southeast-1.compute.internal:20888/proxy/application_1578859373494_0012/
[2020-01-12 20:06:29,371] {ssh_utils.py:130} WARNING - 20/01/13 01:36:29 INFO mapreduce.Job: Running job: job_1578859373494_0012
[2020-01-12 20:06:47,489] {ssh_utils.py:130} WARNING - 20/01/13 01:36:47 INFO mapreduce.Job: Job job_1578859373494_0012 running in uber mode : false
[2020-01-12 20:06:47,490] {ssh_utils.py:130} WARNING - 20/01/13 01:36:47 INFO mapreduce.Job: map 0% reduce 0%
[2020-01-12 20:06:54,777] {logging_mixin.py:95} INFO - [[34m2020-01-12 20:06:54,777[0m] {[34mlocal_task_job.py:[0m105} INFO[0m - Task exited with return code -9[0m
但是当我在 EMR UI 中检查某个应用程序时,它显示为已成功运行。
我的ECS配置如下
气流工作者
硬/软内存限制 -> 2560/1024
工人数量 -> 3
气流网络服务器
硬/软内存限制 -> 3072/1024
气流调度器
硬/软内存限制 -> 2048/1024
run_duration -> 86400
是什么导致了这个错误?
【问题讨论】:
标签: python airflow amazon-ecs