【Posted】: 2020-01-06 12:35:48
【Problem Description】:
I am trying to run a Spark job on Kubernetes from Airflow's BashOperator, and I have configured on_failure_callback to call a function. However, even when the Spark job fails with exit code 1, my task is always marked as successful and the failure callback is never invoked. Below are snippets from the Airflow logs (a minimal sketch of the task setup follows them):
[2020-01-03 13:22:46,730] {{bash_operator.py:128}} INFO - 20/01/03 13:22:46 INFO LoggingPodStatusWatcherImpl: Container final statuses:
[2020-01-03 13:22:46,730] {{bash_operator.py:128}} INFO -
[2020-01-03 13:22:46,730] {{bash_operator.py:128}} INFO -
[2020-01-03 13:22:46,730] {{bash_operator.py:128}} INFO - Container name: spark-kubernetes-driver
[2020-01-03 13:22:46,730] {{bash_operator.py:128}} INFO - Container image: XXXXXXXXX.dkr.ecr.us-east-1.amazonaws.com/spark-py:XX_XX
[2020-01-03 13:22:46,730] {{bash_operator.py:128}} INFO - Container state: Terminated
[2020-01-03 13:22:46,730] {{bash_operator.py:128}} INFO - Exit code: 1
[2020-01-03 13:22:46,731] {{bash_operator.py:128}} INFO - 20/01/03 13:22:46 INFO Client: Application run_report_generator finished.
[2020-01-03 13:22:46,736] {{bash_operator.py:128}} INFO - 20/01/03 13:22:46 INFO ShutdownHookManager: Shutdown hook called
[2020-01-03 13:22:46,737] {{bash_operator.py:128}} INFO - 20/01/03 13:22:46 INFO ShutdownHookManager: Deleting directory /tmp/spark-adb99a7e-ce6c-49f6-8307-a17c28448043
[2020-01-03 13:22:46,761] {{bash_operator.py:132}} INFO - Command exited with return code 0
[2020-01-03 13:22:49,994] {{logging_mixin.py:95}} INFO - [ [34m2020-01-03 13:22:49,994 [0m] {{ [34mlocal_task_job.py: [0m105}} INFO [0m - Task exited with return code 0
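For reference, a minimal sketch of the kind of setup described above, assuming Airflow 1.10-style imports; the DAG, task, callback function, and spark-submit command line are hypothetical placeholders, not taken from the original post:

```python
# Minimal sketch of the setup described above (hypothetical names throughout).
from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator


def notify_failure(context):
    # Failure callback that Airflow is expected to invoke when the task fails.
    print("Task failed: {}".format(context["task_instance"].task_id))


with DAG(dag_id="spark_on_k8s_report",
         start_date=datetime(2020, 1, 1),
         schedule_interval=None) as dag:

    run_report_generator = BashOperator(
        task_id="run_report_generator",
        # spark-submit in cluster mode against the EKS API server; the driver runs
        # in a pod, and its "Exit code: 1" only shows up in the watcher log output.
        bash_command="spark-submit --master k8s://https://<EKS-API-SERVER> "
                     "--deploy-mode cluster <remaining-args>",
        on_failure_callback=notify_failure,
    )
```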
【Discussion】:
- It looks like your bash script returns 0 even when the container fails (see github.com/apache/airflow/blob/master/airflow/operators/… for how the BashOperator handles exit codes). The job is submitted successfully, but my guess is that the bash script does not check the result of the job itself. Can you post your script code and the BashOperator definition?
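Building on that comment, here is one possible sketch of how the driver failure could be surfaced to the BashOperator, assuming (as in the logs above) that spark-submit itself exits 0 while the pod status watcher prints the driver's "Exit code: N"; the command line, log path, and grep pattern are illustrative and may need adjusting for your Spark version:

```python
run_report_generator = BashOperator(
    task_id="run_report_generator",
    # The BashOperator marks the task failed only if the whole bash command exits
    # non-zero, so the driver's exit code has to be propagated explicitly.
    bash_command=(
        "set -eo pipefail; "
        "spark-submit --master k8s://https://<EKS-API-SERVER> --deploy-mode cluster "
        "<remaining-args> 2>&1 | tee /tmp/spark_submit.log; "
        # Fail the task if the status watcher reported a non-zero driver exit code.
        "! grep -E 'Exit code: [1-9][0-9]*' /tmp/spark_submit.log"
    ),
    on_failure_callback=notify_failure,
)
```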
Tags: apache-spark kubernetes amazon-eks airflow