【Posted】: 2021-12-26 15:17:24
【Question】:
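My Kubeflow pipeline component/job keeps running indefinitely even after the main execution has finished. Can anyone tell from these logs why the job fails to complete successfully?
There appears to be a wait container that keeps running even though the main container has completed successfully.
Any insights are much appreciated.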
Type    Reason     Age   From               Message
----    ------     ----  ----               -------
Normal  Scheduled  10m   default-scheduler  Successfully assigned default/secondary-market-pipeline-6plbl-940127540 to gke-cluster-1-pool-1-46a6353b-wfpg
Normal  Pulled     10m   kubelet            Container image "gcr.io/cloud-marketplace/google-cloud-ai-platform/kubeflow-pipelines/argoexecutor:1.7.1" already present on machine
Normal  Created    10m   kubelet            Created container wait
Normal  Started    10m   kubelet            Started container wait
Normal  Pulling    10m   kubelet            Pulling image "<image>:latest"
Normal  Pulled     10m   kubelet            Successfully pulled image "<image>:latest" in 1.617667035s
Normal  Created    10m   kubelet            Created container main
Normal  Started    10m   kubelet            Started container main
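The events above only show both containers starting, so they do not say which one is still alive. One way to check this might be to look at the per-container state in the pod's JSON (as returned by `kubectl get pod <pod-name> -o json`). The sketch below is only an illustration: the function name `summarize_container_states` and the `SAMPLE_POD` data are hypothetical, and the sample is hand-written to mirror the situation described (main finished, wait still running), not taken from the real cluster.

```python
# Minimal sketch: summarize the state of each container in a pod.
# Assumes the standard Pod layout: status.containerStatuses[].state holds
# exactly one of "running", "terminated", or "waiting".


def summarize_container_states(pod: dict) -> dict:
    """Map each container name to a short description of its current state."""
    summary = {}
    for status in pod.get("status", {}).get("containerStatuses", []):
        state = status.get("state", {})
        if "terminated" in state:
            term = state["terminated"]
            summary[status["name"]] = (
                f"terminated (exit code {term.get('exitCode')}, "
                f"reason {term.get('reason')})"
            )
        elif "running" in state:
            summary[status["name"]] = "running"
        elif "waiting" in state:
            summary[status["name"]] = (
                f"waiting (reason {state['waiting'].get('reason')})"
            )
        else:
            summary[status["name"]] = "unknown"
    return summary


# Hypothetical, hand-written sample that mirrors the symptom in the question.
SAMPLE_POD = {
    "status": {
        "containerStatuses": [
            {
                "name": "main",
                "state": {"terminated": {"exitCode": 0, "reason": "Completed"}},
            },
            {
                "name": "wait",
                "state": {"running": {"startedAt": "2021-12-26T15:07:00Z"}},
            },
        ]
    }
}

for name, description in summarize_container_states(SAMPLE_POD).items():
    print(f"{name}: {description}")
```

To use it on real data, the output of `kubectl get pod secondary-market-pipeline-6plbl-940127540 -n default -o json` could be parsed with `json.loads` and passed to the function. If the wait container does turn out to be the one still running, its own logs (`kubectl logs <pod-name> -c wait`) are probably the next place to look.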
【Discussion】:
-
I have narrowed this down to the use of high-memory nodes, but I am not sure why jobs on these nodes fail to finish successfully.
Tags: python kubectl kubeflow-pipelines