【Posted】: 2021-01-18 19:22:09
【Question】:
I'm running a cluster on EKS and, following a tutorial, deployed one with the command eksctl create cluster --name prod --version 1.17 --region eu-west-1 --nodegroup-name standard-workers --node-type t3.medium --nodes 3 --nodes-min 1 --nodes-max 4 --ssh-access --ssh-public-key public-key.pub --managed.
Once I'm done with my tests (mainly installing and then uninstalling Helm charts) and have a clean cluster with no jobs running, I try to delete it with eksctl delete cluster --name prod, which results in these errors.
[ℹ] eksctl version 0.25.0
[ℹ] using region eu-west-1
[ℹ] deleting EKS cluster "test"
[ℹ] deleted 0 Fargate profile(s)
[✔] kubeconfig has been updated
[ℹ] cleaning up AWS load balancers created by Kubernetes objects of Kind Service or Ingress
[ℹ] 2 sequential tasks: { delete nodegroup "standard-workers", delete cluster control plane "test" [async] }
[ℹ] will delete stack "eksctl-test-nodegroup-standard-workers"
[ℹ] waiting for stack "eksctl-test-nodegroup-standard-workers" to get deleted
[✖] unexpected status "DELETE_FAILED" while waiting for CloudFormation stack "eksctl-test-nodegroup-standard-workers"
[ℹ] fetching stack events in attempt to troubleshoot the root cause of the failure
[✖] AWS::CloudFormation::Stack/eksctl-test-nodegroup-standard-workers: DELETE_FAILED – "The following resource(s) failed to delete: [ManagedNodeGroup]. "
[✖] AWS::EKS::Nodegroup/ManagedNodeGroup: DELETE_FAILED – "Nodegroup standard-workers failed to stabilize: [{Code: Ec2SecurityGroupDeletionFailure,Message: DependencyViolation - resource has a dependent object,ResourceIds: [[REDACTED]]}]"
[ℹ] 1 error(s) occurred while deleting cluster with nodegroup(s)
[✖] waiting for CloudFormation stack "eksctl-test-nodegroup-standard-workers": ResourceNotReady: failed waiting for successful resource state
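To fix them I had to manually delete the AWS VPC, then the ManagedNodeGroups, and then delete everything again.
I tried the steps above again (creating and deleting with the commands provided in the official getting-started documentation), but I get the same errors on deletion.
It seems extremely odd that I have to delete resources manually when doing something like this. Is there a fix for this problem, am I doing something wrong, or is this standard procedure?
All commands were run through the official eksctl CLI, and I'm following the official eksctl deployment.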
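For anyone reproducing this: the DependencyViolation in the log means something still references the nodegroup's security group. Below is a minimal sketch for listing the network interfaces attached to that group, assuming the AWS CLI is installed and configured. The function name list_sg_dependents and the example ID are hypothetical; the real security group ID is redacted in the log above.

```shell
#!/bin/sh
# Hypothetical helper: list the network interfaces (ENIs) that still
# reference a security group, which is what blocks its deletion.
# Usage: list_sg_dependents <security-group-id> [region]
list_sg_dependents() {
    sg_id="$1"
    region="${2:-eu-west-1}"   # default is the region used in the question
    aws ec2 describe-network-interfaces \
        --region "$region" \
        --filters "Name=group-id,Values=${sg_id}" \
        --query 'NetworkInterfaces[].[NetworkInterfaceId,Status,Description]' \
        --output text
}
```

Calling it as list_sg_dependents sg-0123456789abcdef0 (a placeholder ID) prints one line per interface. The Description column often hints at the owner of a leftover interface, for example a load balancer.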
【Comments】:
-
Are you adding the node groups after creating the cluster with eksctl? Are the node groups part of the CloudFormation stack that eksctl created?
-
This usually happens when you create/attach resources to your environment from outside the CloudFormation stack.
-
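@jordanm No, I did nothing other than install and then uninstall a Helm chart. The chart contains 3 Deployments, 3 Services (1 LoadBalancer and 2 ClusterIP), and 1 ConfigMap. After uninstalling the chart and waiting for all the resources to be removed, I deleted the cluster.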
-
Which tutorial did you use? Please add a link.
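The comments above point at resources created outside the CloudFormation stack, and a Service of type LoadBalancer is one such case, since Kubernetes provisions the AWS load balancer itself. As a rough sketch, assuming kubectl is pointed at the cluster, one could check for remaining LoadBalancer Services before running the delete. The function name list_lb_services is hypothetical, and whether a leftover load balancer is the actual cause here is not confirmed by the thread.

```shell
#!/bin/sh
# Hypothetical helper: print "namespace/name" for every Service of type
# LoadBalancer still present in the cluster.
list_lb_services() {
    kubectl get services --all-namespaces \
        -o jsonpath='{range .items[?(@.spec.type=="LoadBalancer")]}{.metadata.namespace}{"/"}{.metadata.name}{"\n"}{end}'
}
```

An empty result from list_lb_services suggests no such Services remain in the cluster; it does not by itself prove that the corresponding AWS load balancers and their network interfaces are already gone.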
Tags: amazon-web-services kubernetes amazon-eks eksctl