【Question title】: AWS EKS nodes creation failure
【Posted】: 2022-03-13 04:23:14
【Question description】:

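I have a cluster in AWS that was created by following these instructions.

I then tried to add nodes to this cluster by following this documentation.

The nodes appear to fail to be created: the vpc-cni and coredns add-ons report the health issue type insufficientNumberOfReplicas: The add-on is unhealthy because it doesn't have the desired number of replicas.

Status of the pods from kubectl get pods -n kube-system: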

NAME                       READY   STATUS             RESTARTS   AGE
aws-node-9cwkd             0/1     CrashLoopBackOff   13         42m
aws-node-h4qjt             0/1     CrashLoopBackOff   13         42m
aws-node-jrn5x             0/1     CrashLoopBackOff   13         43m
coredns-745979c988-25fcc   0/1     Pending            0          120m
coredns-745979c988-qvh7h   0/1     Pending            0          120m
kube-proxy-2bmlq           1/1     Running            0          42m
kube-proxy-hjcrw           1/1     Running            0          43m
kube-proxy-j9r9n           1/1     Running            0          42m

Logs of the aws-node-9cwkd pod:

{"level":"info","ts":"2021-11-30T14:11:14.156Z","caller":"entrypoint.sh","msg":"Validating env variables ..."}
{"level":"info","ts":"2021-11-30T14:11:14.157Z","caller":"entrypoint.sh","msg":"Install CNI binaries.."}
{"level":"info","ts":"2021-11-30T14:11:14.177Z","caller":"entrypoint.sh","msg":"Starting IPAM daemon in the background ... "}
{"level":"info","ts":"2021-11-30T14:11:14.179Z","caller":"entrypoint.sh","msg":"Checking for IPAM connectivity ... "}
{"level":"info","ts":"2021-11-30T14:11:16.189Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2021-11-30T14:11:18.198Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2021-11-30T14:11:20.205Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2021-11-30T14:11:22.215Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}
{"level":"info","ts":"2021-11-30T14:11:24.226Z","caller":"entrypoint.sh","msg":"Retrying waiting for IPAM-D"}

Running kubectl describe pod aws-node-h4qjt -n kube-system shows the following error:

Readiness probe failed: {"level":"info","ts":"2021-11-30T14:11:07.145Z","caller":"/usr/local/go/src/runtime/proc.go:225","msg":"timeout: failed to connect service \":50051\" within 5s"}

Any help with getting the nodes created successfully in the cluster is greatly appreciated.

【Question comments】:

    Tags: amazon-web-services amazon-eks


    【Solution 1】:

    This is most likely a problem with the node service role. If you exec into the pod and look at ipamd.log, you can get more information:

    kubectl exec -it aws-node-9cwkd -n kube-system -- /bin/bash 
    cat /host/var/log/aws-routed-eni/ipamd.log
    

    Here is an example of the log entry I saw when I hit the same error:

    {"level":"error","ts":"2021-12-02T13:27:51.464Z","caller":"ipamd/ipamd.go:444","msg":"Failed to call ec2:DescribeNetworkInterfaces for [eni-0c01bd25ae6999ed5]: UnauthorizedOperation: You are not authorized to perform this operation.\n\tstatus code: 403, request id: 0438b84b-8052-4f31-9d63-c2ff7512f131"}
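
Because ipamd.log mixes routine entries with failures, it can help to filter it down to error-level lines such as the one above. The following is only a sketch: it assumes each log line is a single JSON object with a "level" field, as in the sample, and the helper name `ipamd_errors` is made up for illustration.

```shell
#!/usr/bin/env bash
# Print only error-level entries from ipamd log lines read on stdin.
# Assumption: one JSON object per line, with "level":"error" marking failures.
ipamd_errors() {
  # -F matches the string literally; "|| true" keeps "no matches" from
  # being reported as a failure of the helper itself.
  grep -F '"level":"error"' || true
}

# Usage (pod name taken from the question; yours will differ):
#   kubectl exec aws-node-9cwkd -n kube-system -- \
#     cat /host/var/log/aws-routed-eni/ipamd.log | ipamd_errors
```

An UnauthorizedOperation entry in the filtered output points at missing IAM permissions rather than a networking fault.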

    In my case, I had to add the AmazonEKS_CNI_Policy policy to the node IAM role.

    https://docs.aws.amazon.com/eks/latest/userguide/cni-iam-role.html
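
One way to attach the policy is with the AWS CLI. This is a minimal sketch, not the answerer's exact procedure: the role name `my-eks-node-role` is a placeholder, and the `DRY_RUN` switch and function name are assumptions added here so that the command is printed rather than executed by default.

```shell
#!/usr/bin/env bash
# Attach the AWS-managed AmazonEKS_CNI_Policy to a node IAM role.
# Assumption: NODE_ROLE_NAME / "my-eks-node-role" is a placeholder for the
# IAM role actually used by your node group.
CNI_POLICY_ARN="arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"

attach_cni_policy() {
  local role_name="$1"
  local cmd=(aws iam attach-role-policy
             --role-name "$role_name"
             --policy-arn "$CNI_POLICY_ARN")
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: only print the command so nothing is changed by accident.
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

attach_cni_policy "${NODE_ROLE_NAME:-my-eks-node-role}"
```

Run it with `DRY_RUN=0` and `NODE_ROLE_NAME` set to the real role once the printed command looks right. The aws-node pods may need to restart before they pick up the new permissions.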

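    【Comments】:

    • What do you do if the pod doesn't start?
    【Solution 2】:

    I used the eksctl command-line tool with the --nodes flag, and everything was created successfully, as expected.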

    eksctl create cluster --name cluster-name \
      --nodes 3 \
      --node-type=t3.large \
      --region=eu-west-1
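
After a command like the one above finishes, a quick check is to count how many nodes report Ready and compare that with the --nodes value. This is a sketch only: the helper name `count_ready_nodes` is hypothetical, and it assumes the usual `kubectl get nodes --no-headers` column order (NAME, STATUS, ROLES, AGE, VERSION).

```shell
#!/usr/bin/env bash
# Count nodes whose STATUS column is exactly "Ready".
# Expects `kubectl get nodes --no-headers` output on stdin.
count_ready_nodes() {
  # n + 0 forces a numeric 0 when no line matched.
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage (3 is the --nodes value from the eksctl command above):
#   kubectl get nodes --no-headers | count_ready_nodes
```

Nodes shown as NotReady, or as Ready,SchedulingDisabled, are deliberately not counted.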
    

    【Comments】:
