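【Posted】: 2020-08-04 06:47:24
【Question】:
I created my Kubernetes cluster on AWS using Kubespray. Now I am trying to get an ingress controller working. My understanding is that I need to apply https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v0.34.1/deploy/static/provider/aws/deploy.yaml, which will create all the resources I need, including the Network Load Balancer.
However, the LoadBalancer service never leaves the pending state: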
$ kubectl -n ingress-nginx get svc
NAME                                 TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx-controller             LoadBalancer   10.233.28.147   <pending>     80:31304/TCP,443:31989/TCP   11m
ingress-nginx-controller-admission   ClusterIP      10.233.58.231   <none>        443/TCP                      11m
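A minimal sketch for detecting this state in a script instead of by eye. It assumes the default column layout of `kubectl get svc` (NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, ...); the function name `pending_lb_services` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the names of LoadBalancer services whose EXTERNAL-IP is still <pending>.
# Reads the table printed by `kubectl get svc` on stdin; the header row is skipped.
pending_lb_services() {
  awk 'NR > 1 && $2 == "LoadBalancer" && $4 == "<pending>" { print $1 }'
}

# Usage against a live cluster:
#   kubectl -n ingress-nginx get svc | pending_lb_services
```

An empty result would mean every LoadBalancer service in the namespace has been assigned an external address.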
Describing the service does not seem to reveal anything useful.
$ kubectl -n ingress-nginx describe service ingress-nginx-controller
Name:                     ingress-nginx-controller
Namespace:                ingress-nginx
Labels:                   app.kubernetes.io/component=controller
                          app.kubernetes.io/instance=ingress-nginx
                          app.kubernetes.io/managed-by=Helm
                          app.kubernetes.io/name=ingress-nginx
                          app.kubernetes.io/version=0.34.1
                          helm.sh/chart=ingress-nginx-2.11.1
Annotations:              kubectl.kubernetes.io/last-applied-configuration:
                            {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{"service.beta.kubernetes.io/aws-load-balancer-backend-protocol":"tcp","serv...
                          service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
                          service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout: 60
                          service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: true
                          service.beta.kubernetes.io/aws-load-balancer-type: nlb
Selector:                 app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type:                     LoadBalancer
IP:                       10.233.28.147
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  31304/TCP
Endpoints:                10.233.97.22:80
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  31989/TCP
Endpoints:                10.233.97.22:443
Session Affinity:         None
External Traffic Policy:  Local
HealthCheck NodePort:     30660
Events:                   <none>
How can I debug this problem?
Update:
The output of kubectl -n kube-system logs -l component=kube-controller-manager is:
E0801 21:12:29.429759 1 job_controller.go:793] pods "ingress-nginx-admission-create-" is forbidden: error looking up service account ingress-nginx/ingress-nginx-admission: serviceaccount "ingress-nginx-admission" not found
E0801 21:12:29.429788 1 job_controller.go:398] Error syncing job: pods "ingress-nginx-admission-create-" is forbidden: error looking up service account ingress-nginx/ingress-nginx-admission: serviceaccount "ingress-nginx-admission" not found
I0801 21:12:29.429851 1 event.go:278] Event(v1.ObjectReference{Kind:"Job", Namespace:"ingress-nginx", Name:"ingress-nginx-admission-create", UID:"4faad8c5-9b1e-4c23-a942-94be181d590f", APIVersion:"batch/v1", ResourceVersion:"1506255", FieldPath:""}): type: 'Warning' reason: 'FailedCreate' Error creating: pods "ingress-nginx-admission-create-" is forbidden: error looking up service account ingress-nginx/ingress-nginx-admission: serviceaccount "ingress-nginx-admission" not found
E0801 21:12:29.483485 1 job_controller.go:793] pods "ingress-nginx-admission-patch-" is forbidden: error looking up service account ingress-nginx/ingress-nginx-admission: serviceaccount "ingress-nginx-admission" not found
E0801 21:12:29.483512 1 job_controller.go:398] Error syncing job: pods "ingress-nginx-admission-patch-" is forbidden: error looking up service account ingress-nginx/ingress-nginx-admission: serviceaccount "ingress-nginx-admission" not found
I0801 21:12:29.483679 1 event.go:278] Event(v1.ObjectReference{Kind:"Job", Namespace:"ingress-nginx", Name:"ingress-nginx-admission-patch", UID:"92ee0e43-2711-4b37-9fd6-958ef3c95b31", APIVersion:"batch/v1", ResourceVersion:"1506257", FieldPath:""}): type: 'Warning' reason: 'FailedCreate' Error creating: pods "ingress-nginx-admission-patch-" is forbidden: error looking up service account ingress-nginx/ingress-nginx-admission: serviceaccount "ingress-nginx-admission" not found
I0801 21:12:39.436590 1 event.go:278] Event(v1.ObjectReference{Kind:"Job", Namespace:"ingress-nginx", Name:"ingress-nginx-admission-create", UID:"4faad8c5-9b1e-4c23-a942-94be181d590f", APIVersion:"batch/v1", ResourceVersion:"1506255", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: ingress-nginx-admission-create-85x58
I0801 21:12:39.489303 1 event.go:278] Event(v1.ObjectReference{Kind:"Job", Namespace:"ingress-nginx", Name:"ingress-nginx-admission-patch", UID:"92ee0e43-2711-4b37-9fd6-958ef3c95b31", APIVersion:"batch/v1", ResourceVersion:"1506257", FieldPath:""}): type: 'Normal' reason: 'SuccessfulCreate' Created pod: ingress-nginx-admission-patch-sn8xv
I0801 21:12:41.448425 1 event.go:278] Event(v1.ObjectReference{Kind:"Job", Namespace:"ingress-nginx", Name:"ingress-nginx-admission-create", UID:"4faad8c5-9b1e-4c23-a942-94be181d590f", APIVersion:"batch/v1", ResourceVersion:"1506297", FieldPath:""}): type: 'Normal' reason: 'Completed' Job completed
I0801 21:12:42.481264 1 event.go:278] Event(v1.ObjectReference{Kind:"Job", Namespace:"ingress-nginx", Name:"ingress-nginx-admission-patch", UID:"92ee0e43-2711-4b37-9fd6-958ef3c95b31", APIVersion:"batch/v1", ResourceVersion:"1506304", FieldPath:""}): type: 'Normal' reason: 'Completed' Job completed
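I do have the PodSecurityPolicy admission controller enabled, so I updated the deploy.yaml file with the following changes.
- Added the following rule to every ClusterRole and Role resource.
    - apiGroups: [policy]
      resources: [podsecuritypolicies]
      resourceNames: [privileged]
      verbs: [use]
- Appended the following to the end of the file.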
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      labels:
        helm.sh/chart: ingress-nginx-2.11.1
        app.kubernetes.io/name: ingress-nginx
        app.kubernetes.io/instance: ingress-nginx
        app.kubernetes.io/version: 0.34.1
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/component: controller
      name: ingress-nginx
      namespace: ingress-nginx
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: ingress-nginx
    subjects:
      - kind: ServiceAccount
        name: ingress-nginx
        namespace: default
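One way to check whether an RBAC change like this actually reaches the service accounts the pods run as is `kubectl auth can-i` with impersonation. The sketch below is an illustration, not part of the original post: the helper names `sa_user` and `can_use_psp` are hypothetical, and the account names are taken from the manifest and the controller-manager log above. Note that the binding above names a subject in namespace `default`, while the log shows the service accounts being looked up in `ingress-nginx`, so checking the account in `ingress-nginx` seems worthwhile.

```shell
#!/usr/bin/env bash
# Build the user name Kubernetes assigns to a service account.
sa_user() {  # usage: sa_user <namespace> <serviceaccount>
  printf 'system:serviceaccount:%s:%s' "$1" "$2"
}

# Ask the API server whether a service account may "use" a PodSecurityPolicy.
# Prints "yes" or "no". The caller needs permission to impersonate users.
can_use_psp() {  # usage: can_use_psp <namespace> <serviceaccount> <psp-name>
  kubectl auth can-i use "podsecuritypolicy/$3" \
    --as="$(sa_user "$1" "$2")" -n "$1"
}

# Examples (require a live cluster):
#   can_use_psp ingress-nginx ingress-nginx privileged
#   can_use_psp ingress-nginx ingress-nginx-admission privileged
```

A "no" for either account would suggest that the policy grant is bound to a different subject than the one the pods use.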
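Answers to the questions raised in the comments:
- The IAM roles were created by the ansible playbooks in Kubespray's contrib/terraform/aws directory.
- Those ansible scripts created a Classic Load Balancer for the apiserver.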
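【Comments】:
- kube-controller-manager is the daemon that creates AWS resources on your behalf; can you check its logs for errors (for example, kubectl logs --namespace=kube-system -l component=kube-controller-manager)? Did you create the IAM roles required for the master and node instances? Can you check whether the NLB was created?
- The installation is almost certainly missing --cloud-provider=aws --cloud-config=/etc/kubernetes/cloud_config on the apiserver and controller-manager pods, but it is hard to be sure without knowing the specific kubespray version.
- I am using the master branch of kubespray. My installation makes no mention of a cloud controller. I am looking into this topic.
- "To deploy kubespray on AWS, uncomment the cloud_provider option in group_vars/all.yml and set it to 'aws'." See kubespray's AWS documentation for more on its AWS-specific configuration. @DavidMedinets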
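The suggestion in the comments can be verified on a control-plane node by looking for the flag in the static pod manifest. This is a minimal sketch under stated assumptions: `--cloud-provider` is the flag name kube-controller-manager accepts, the manifest path follows the kubeadm convention and may differ on a given installation, and the function name `has_aws_cloud_provider` is hypothetical.

```shell
#!/usr/bin/env bash
# Succeed if the given manifest passes --cloud-provider=aws to its container.
has_aws_cloud_provider() {  # usage: has_aws_cloud_provider <manifest-file>
  # "--" stops option parsing so the pattern is not read as a grep option.
  grep -q -- '--cloud-provider=aws' "$1"
}

# Example on a control-plane node (the path is an assumption):
#   if has_aws_cloud_provider /etc/kubernetes/manifests/kube-controller-manager.yaml; then
#     echo "cloud provider flag present"
#   else
#     echo "cloud provider flag missing"
#   fi
```

If the flag is missing, the controller manager has no AWS integration enabled, which would be consistent with a Service of type LoadBalancer staying in `<pending>` with no events.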