【Title】: Kubernetes using KVM instances on OpenStack via kubeadm
【Posted】: 2018-11-21 19:37:16
【Description】:

I have successfully deployed a "working" Kubernetes cluster, using the Horizon interface to create the Linux instances:

The hosts were configured according to: https://kubernetes.io/docs/setup/independent/high-availability/

I can now say that I have a Kubernetes cluster:

$ kubectl get nodes
NAME               STATUS    ROLES     AGE       VERSION
kube-apiserver-1   Ready     master    1d        v1.12.2
kube-apiserver-2   Ready     master    1d        v1.12.2
kube-apiserver-3   Ready     master    1d        v1.12.2
kube-node-1        Ready     <none>    21h       v1.12.2
kube-node-2        Ready     <none>    21h       v1.12.2
kube-node-3        Ready     <none>    21h       v1.12.2
kube-node-4        Ready     <none>    21h       v1.12.2

However, getting beyond this point has proven very difficult. I cannot create usable services, and coredns, which is a basic building block, appears to be unusable:

$ kubectl -n kube-system get pods
NAME                                       READY     STATUS             RESTARTS   AGE
coredns-576cbf47c7-4gdnc                   0/1       CrashLoopBackOff   288        23h
coredns-576cbf47c7-x4h4v                   0/1       CrashLoopBackOff   288        23h
kube-apiserver-kube-apiserver-1            1/1       Running            0          1d
kube-apiserver-kube-apiserver-2            1/1       Running            0          1d
kube-apiserver-kube-apiserver-3            1/1       Running            0          1d
kube-controller-manager-kube-apiserver-1   1/1       Running            3          1d
kube-controller-manager-kube-apiserver-2   1/1       Running            1          1d
kube-controller-manager-kube-apiserver-3   1/1       Running            0          1d
kube-flannel-ds-amd64-2zdtd                1/1       Running            0          20h
kube-flannel-ds-amd64-7l5mr                1/1       Running            0          20h
kube-flannel-ds-amd64-bmvs9                1/1       Running            0          1d
kube-flannel-ds-amd64-cmhkg                1/1       Running            0          1d
...

The errors in the pod indicate that the kubernetes service cannot be reached:

$ kubectl -n kube-system logs coredns-576cbf47c7-4gdnc
E1121 18:04:48.928055       1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:04:48.928688       1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:04:48.928917       1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.929869       1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.930819       1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:19.931517       1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.932159       1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.932722       1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:05:50.933179       1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
2018/11/21 18:06:07 [INFO] SIGTERM: Shutting down servers then terminating
E1121 18:06:21.933058       1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:355: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:06:21.934010       1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:348: Failed to list *v1.Service: Get https://10.96.0.1:443/api/v1/services?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
E1121 18:06:21.935107       1 reflector.go:205] github.com/coredns/coredns/plugin/kubernetes/controller.go:350: Failed to list *v1.Endpoints: Get https://10.96.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
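The timeouts above come from inside the coredns pods. One way to check whether the service VIP is reachable from the pod network at all, rather than only from coredns, is to run a throwaway pod and probe 10.96.0.1 from inside it. A minimal sketch, assuming a `busybox` image is pullable from the nodes (the pod name `netcheck` is made up for this example):

```yaml
# Hypothetical debug pod; the name "netcheck" and the busybox image
# are assumptions for illustration, not part of the original cluster.
apiVersion: v1
kind: Pod
metadata:
  name: netcheck
spec:
  containers:
  - name: netcheck
    image: busybox:1.29
    # Keep the pod alive so it can be exec'd into.
    command: ["sleep", "3600"]
  restartPolicy: Never
```

After `kubectl apply -f` on this manifest, exec into the pod and try to open a TCP connection to 10.96.0.1:443 (for example with busybox's `wget` or `nc`). Any response from the API server, even an authorization error, proves connectivity; an i/o timeout reproduces the coredns symptom from a plain pod.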

$ kubectl -n kube-system describe pod/coredns-576cbf47c7-dk7sh

...
Events:
  Type     Reason     Age                From                  Message
  ----     ------     ----               ----                  -------
  Normal   Scheduled  25m                default-scheduler     Successfully assigned kube-system/coredns-576cbf47c7-dk7sh to kube-node-3
  Normal   Pulling    25m                kubelet, kube-node-3  pulling image "k8s.gcr.io/coredns:1.2.2"
  Normal   Pulled     25m                kubelet, kube-node-3  Successfully pulled image "k8s.gcr.io/coredns:1.2.2"
  Normal   Created    20m (x3 over 25m)  kubelet, kube-node-3  Created container
  Normal   Killing    20m (x2 over 22m)  kubelet, kube-node-3  Killing container with id docker://coredns:Container failed liveness probe.. Container will be killed and recreated.
  Normal   Pulled     20m (x2 over 22m)  kubelet, kube-node-3  Container image "k8s.gcr.io/coredns:1.2.2" already present on machine
  Normal   Started    20m (x3 over 25m)  kubelet, kube-node-3  Started container
  Warning  Unhealthy  4m (x36 over 24m)  kubelet, kube-node-3  Liveness probe failed: HTTP probe failed with statuscode: 503
  Warning  BackOff    17s (x22 over 8m)  kubelet, kube-node-3  Back-off restarting failed container

The kubernetes service is there and appears to have been auto-configured correctly:

$ kubectl get svc

NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   23h

$ kubectl describe svc/kubernetes

Name:              kubernetes
Namespace:         default
Labels:            component=apiserver
                   provider=kubernetes
Annotations:       <none>
Selector:          <none>
Type:              ClusterIP
IP:                10.96.0.1
Port:              https  443/TCP
TargetPort:        6443/TCP
Endpoints:         192.168.5.19:6443,192.168.5.24:6443,192.168.5.29:6443
Session Affinity:  None
Events:            <none>

$ kubectl get endpoints

NAME         ENDPOINTS                                               AGE
kubernetes   192.168.5.19:6443,192.168.5.24:6443,192.168.5.29:6443   23h

I have long suspected that I am missing something at the network layer and that the problem is related to Neutron. There are plenty of HOWTOs on installing Kubernetes with other tools, and on installing it in OpenStack, but I have yet to find one that explains how to install it by creating the KVM instances through the Horizon interface, including how to handle the security groups and networking. By the way, all IPv4/TCP ports are open between the masters and the nodes.

Has anyone come across a guide that covers this scenario?

【Comments】:

  • Can you successfully curl -k https://10.96.0.1:443 from any of the nodes?
  • No, but bear in mind that the node/host network is 192.168.5.x and I have no idea how to reach 10.96.0.x from there. The node network and the Kubernetes pod network are different beasts.
  • What is your CNI PodCidr?
  • Flannel, from: raw.githubusercontent.com/coreos/flannel/… The pod cidr is: "10.253.0.0/16"
  • OK, that looks fine. Can you ping from one pod to another? Create 2 dummy pods and ping each other's 10.253.0.0 addresses. (You should be able to.)
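For reference, the pod CIDR mentioned in the comments lives in flannel's ConfigMap and has to match the pod-network CIDR the cluster was initialized with (kubeadm's `--pod-network-cidr`). A sketch of the relevant fragment of kube-flannel.yml, with the CIDR quoted above substituted in (the `vxlan` backend is the upstream default and is an assumption here):

```yaml
# Fragment of the kube-flannel ConfigMap (kube-system/kube-flannel-cfg).
# The Network value is the pod CIDR quoted in the comments above.
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
data:
  net-conf.json: |
    {
      "Network": "10.253.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
```

If this value and the CIDR kubeadm was given disagree, pods get addresses the rest of the cluster does not route, which produces exactly this kind of i/o timeout against the service VIP.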

Tags: kubernetes kubeadm openstack-neutron openstack-horizon


【Solution 1】:

The problem here was a tainted etcd cluster. Once I rebuilt the EXTERNAL etcd cluster and started from scratch using these instructions: https://kubernetes.io/docs/setup/independent/high-availability/#external-etcd everything worked as expected. There does not appear to be a tool for resetting the etcd entries used by the flannel pod network.
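For anyone following the same path: the external-etcd setup on the linked page is driven by a kubeadm config file that points the control plane at the rebuilt etcd cluster. A sketch for the v1.12-era kubeadm API, where the etcd endpoints and certificate paths are placeholders, not values from the original cluster:

```yaml
# kubeadm config fragment for an external etcd cluster (v1alpha3 API,
# matching kubeadm v1.12). Endpoints and cert paths are placeholders.
apiVersion: kubeadm.k8s.io/v1alpha3
kind: ClusterConfiguration
kubernetesVersion: v1.12.2
etcd:
  external:
    endpoints:
    - https://192.168.5.19:2379
    - https://192.168.5.24:2379
    - https://192.168.5.29:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
networking:
  podSubnet: 10.253.0.0/16
```

Passing a file like this to `kubeadm init --config` makes the API servers use the rebuilt etcd cluster instead of stacked local members, which is what lets a fresh install start without the stale flannel/network state.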

【Discussion】:
