【Title】: Kubernetes pods are not spread across different nodes
【Posted】: 2016-05-28 21:38:07
【Question】:

I have a Kubernetes cluster on GKE. I know Kubernetes is supposed to spread pods with the same labels across nodes, but that is not happening for me. Here is the description of my nodes:

Name:                   gke-pubnation-cluster-prod-high-cpu-14a766ad-node-dpob
Conditions:
  Type          Status  LastHeartbeatTime                       LastTransitionTime                      Reason                          Message
  ----          ------  -----------------                       ------------------                      ------                          -------
  OutOfDisk     False   Fri, 27 May 2016 21:11:17 -0400         Thu, 26 May 2016 22:16:27 -0400         KubeletHasSufficientDisk        kubelet has sufficient disk space available
  Ready         True    Fri, 27 May 2016 21:11:17 -0400         Thu, 26 May 2016 22:17:02 -0400         KubeletReady                    kubelet is posting ready status. WARNING: CPU hardcapping unsupported
Capacity:
 cpu:           2
 memory:        1848660Ki
 pods:          110
System Info:
 Machine ID:
 Kernel Version:                3.16.0-4-amd64
 OS Image:                      Debian GNU/Linux 7 (wheezy)
 Container Runtime Version:     docker://1.9.1
 Kubelet Version:               v1.2.4
 Kube-Proxy Version:            v1.2.4
Non-terminated Pods:            (2 in total)
  Namespace                     Name                                                                                    CPU Requests    CPU Limits  Memory Requests Memory Limits
  ---------                     ----                                                                                    ------------    ----------  --------------- -------------
  kube-system                   fluentd-cloud-logging-gke-pubnation-cluster-prod-high-cpu-14a766ad-node-dpob            80m (4%)        0 (0%)              200Mi (11%)     200Mi (11%)
  kube-system                   kube-proxy-gke-pubnation-cluster-prod-high-cpu-14a766ad-node-dpob                       20m (1%)        0 (0%)              0 (0%)          0 (0%)
Allocated resources:
  (Total limits may be over 100%, i.e., overcommitted. More info: http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md)
  CPU Requests  CPU Limits      Memory Requests Memory Limits
  ------------  ----------      --------------- -------------
  100m (5%)     0 (0%)          200Mi (11%)     200Mi (11%)
No events.

Name:                   gke-pubnation-cluster-prod-high-cpu-14a766ad-node-qhw2
Conditions:
  Type          Status  LastHeartbeatTime                       LastTransitionTime                      Reason                          Message
  ----          ------  -----------------                       ------------------                      ------                          -------
  OutOfDisk     False   Fri, 27 May 2016 21:11:17 -0400         Fri, 27 May 2016 18:16:38 -0400         KubeletHasSufficientDisk        kubelet has sufficient disk space available
  Ready         True    Fri, 27 May 2016 21:11:17 -0400         Fri, 27 May 2016 18:17:12 -0400         KubeletReady                    kubelet is posting ready status. WARNING: CPU hardcapping unsupported
Capacity:
 pods:          110
 cpu:           2
 memory:        1848660Ki
System Info:
 Machine ID:
 Kernel Version:                3.16.0-4-amd64
 OS Image:                      Debian GNU/Linux 7 (wheezy)
 Container Runtime Version:     docker://1.9.1
 Kubelet Version:               v1.2.4
 Kube-Proxy Version:            v1.2.4
Non-terminated Pods:            (10 in total)
  Namespace                     Name                                                                                    CPU Requests    CPU Limits  Memory Requests Memory Limits
  ---------                     ----                                                                                    ------------    ----------  --------------- -------------
  default                       pn-minions-deployment-prod-3923308490-axucq                                             100m (5%)       0 (0%)              0 (0%)          0 (0%)
  default                       pn-minions-deployment-prod-3923308490-mvn54                                             100m (5%)       0 (0%)              0 (0%)          0 (0%)
  default                       pn-minions-deployment-staging-2522417973-8cq5p                                          100m (5%)       0 (0%)              0 (0%)          0 (0%)
  default                       pn-minions-deployment-staging-2522417973-9yatt                                          100m (5%)       0 (0%)              0 (0%)          0 (0%)
  kube-system                   fluentd-cloud-logging-gke-pubnation-cluster-prod-high-cpu-14a766ad-node-qhw2            80m (4%)        0 (0%)              200Mi (11%)     200Mi (11%)
  kube-system                   heapster-v1.0.2-1246684275-a8eab                                                        150m (7%)       150m (7%)   308Mi (17%)     308Mi (17%)
  kube-system                   kube-dns-v11-uzl1h                                                                      310m (15%)      310m (15%)  170Mi (9%)      920Mi (50%)
  kube-system                   kube-proxy-gke-pubnation-cluster-prod-high-cpu-14a766ad-node-qhw2                       20m (1%)        0 (0%)              0 (0%)          0 (0%)
  kube-system                   kubernetes-dashboard-v1.0.1-3co2b                                                       100m (5%)       100m (5%)   50Mi (2%)       50Mi (2%)
  kube-system                   l7-lb-controller-v0.6.0-o5ojv                                                           110m (5%)       110m (5%)   70Mi (3%)       120Mi (6%)
Allocated resources:
  (Total limits may be over 100%, i.e., overcommitted. More info: http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md)
  CPU Requests  CPU Limits      Memory Requests Memory Limits
  ------------  ----------      --------------- -------------
  1170m (58%)   670m (33%)      798Mi (44%)     1598Mi (88%)
No events.

And here is the description of the Deployments:

Name:                   pn-minions-deployment-prod
Namespace:              default
Labels:                 app=pn-minions,environment=production
Selector:               app=pn-minions,environment=production
Replicas:               2 updated | 2 total | 2 available | 0 unavailable
OldReplicaSets:         <none>
NewReplicaSet:          pn-minions-deployment-prod-3923308490 (2/2 replicas created)

Name:                   pn-minions-deployment-staging
Namespace:              default
Labels:                 app=pn-minions,environment=staging
Selector:               app=pn-minions,environment=staging
Replicas:               2 updated | 2 total | 2 available | 0 unavailable
OldReplicaSets:         <none>
NewReplicaSet:          pn-minions-deployment-staging-2522417973 (2/2 replicas created)

As you can see, all four pods are on the same node. Is there something I should do to make this work?

【Discussion】:

Tags: kubernetes google-kubernetes-engine


【Solution 1】:

By default, pods run with unbounded CPU and memory limits. This means that any pod in the system can consume as much CPU and memory as is available on the node it runs on. http://kubernetes.io/docs/admin/limitrange/

When you do not specify CPU limits, Kubernetes does not know how much CPU the pods need and will tend to schedule them onto a single node.

Here is an example Deployment:

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      name: jenkins
    spec:
      replicas: 4
      template:
        metadata:
          labels:
            app: jenkins
        spec:
          containers:
            - name: jenkins
              image: quay.io/naveensrinivasan/jenkins:0.4
              ports:
                - containerPort: 8080
              resources:
                limits:
                    cpu: "400m"
    #          volumeMounts:
    #            - mountPath: /var/jenkins_home
    #              name: jenkins-volume
    #      volumes:
    #         - name: jenkins-volume
    #           awsElasticBlockStore:
    #            volumeID: vol-29c4b99f
    #            fsType: ext4
          imagePullSecrets:
             - name: registrypullsecret
    

This is the output of kubectl describe po | grep Node after creating the Deployment:

$ kubectl describe po | grep Node
    Node:       ip-172-20-0-26.us-west-2.compute.internal/172.20.0.26
    Node:       ip-172-20-0-29.us-west-2.compute.internal/172.20.0.29
    Node:       ip-172-20-0-27.us-west-2.compute.internal/172.20.0.27
    Node:       ip-172-20-0-29.us-west-2.compute.internal/172.20.0.29
    

The pods are now created on 4 different nodes, based on the CPU limits relative to the cluster's capacity. You can increase/decrease replicas and watch the pods get deployed onto different nodes.

This is not specific to GKE or AWS.
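Beyond resource requests and limits, newer Kubernetes versions (1.4+) also let you tell the scheduler explicitly not to co-locate pods that carry the same label, via pod anti-affinity. A minimal sketch, reusing the `app=pn-minions` label from the question (the image name is a placeholder, since the question does not show it; `apps/v1` requires a cluster newer than the v1.2.4 shown above):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pn-minions-deployment-prod
spec:
  replicas: 2
  selector:
    matchLabels:
      app: pn-minions
      environment: production
  template:
    metadata:
      labels:
        app: pn-minions
        environment: production
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: never place two pods labeled app=pn-minions
          # on the same node (topologyKey = one node per hostname).
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: pn-minions
              topologyKey: kubernetes.io/hostname
      containers:
        - name: pn-minions
          image: example/pn-minions:latest   # placeholder image
```

With `requiredDuringScheduling...`, a pod stays Pending if no node satisfies the rule; use `preferredDuringSchedulingIgnoredDuringExecution` instead if best-effort spreading is enough.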

【Comments】:

    • "will not know how much CPU the pods need" -> isn't that the job of resource requests? kubernetes.io/docs/user-guide/compute-resources
    • Yes, that is the job of resource requests.
    • I have a default LimitRange in the cluster. As you can see in my question, the pods inherit the default CPU request. Does that mean it is not enough?
    • Try increasing the replicas and see which nodes the pods land on.
    • The new 5th pod was scheduled on a different node than the other 4. Does that mean spreading only happens once CPU requests exceed 50%?
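The cluster-wide default CPU request mentioned in these comments is typically set through a LimitRange object. A hypothetical example of what such a default might look like (the 100m value matches the per-pod requests visible in the node description above):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-cpu
  namespace: default
spec:
  limits:
    - type: Container
      defaultRequest:
        cpu: 100m   # applied to any container that does not set its own request
```

Note the scheduler bin-packs on *requests*, not limits: with 100m requests on 2-CPU nodes, a node only fills up after many pods, which is consistent with spreading starting only at the 5th replica.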