【Question title】: Kubernetes API server does not start automatically after the master reboots
【Posted】: 2026-02-27 00:20:03
【Question】:

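I set up a small cluster with kubeadm. It was working fine and port 6443 was up. After I rebooted the system, the cluster no longer comes up.

What should I do?

Here is some information: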

systemctl status kubelet

● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
        └─10-kubeadm.conf
Active: active (running) since Sun 2020-04-05 14:16:44 UTC; 6s ago
  Docs: https://kubernetes.io/docs/home/
  Main PID: 31079 (kubelet)
 Tasks: 20 (limit: 4915)
CGroup: /system.slice/kubelet.service
        └─31079 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet

 k8s.io/kubernetes/pkg/kubelet/kubelet.go:458: Failed to list *v1.Node: Get https://infra01.mydomainname.com:6443/api/v1/nodes?fieldSelector=metadata.name%3Dtest-infra01&limit=500&resourceVersion=0: dial tcp 116.66.187.210:6443: connect: connection refused

kubectl get nodes

The connection to the server infra01.mydomainname.com:6443 was refused - did you specify the right host or port?

kubeadm version

kubeadm version: &version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.3", GitCommit:"06ad960bfd03b39c8310aaf92d1e7c12ce618213", GitTreeState:"clean", BuildDate:"2020-02-11T18:12:12Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"linux/amd64"}

journalctl -xeu kubelet

 6   18167 reflector.go:153] k8s.io/kubernetes/pkg/kubelet/kubelet.go:458: 
           Failed to list *v1.Node: Get https://infra01.mydomainname.com
 1   18167 reflector.go:153] 
           k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:46: Failed to list *v1.Pod: Get https://huawei-infra01.s
 4   18167 aws_credentials.go:77] while getting AWS credentials 
           NoCredentialProviders: no valid providers in chain. Deprecated.
           messaging see aws.Config.CredentialsChainVerboseErrors
 6   18167 kuberuntime_manager.go:211] Container runtime docker initialized, 
           version: 19.03.7, apiVersion: 1.40.0
 6   18167 server.go:1113] Started kubelet
 1   18167 kubelet.go:1302] Image garbage collection failed once. Stats 
           initialization may not have completed yet: failed to get imageF
 8   18167 server.go:144] Starting to listen on 0.0.0.0:10250
 4   18167 server.go:778] Starting healthz server failed: listen tcp 
           127.0.0.1:10248: bind: address already in use
 5   18167 fs_resource_analyzer.go:64] Starting FS ResourceAnalyzer
 4   18167 volume_manager.go:265] Starting Kubelet Volume Manager
 1   18167 desired_state_of_world_populator.go:138] Desired state populator 
           starts to run
 3   18167 server.go:384] Adding debug handlers to kubelet server.
 4   18167 server.go:158] listen tcp 0.0.0.0:10250: bind: address already in 
           use

docker

docker run hello-world
Hello from Docker!

ubuntu

lsb_release -a
Ubuntu 18.04.2 LTS

swap && kubeconfig

swap is turned off and kubeconfig was correctly exported

Note
Resetting the cluster would solve the problem, but that should be the last resort.

【Discussion】:

    Tags: kubernetes ubuntu-18.04 kubectl kubernetes-apiserver kube-apiserver


    【Solution 1】:

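    Because the port is already in use, kubelet does not start, so it cannot create the pod for the API server. Use the following command to find out which process is holding port 10250: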

    [root@master admin]# ss -lntp | grep 10250
    LISTEN     0      128         :::10250                   :::*                   users:(("kubelet",pid=23373,fd=20))
    

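    It gives you the PID and the name of the process. If an unwanted process is holding the port, you can kill it, and the port becomes available to kubelet.

    After killing the process, run the command above again; it should return nothing.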
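As a rough illustration of reading that output, here is a minimal sketch that pulls the process name and PID out of an `ss -lntp` line. The helper name `parse_ss_owner` is hypothetical, and the pattern assumes the `users:(("name",pid=N,...))` layout shown in the sample line above; other `ss` versions may format this column differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of ss or Kubernetes): reads `ss -lntp`
# lines on stdin and prints "<process-name> <pid>" for each line that
# carries a users:(("name",pid=N,...)) field.
parse_ss_owner() {
    sed -n 's/.*users:(("\([^"]*\)",pid=\([0-9]*\),.*/\1 \2/p'
}

# Demo using the sample line from the answer above.
sample='LISTEN     0      128         :::10250                   :::*                   users:(("kubelet",pid=23373,fd=20))'
printf '%s\n' "$sample" | parse_ss_owner
# prints: kubelet 23373
```

On a live node the same helper could be fed directly, for example `ss -lntp | grep 10250 | parse_ss_owner`, which normally has to run as root for the process column to be populated.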

    To be safe, run kubeadm reset and then kubeadm init; it should then go through.

    Edit:

    Using snap stop kubelet made it possible to stop kubelet on the node.

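    【Comments】:

    • I killed the process and then restarted kubelet, and it still shows the same message. I don't want to reset the cluster right now; that should be the last resort.
    • You can try running systemctl stop kubelet.service
    • I ran "systemctl stop kubelet.service" and then killed the process holding port 10250, but a few seconds later another process/kubelet took the same port.
    • When I check the status with "systemctl status kubelet", it says "inactive". So how do I stop the process from holding port 10250?
    • systemctl kill -s SIGKILL kubelet.service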
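The comments above amount to a sequence: stop the unit, kill whatever is left, and confirm the port is free before starting kubelet again. The check in the last step could be sketched as follows. The helper name `port_is_free` is hypothetical, and it assumes `ss -lnt`-style input in which the first column is the state and the fourth column is the local address and port.

```shell
#!/bin/sh
# Hypothetical helper: reads `ss -lnt`-style lines on stdin and returns 0
# when no LISTEN entry ends in ":<port>", non-zero otherwise.
port_is_free() {
    port=$1
    # Compare the tail of column 4 (local address:port) with ":<port>".
    ! awk -v p=":$port" '
        $1 == "LISTEN" && substr($4, length($4) - length(p) + 1) == p { found = 1 }
        END { exit !found }
    '
}

# Demo with the sample line from the answer: port 10250 is still taken.
sample='LISTEN     0      128         :::10250                   :::*'
if printf '%s\n' "$sample" | port_is_free 10250; then
    echo "10250 is free"
else
    echo "10250 is still in use"
fi
```

On the node itself, one possible order is `systemctl stop kubelet.service`, then `systemctl kill -s SIGKILL kubelet.service`, then `ss -lnt | port_is_free 10250 && systemctl start kubelet.service`. If something outside systemd keeps relaunching kubelet (the answer's edit mentions snap), that would also have to be stopped first.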