【Title】: Kubernetes Init Container Exits With No Logs
【Posted】: 2020-07-01 20:54:42
【Description】:

I am deploying a service that uses a local database with Kubernetes. The service is deployed as a StatefulSet with 3 replicas. I have 3 different init containers, and the 3rd one always fails with CrashLoopBackOff. The third init container simply deletes some directories on a mounted volume. I have tried multiple variants of deleting the directories if they exist, combining bash logic or just plain rm -rf. The result is the same: CrashLoopBackOff with no logs.

The specific init container that is failing:

      - name: init-snapshot
        image: camlcasetezos/tezos:mainnet
        command:
        - sh
        - -c
        # - exit 0
        - if [ -d "/mnt/nd/node/data/store" ]; then rm -Rf /mnt/nd/node/data/store; fi
        - if [ -d "/mnt/nd/node/data/context" ]; then rm -Rf /mnt/nd/node/data/context; fi
        volumeMounts:
        - name: node-data
          mountPath: /mnt/nd
        securityContext:
          runAsUser: 100
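[Editor's note, not part of the original post] One detail worth knowing about the manifest above: sh -c treats only the first argument after -c as the script to run; any further arguments become positional parameters ($0, $1, ...) inside that script rather than additional commands. So only the first "if" list item is ever executed, and the second is silently bound to $0. Whether or not this is the cause of the crash, it is easy to demonstrate:

```shell
# sh -c runs only its first argument as a script.
# Extra arguments are bound to $0, $1, ... inside that script
# and are NOT executed as commands.
sh -c 'echo "running: $0"' 'echo "this is never executed"'
# prints: running: echo "this is never executed"
```

This is why two separate list items after `- -c` in a Kubernetes `command:` do not behave like two shell statements.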

The entire StatefulSet:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mainnet-full-node
  labels:
    app: mainnet-full
    component: mainnet-full-node
spec:
  serviceName: mainnet-full-rpc
  replicas: 3
  selector:
    matchLabels:
      app: mainnet-full
      component: mainnet-full-node
  template:
    metadata:
      labels:
        app: mainnet-full
        component: mainnet-full-node
    spec:
      initContainers:
      - name: init-perm
        # Fix the permissions of the storage volumes: chown to the right user.
        image: library/busybox
        command: 
        - sh
        - -c
        - chown -R 100 /mnt/*
        volumeMounts:
        - name: node-data
          mountPath: /mnt/nd
        - name: node-home
          mountPath: /mnt/nh
        securityContext:
          runAsUser: 0
      - name: init-identity
        # Generate a network identity if needed (use to repair the default, then disable)
        image: camlcasetezos/tezos:mainnet
        command: 
        - sh
        - -c
        - exit 0; rm /mnt/nd/node/data/identity.json 2>&1 > /dev/null; /usr/local/bin/tezos-node identity generate 26 --data-dir=/mnt/nd/node/data
        volumeMounts:
        - name: node-data
          mountPath: /mnt/nd
        securityContext:
          runAsUser: 100
      - name: init-snapshot
        # Clean out the store/context directories on the data volume.
        # (The original comment here was a copy-paste of the one above.)
        image: camlcasetezos/tezos:mainnet
        command: 
        - sh
        - -c
        # - exit 0
        - if [ -d "/mnt/nd/node/data/store" ]; then rm -Rf /mnt/nd/node/data/store; fi
        - if [ -d "/mnt/nd/node/data/context" ]; then rm -Rf /mnt/nd/node/data/context; fi
        volumeMounts:
        - name: node-data
          mountPath: /mnt/nd
        securityContext:
          runAsUser: 100
      # We have to use host networking to get the correct address advertised?
      #hostNetwork: true
      containers:
      - name: mainnet-full-node
        image: camlcasetezos/tezos:mainnet
        args: ["tezos-node", "--history-mode", "full"]
        command: # Note the rpc address; block it from your firewall.
        - sh
        - -c
        - /usr/local/bin/tezos-node snapshot import /tmp/mainnet.full --data-dir=/var/run/tezos/node/data
        ports:
        - containerPort: 8732 # management
        - containerPort: 9732 # p2p service
        volumeMounts:
        - name: node-data
          mountPath: "/var/run/tezos"
        - name: node-home
          mountPath: "/home/tezos"
        - name: node-config
          mountPath: /home/tezos/.tezos-node
        - name: local-client-config
          mountPath: /home/tezos/.tezos-client
        securityContext:
          # empirically, this is the uid that gets chosen for the 'tezos'
          # user. Make it explicit.
          runAsUser: 100
      volumes:
      - name: node-data
        persistentVolumeClaim:
          claimName: node-data
      - name: node-config
        configMap:
          name: configs
          items:
          - key: node-config
            path: config
      - name: local-client-config
        configMap:
          name: configs
          items:
          - key: local-client-config
            path: config
  volumeClaimTemplates:
  - metadata:
      name: node-data
    spec:
      accessModes:
      - ReadWriteOnce
      volumeMode: Filesystem
      resources:
        requests:
          storage: 100Gi
      storageClassName: do-block-storage
  - metadata:
      name: node-home
    spec:
      accessModes:
      - ReadWriteOnce
      volumeMode: Filesystem
      resources:
        requests:
          storage: 1Gi
      storageClassName: do-block-storage
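[Editor's note, not from the original post] If the intent of init-snapshot is to run both directory checks in one shell invocation, both if statements need to live in a single string passed to sh -c. A sketch of that variant (same names and paths as the manifest above, YAML block scalar used for readability):

```yaml
      - name: init-snapshot
        image: camlcasetezos/tezos:mainnet
        command:
        - sh
        - -c
        # A single block-scalar string, so both statements run in one shell.
        - |
          if [ -d /mnt/nd/node/data/store ]; then rm -Rf /mnt/nd/node/data/store; fi
          if [ -d /mnt/nd/node/data/context ]; then rm -Rf /mnt/nd/node/data/context; fi
        volumeMounts:
        - name: node-data
          mountPath: /mnt/nd
        securityContext:
          runAsUser: 100
```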

【Discussion】:

  • I don't think you have to chown the mounted volumes. Look into using fsGroup under securityContext; it should set the group ownership of the volumes.
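[Editor's note] A sketch of what that suggestion might look like in the pod template (the gid 100 here is an assumption matching the runAsUser in the post; support depends on the volume plugin):

```yaml
  template:
    spec:
      securityContext:
        # Kubernetes recursively chowns/chmods mounted volumes to this
        # group id at mount time, which can replace the chown-based
        # init-perm container.
        fsGroup: 100
```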

Tags: kubernetes kubernetes-statefulset


【Solution 1】:

Try kubectl logs -p podname to get the logs from the previous, crashed instance of the container.

Since the pod is in a crash loop, the only logs you can see are the ones from just before it crashed.

If that doesn't work, try kubectl describe pod podname and look at the events shown at the bottom. Usually, if something is in CrashLoopBackOff, there will be at least something in the events, even if the pod itself never manages to start.

【Discussion】:

  • Thanks for your help. I couldn't find anything useful in the pod's logs or describe output, but I did move the commands into a bash script inside a ConfigMap, and that resolved the problem.
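[Editor's note] For readers hitting the same issue: the ConfigMap-as-script approach mentioned above might look roughly like this (all names here are illustrative, not from the original post):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: init-scripts        # hypothetical name
data:
  clean-snapshot.sh: |
    #!/bin/sh
    set -e
    rm -rf /mnt/nd/node/data/store /mnt/nd/node/data/context
```

Mounted into the init container via a configMap volume (e.g. at /scripts) and invoked with a single argument, `command: ["sh", "/scripts/clean-snapshot.sh"]`, which sidesteps the multi-argument sh -c pitfall entirely.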