【Question title】: How to set prometheus rules in stable/prometheus chart values.yaml?
【Posted】: 2018-01-26 07:12:11
【Question description】:

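I am using the official Prometheus chart, stable/prometheus.

I customized its values.yaml file to configure alertmanager.yml and the serverFiles section.

In the chart's values.yaml, serverFiles contains rules: {}: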

https://github.com/kubernetes/charts/blob/master/stable/prometheus/values.yaml#L598

It is {}. How do I write real alerting rules here in the official format?

For example, I tried:

  serverFiles:
    alerts: {}
    rules:
    # Alert for any instance that is unreachable for >5 minutes.
    - alert: InstanceDown
      expr: up == 0
      for: 5m
      labels:
        severity: page
      annotations:
        summary: "Instance {{ $labels.instance }} down"
        description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."

Then I ran $ helm install my_prometheus. The pod then reported these errors:

PersistentVolumeClaim is not bound: "sweet-terrier-prometheus-server"
Back-off restarting failed container
Error syncing pod

【Question discussion】:

    Tags: charts configuration yaml rules prometheus


    【Solution 1】:
    serverFiles:
      alerts:
        groups:
        - name: NodeAlerts
          rules:
          - alert: NodeCPUUsage
            expr: (100 - (avg(irate(node_cpu{mode="idle"}[5m])) BY (instance) * 100)) > 75
            for: 2m
            labels:
              severity: alert
            annotations:
              description: '{{$labels.instance}}: CPU usage is above 75% (current value is:
                {{ $value }})'
              summary: '{{$labels.instance}}: High CPU usage detected'
    

    rules is for recording rules; alerts is for alerting rules.

    https://prometheus.io/docs/practices/rules/
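
    For contrast with the alerting example above, here is a minimal sketch of what a recording rule under serverFiles.rules might look like. It assumes the chart accepts the same groups layout for rules as it does for alerts; the group name and the recorded metric name are hypothetical, and the expression is borrowed from the alert above.

    ```yaml
    serverFiles:
      rules:
        groups:
        - name: NodeRecordingRules            # hypothetical group name
          rules:
          # Precompute the per-instance idle CPU rate so that alerts and
          # dashboards can reuse it instead of re-evaluating the expression.
          - record: instance:node_cpu_idle:avg_irate5m   # hypothetical metric name
            expr: avg(irate(node_cpu{mode="idle"}[5m])) BY (instance)
    ```

    A recording rule uses the record key to name a new time series, whereas an alerting rule uses the alert key together with for, labels and annotations.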

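    【Discussion】:

    • Apart from modifying values.yaml, is there a way to "inject" rules during helm install or upgrade? For example, adding rules from a file, or adding them some other way?
    • @alex Did you find a solution?
    • @MikeBevz It has been a long time since I used this. I don't think I did, but I didn't try very hard.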
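
    One possible approach to the first comment, offered as a sketch and not as a confirmed answer from the thread: helm accepts several -f (--values) files and merges them, with later files taking precedence, so the rules can be kept in their own file and left out of values.yaml. The file name alert-rules.yaml and the group name are hypothetical; the rule itself is the one from the question, wrapped in the groups layout shown in the answer.

    ```yaml
    # alert-rules.yaml (hypothetical file name), passed in addition to values.yaml:
    #   helm install stable/prometheus -f values.yaml -f alert-rules.yaml
    serverFiles:
      alerts:
        groups:
        - name: InstanceAlerts               # hypothetical group name
          rules:
          # Alert for any instance that is unreachable for >5 minutes.
          - alert: InstanceDown
            expr: up == 0
            for: 5m
            labels:
              severity: page
            annotations:
              summary: "Instance {{ $labels.instance }} down"
              description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."
    ```

    The same -f flags work with helm upgrade, so the rules file can be changed and reapplied without editing the main values.yaml.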