【Question title】: Get node labels in a Prometheus Alertmanager rule
【Posted】: 2021-10-13 22:05:51
【Question】:

I have this rule:

- alert: InstanceNotReady
  expr: kube_node_status_condition{condition="Ready", status=~"unknown|false"} == 1
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: {{`Kubernetes node {{ $labels.node }} is in NotReady state`}}
    description: Node entered NotReady unresponsive state

But the metric it queries does not carry the labels assigned to the node:

kube_node_status_condition{
 app_kubernetes_io_instance="prometheus", 
 app_kubernetes_io_managed_by="Helm", 
 app_kubernetes_io_name="kube-state-metrics", 
 argocd_argoproj_io_instance="prometheus", 
 condition="Ready", 
 helm_sh_chart="kube-state-metrics-3.5.2", 
 instance="10.120.1.147:8080", 
 job="kubernetes-service-endpoints", 
 kubernetes_name="prometheus-kube-state-metrics", 
 kubernetes_namespace="prometheus", 
 kubernetes_node="ip-10-120-1-39.us-west-2.compute.internal", 
 node="ip-10-120-3-76.us-west-2.compute.internal", 
 status="unknown"
}

So I need to add the labels assigned to the Kubernetes node to make the alert more informative.

The labels I want are available in kube_node_labels:

kube_node_labels{
  app_kubernetes_io_instance="prometheus", 
  app_kubernetes_io_managed_by="Helm",
  app_kubernetes_io_name="kube-state-metrics",
  argocd_argoproj_io_instance="prometheus", 
  helm_sh_chart="kube-state-metrics-3.5.2", 
  instance="10.120.0.226:8080", 
  job="kubernetes-service-endpoints", 
  kubernetes_name="prometheus-kube-state-metrics", 
  kubernetes_namespace="prometheus", 
  kubernetes_node="ip-10-120-1-39.us-west-2.compute.internal", 
  label_grafana="true", 
  label_node_kubernetes_io_instance_type="t3.small",
  label_node_kubernetes_io_lifecycle="on-demand", 
  label_topology_kubernetes_io_region="us-west-2", 
  node="ip-10-120-3-76.us-west-2.compute.internal"
}

So I want to add these label_* labels to the alert and display them in Slack.

I tried:

kube_node_status_condition{condition="Ready", status=~"false|unknown"}==1 group_left kube_node_labels
kube_node_status_condition{condition="Ready", status=~"false|unknown"}==1 group_left(node) kube_node_labels

Neither worked; both fail with:

Error executing query: invalid parameter "query": 1:75: parse error: unexpected <group_left>

So my questions are:

  • How can I get these labels with a PromQL query?
  • How do I modify the Go template so the alert rule shows the labels with the label_ prefix?

【Comments】:

    Tags: prometheus promql prometheus-alertmanager


    【Answer 1】:

    Solution

    (kube_node_status_condition{condition="Ready", status="unknown"} * on (node) group_right() kube_node_labels) == 1
    

    Output

    {
     app_kubernetes_io_instance="prometheus",
     app_kubernetes_io_managed_by="Helm",
     app_kubernetes_io_name="kube-state-metrics",
     argocd_argoproj_io_instance="prometheus",
     helm_sh_chart="kube-state-metrics-3.5.2",
     instance="10.120.0.226:8080",
     job="kubernetes-service-endpoints",
     kubernetes_name="prometheus-kube-state-metrics",
     kubernetes_namespace="prometheus",
     kubernetes_node="ip-10-120-1-39.us-west-2.compute.internal",
     label_grafana="true",
     label_node_kubernetes_io_instance_type="t3.small",
     label_node_kubernetes_io_lifecycle="on-demand",
     label_topology_kubernetes_io_region="us-west-2",
     node="ip-10-120-3-76.us-west-2.compute.internal"
    }
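
    A note on why the two attempts in the question fail to parse: group_left and group_right are modifiers of a binary operator and must follow an on(...) or ignoring(...) clause, so they cannot be appended to a bare expression. Also, the query above matches only status="unknown". A possible variant that covers status="false" as well, like the original rule, is sketched below. It assumes kube-state-metrics sets exactly one status series to 1 per node and condition, so filtering with == 1 before the join should leave at most one left-hand series per node. It has not been run against this cluster.

    ```promql
    # Keep only the status series that is currently true (value 1),
    # then join on "node" and take the label set from kube_node_labels.
    (kube_node_status_condition{condition="Ready", status=~"unknown|false"} == 1)
      * on (node) group_right() kube_node_labels
    ```

    Because kube_node_labels always has the value 1, the product stays 1 and the result carries the label_* labels from the right-hand side.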
    

    For more details, see: https://www.robustperception.io/left-joins-in-promql
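
    For the second question (showing the label_* labels in the alert text), one option might be to iterate over $labels in an annotation and keep only the keys that start with label_. This is a rough sketch, not a tested rule: it relies on the match template function that Prometheus provides (pattern first, then text), and it assumes the Slack receiver's message template renders the description annotation.

    ```yaml
    - alert: InstanceNotReady
      expr: (kube_node_status_condition{condition="Ready", status="unknown"} * on (node) group_right() kube_node_labels) == 1
      for: 1m
      labels:
        severity: critical
      annotations:
        summary: 'Kubernetes node {{ $labels.node }} is in NotReady state'
        # Appends every label whose name starts with "label_" as key=value.
        description: >-
          Node entered NotReady unresponsive state.
          Node labels:{{ range $k, $v := $labels }}{{ if match "^label_" $k }} {{ $k }}={{ $v }}{{ end }}{{ end }}
    ```

    If the rule lives in a Helm chart, as in the question, the templated values would still need the {{` ... `}} wrapping so that Helm passes them through to Prometheus unchanged.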

    【Comments】:
