【Posted】: 2021-10-13 22:05:51
【Question】:
I have this alerting rule:
- alert: InstanceNotReady
  expr: kube_node_status_condition{condition="Ready", status=~"unknown|false"} == 1
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: {{`Kubernetes node {{ $labels.node }} is in NotReady state`}}
    description: Node entered NotReady unresponsive state
But the resulting series does not carry the labels assigned to the node:
kube_node_status_condition{
app_kubernetes_io_instance="prometheus",
app_kubernetes_io_managed_by="Helm",
app_kubernetes_io_name="kube-state-metrics",
argocd_argoproj_io_instance="prometheus",
condition="Ready",
helm_sh_chart="kube-state-metrics-3.5.2",
instance="10.120.1.147:8080",
job="kubernetes-service-endpoints",
kubernetes_name="prometheus-kube-state-metrics",
kubernetes_namespace="prometheus",
kubernetes_node="ip-10-120-1-39.us-west-2.compute.internal",
node="ip-10-120-3-76.us-west-2.compute.internal",
status="unknown"
}
So I need to add the labels assigned to the Kubernetes node to make the alert more informative.
The labels I want are available on kube_node_labels:
kube_node_labels{
app_kubernetes_io_instance="prometheus",
app_kubernetes_io_managed_by="Helm",
app_kubernetes_io_name="kube-state-metrics",
argocd_argoproj_io_instance="prometheus",
helm_sh_chart="kube-state-metrics-3.5.2",
instance="10.120.0.226:8080",
job="kubernetes-service-endpoints",
kubernetes_name="prometheus-kube-state-metrics",
kubernetes_namespace="prometheus",
kubernetes_node="ip-10-120-1-39.us-west-2.compute.internal",
label_grafana="true",
label_node_kubernetes_io_instance_type="t3.small",
label_node_kubernetes_io_lifecycle="on-demand",
label_topology_kubernetes_io_region="us-west-2",
node="ip-10-120-3-76.us-west-2.compute.internal"
}
I want to attach these label_* labels to the alert and display them in Slack.
I tried:
kube_node_status_condition{condition="Ready", status=~"false|unknown"}==1 group_left kube_node_labels
kube_node_status_condition{condition="Ready", status=~"false|unknown"}==1 group_left(node) kube_node_labels
Neither worked; both fail with:
Error executing query: invalid parameter "query": 1:75: parse error: unexpected <group_left>
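So my questions are:
- How can I get these labels with a PromQL query?
- How should I modify the Go template to display the alert rule labels that have the label_ prefix?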
【Discussion】:
Tags: prometheus promql prometheus-alertmanager
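The parse error is consistent with how PromQL treats group_left: it is a modifier of a binary operator and is only accepted after on(...) or ignoring(...), and the two attempts above have neither a binary operator between the two vectors nor an on(...) clause. A minimal sketch of the rule with a join follows. It assumes that kube_node_labels has the value 1 (so multiplying leaves the left-hand value unchanged), that there is exactly one kube_node_labels series per node, and that the rule file is still rendered through Helm as in the original, hence the backtick escaping. The label names listed in group_left are copied from the sample series in the question; the description template is a hypothetical illustration that uses the Prometheus template function match to keep only the label_ keys.

```yaml
- alert: InstanceNotReady
  # Join on "node" only: the two series differ in "instance", so a default
  # match on all labels would find no pairs.
  # group_left(...) copies the listed labels from kube_node_labels (right side)
  # onto the matching kube_node_status_condition series (left side).
  # Assumption: kube_node_labels is always 1, so "*" keeps the left value.
  expr: |
    (kube_node_status_condition{condition="Ready", status=~"unknown|false"} == 1)
      * on (node) group_left (
          label_grafana,
          label_node_kubernetes_io_instance_type,
          label_node_kubernetes_io_lifecycle,
          label_topology_kubernetes_io_region
        )
      kube_node_labels
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: {{`Kubernetes node {{ $labels.node }} is in NotReady state`}}
    # Hypothetical template: iterate over the alert's labels and print only
    # those whose name starts with "label_".
    description: {{`Node entered NotReady state;{{ range $k, $v := $labels }}{{ if match "^label_" $k }} {{ $k }}={{ $v }}{{ end }}{{ end }}`}}
```

If more than one kube-state-metrics instance is scraped, kube_node_labels may return several series per node and the join would then fail with a many-to-many matching error; aggregating the right-hand side first (for example with max by (node, ...) over the wanted labels) is one way around that. Once the label_* labels are on the alert, they are sent to Alertmanager together with the annotations, so the Slack message can render the description annotation as usual.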