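【Posted】: 2020-09-21 13:39:03
【Question】:
Using the AlertManager server (installed on the same server as Prometheus), I have configured an alert that fires when the CloudWatch exporter goes down. The rule is as follows: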
groups:
  - name: Alerts
    rules:
      # Alert for any instance that is unreachable for >5 minutes.
      - alert: CloudWatchExporterDown
        expr: up{instance="localhost:9106",job="cloudwatch_exporter"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ .instance }} down"
          description: "{{ .instance }} of job {{ .job }} has been down for more than 5 minutes."
Now I see these errors in /var/log/messages:
Sep 21 03:55:50 ip-10-193-192-40 prometheus: level=warn ts=2020-09-21T03:55:50.728Z caller=alerting.go:343 component="rule manager" alert=CloudWatchExporterDown msg="Expanding alert template failed" err="error executing template __alert_CloudWatchExporterDown: template: __alert_CloudWatchExporterDown:1:92: executing \"__alert_CloudWatchExporterDown\" at <.instance>: can't evaluate field instance in type struct { Labels map[string]string; ExternalLabels map[string]string; Value float64 }" data="unsupported value type"
What is wrong with the rule? Why is the expression {{ .instance }} not evaluated?
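The log message itself suggests a likely cause: the template is executed against a struct whose only fields are Labels, ExternalLabels, and Value, so `.instance` does not name a field of that struct. In Prometheus alerting rules, the labels of the alerting series are normally reached through the `$labels` variable (and the sample value through `$value`). A minimal sketch of the same rule with only the annotation templates changed, assuming nothing else in the setup needs to differ:

```yaml
groups:
  - name: Alerts
    rules:
      # Alert when the CloudWatch exporter has been unreachable for more than 5 minutes.
      - alert: CloudWatchExporterDown
        expr: up{instance="localhost:9106",job="cloudwatch_exporter"} == 0
        for: 5m
        labels:
          severity: critical
        annotations:
          # $labels holds the label set of the series that triggered the alert.
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes."
```

Given the struct shown in the error, `{{ .Labels.instance }}` should be an equivalent spelling; `$labels` is the form used in the Prometheus documentation. Prometheus has to reload the rule file before the change takes effect.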
【Discussion】:
Tags: json prometheus prometheus-alertmanager