【Posted】: 2016-04-27 07:13:59
【Question】:
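I am trying to set up autoscaling based on custom metrics on a Kubernetes 1.2.3 (beta) cluster. (I have already tried CPU-based autoscaling on the cluster, and it worked fine.)
I tried to follow their custom metrics proposal, but I ran into problems creating the necessary setup.
Here is what I have done so far: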
- Added a custom metrics annotation to the pod spec being deployed (similar to the configuration provided in their proposal):

    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: metrix
      namespace: "default"
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            app: metrix
          annotations:
            metrics.alpha.kubernetes.io/custom-endpoints: >
              [
                { "api": "prometheus", "path": "/status", "port": "9090", "names": ["test1"] },
                { "api": "prometheus", "path": "/metrics", "port": "9090" "names": ["test2"] }
              ]
        spec:
          containers:
          - name: metrix
            image: janaka/prometheus-ep:v1
            resources:
              requests:
                cpu: 400m

- Created a Docker container tagged janaka/prometheus-ep:v1 (local) that runs a Prometheus-compatible server on port 9090, with the endpoints /status and /metrics

- Enabled custom metrics on the kubelet by appending --enable-custom-metrics=true to KUBELET_OPTS in /etc/default/kubelet (based on the kubelet CLI reference), and restarted the kubelet
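
Since the annotation value in the first step is a JSON list embedded in YAML, one cheap sanity check is to confirm that it parses before deploying. The following is a minimal sketch, assuming that whatever consumes the annotation expects strict JSON (the question does not confirm this); `check_annotation` is a hypothetical helper name, and the string is copied from the pod spec as posted.

```python
import json

# Annotation value copied from the pod spec exactly as posted in the question.
ANNOTATION_AS_POSTED = """
[
  { "api": "prometheus", "path": "/status", "port": "9090", "names": ["test1"] },
  { "api": "prometheus", "path": "/metrics", "port": "9090" "names": ["test2"] }
]
"""


def check_annotation(value):
    """Return (endpoints, None) if value is a JSON list, else (None, message).

    Hypothetical helper for illustration; it only checks syntax and the
    top-level type, not whether the kubelet accepts the fields.
    """
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError as err:
        return None, f"line {err.lineno}, column {err.colno}: {err.msg}"
    if not isinstance(parsed, list):
        return None, "expected a JSON list of endpoint objects"
    return parsed, None


endpoints, error = check_annotation(ANNOTATION_AS_POSTED)
print("valid" if error is None else f"invalid JSON: {error}")
```

Run against the value as posted, the check reports a parse error in the second entry, where a comma appears to be missing between `"port": "9090"` and `"names"`. Whether that is a slip made while writing the question or is also present in the deployed spec cannot be told from the post.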
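All pods (in the default and kube-system namespaces) are running, and the heapster pod logs do not contain any "abnormal" output either (apart from a small glitch at startup, because InfluxDB was temporarily unavailable):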
$ kubesys logs -f heapster-daftr
I0427 05:07:45.807277 1 heapster.go:60] /heapster --source=kubernetes:https://kubernetes.default --sink=influxdb:http://monitoring-influxdb:8086
I0427 05:07:45.807359 1 heapster.go:61] Heapster version 1.1.0-beta1
I0427 05:07:45.807638 1 configs.go:60] Using Kubernetes client with master "https://kubernetes.default" and version "v1"
I0427 05:07:45.807661 1 configs.go:61] Using kubelet port 10255
E0427 05:08:15.847319 1 influxdb.go:185] issues while creating an InfluxDB sink: failed to ping InfluxDB server at "monitoring-influxdb:8086" - Get http://monitoring-influxdb:8086/ping: dial tcp xxx.xxx.xxx.xxx:8086: i/o timeout, will retry on use
I0427 05:08:15.847376 1 influxdb.go:199] created influxdb sink with options: host:monitoring-influxdb:8086 user:root db:k8s
I0427 05:08:15.847412 1 heapster.go:87] Starting with InfluxDB Sink
I0427 05:08:15.847427 1 heapster.go:87] Starting with Metric Sink
I0427 05:08:15.877349 1 heapster.go:166] Starting heapster on port 8082
I0427 05:08:35.000342 1 manager.go:79] Scraping metrics start: 2016-04-27 05:08:00 +0000 UTC, end: 2016-04-27 05:08:30 +0000 UTC
I0427 05:08:35.035800 1 manager.go:152] ScrapeMetrics: time: 35.209696ms size: 24
I0427 05:08:35.044674 1 influxdb.go:177] Created database "k8s" on influxDB server at "monitoring-influxdb:8086"
I0427 05:09:05.000441 1 manager.go:79] Scraping metrics start: 2016-04-27 05:08:30 +0000 UTC, end: 2016-04-27 05:09:00 +0000 UTC
I0427 05:09:06.682941 1 manager.go:152] ScrapeMetrics: time: 1.682157776s size: 24
I0427 06:43:38.767146 1 manager.go:79] Scraping metrics start: 2016-04-27 05:09:00 +0000 UTC, end: 2016-04-27 05:09:30 +0000 UTC
I0427 06:43:38.810243 1 manager.go:152] ScrapeMetrics: time: 42.940682ms size: 1
I0427 06:44:05.012989 1 manager.go:79] Scraping metrics start: 2016-04-27 06:43:30 +0000 UTC, end: 2016-04-27 06:44:00 +0000 UTC
I0427 06:44:05.063583 1 manager.go:152] ScrapeMetrics: time: 50.368106ms size: 24
I0427 06:44:35.002038 1 manager.go:79] Scraping metrics start: 2016-04-27 06:44:00 +0000 UTC, end: 2016-04-27 06:44:30 +0000 UTC
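However, the custom endpoints are not being scraped. (I verified this by adding stderr logging to my server's startup code and endpoint handlers; only the server initialization logs show up in the pod's kubectl logs.)
As I am new to Kubernetes, any help would be greatly appreciated.
(From my understanding of the proposal and this issue, we don't have to run a separate Prometheus collector in the cluster, since cAdvisor should already be pulling data from the endpoints defined in the pod spec. Is this true, or do I still need a separate Prometheus collector?)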
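
For anyone who wants to reproduce this check, here is a minimal sketch of a server of the kind described. It is not the code inside janaka/prometheus-ep:v1, which the question does not show: `MetricsHandler`, `make_server`, and the sample values are hypothetical, and only port 9090, the paths /status and /metrics, and the metric names test1 and test2 come from the setup above. It serves the Prometheus text exposition format and writes one stderr line per request, so any scrape would be visible in the pod's logs.

```python
import sys
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical sample values; only the paths and metric names come from the question.
METRICS = {
    "/status": "# TYPE test1 gauge\ntest1 1\n",
    "/metrics": "# TYPE test2 gauge\ntest2 42\n",
}


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Log every request to stderr so that a scrape shows up in `kubectl logs`.
        print(f"GET {self.path} from {self.client_address[0]}",
              file=sys.stderr, flush=True)
        body = METRICS.get(self.path)
        if body is None:
            self.send_error(404)
            return
        payload = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        # Silence the default access log; do_GET already logs each request.
        pass


def make_server(port=9090):
    """Create (but do not start) the server, logging initialization to stderr."""
    server = HTTPServer(("0.0.0.0", port), MetricsHandler)
    print(f"server initialized on port {server.server_address[1]}",
          file=sys.stderr, flush=True)
    return server
```

Calling `make_server().serve_forever()` starts it. If only the initialization line ever appears in the logs, as reported above, then nothing has requested either endpoint.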
【Comments】:
Tags: kubernetes metrics autoscaling