【Question】: Prometheus http_requests custom metric not working in Kubernetes
【Posted】: 2019-06-06 14:56:04
【Description】:

I have a problem with custom metrics for Kubernetes and Prometheus on Amazon AWS. The default CPU and memory metrics work fine, but the Prometheus http_requests metric does not. This is the error:

$ kubectl describe hpa hpa-deploy
Name:                       hpa-deploy
Namespace:                  default
Labels:                     <none>
Annotations:                kubectl.kubernetes.io/last-applied-configuration:
                              {"apiVersion":"autoscaling/v2beta2","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"hpa-deploy","namespace":"default...
CreationTimestamp:          Thu, 06 Jun 2019 11:06:48 +0000
Reference:                  Deployment/django
Metrics:                    ( current / target )
  "http_requests" on pods:  <unknown> / 2k
Min replicas:               1
Max replicas:               10
Deployment pods:            1 current / 0 desired
Conditions:
  Type           Status  Reason               Message
  ----           ------  ------               -------
  AbleToScale    True    SucceededGetScale    the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetPodsMetric  the HPA was unable to compute the replica count: unable to get metric http_requests: unable to fetch metrics from custom metrics API: the server could not find the metric http_requests for pods
Events:
  Type     Reason               Age                     From                       Message
  ----     ------               ----                    ----                       -------
  Warning  FailedGetPodsMetric  8m53s (x414 over 114m)  horizontal-pod-autoscaler  unable to get metric http_requests: unable to fetch metrics from custom metrics API: the server is currently unable to handle the request (get pods.custom.metrics.k8s.io *)
  Warning  FailedGetPodsMetric  3m48s (x12 over 6m36s)  horizontal-pod-autoscaler  unable to get metric http_requests: unable to fetch metrics from custom metrics API: the server could not find the metric http_requests for pods

I installed Prometheus with Helm as suggested by the GitHub project and checked the API:

$ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "custom.metrics.k8s.io/v1beta1",
  "resources": []
}

Then I added the following rule:

$ kubectl edit cm my-release-prometheus-adapter
    rules:
    - seriesQuery: 'http_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}'
      resources:
        overrides:
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
      metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'

The walkthrough says that after the new rule is added, the API check above should return entries inside "resources": [], but it is still empty and I don't know why.
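One thing worth checking (this is an assumption on my part, not something confirmed in the thread): prometheus-adapter only reads its rules configuration at startup, so after editing the ConfigMap the adapter pod has to be recreated before the new metric can show up in discovery. With the release name used above, that might look like:

```shell
# Recreate the adapter pod so it picks up the edited ConfigMap
# (the label selector is an assumption based on the Helm release name above)
kubectl delete pod -l app=prometheus-adapter

# Re-check discovery; "resources" should now list pods/http_requests_per_second
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | python -m json.tool
```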

Here is my HPA manifest:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-deploy
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: django
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests
        target:
          type: Value
          averageValue: 2k

I am also using an Nginx-based ingress controller, but kubectl describe for the HPA targeting the ingress shows:

$ kubectl describe hpa hpa-ingress
Name:                                                      hpa-ingress
Namespace:                                                 default
Labels:                                                    <none>
Annotations:                                               kubectl.kubernetes.io/last-applied-configuration:
                                                             {"apiVersion":"autoscaling/v2beta2","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"hpa-ingress","namespace":"defaul...
CreationTimestamp:                                         Thu, 06 Jun 2019 11:06:48 +0000
Reference:                                                 Ingress/test-ingress
Metrics:                                                   ( current / target )
  "http_requests" on Ingress/test-ingress (target value):  <unknown> / 2k
Min replicas:                                              1
Max replicas:                                              10
Ingress pods:                                              0 current / 0 desired
Conditions:
  Type         Status  Reason          Message
  ----         ------  ------          -------
  AbleToScale  False   FailedGetScale  the HPA controller was unable to get the target's current scale: the server could not find the requested resource
Events:
  Type     Reason          Age                     From                       Message
  ----     ------          ----                    ----                       -------
  Warning  FailedGetScale  2m40s (x473 over 122m)  horizontal-pod-autoscaler  the server could not find the requested resource
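The FailedGetScale condition here is expected: an HPA can only target resources that implement the scale subresource (such as a Deployment, ReplicaSet, or StatefulSet), and an Ingress does not, so Ingress/test-ingress can never be a valid scaleTargetRef. A sketch of the fix is to point the HPA at the Deployment behind the ingress instead:

```yaml
# scaleTargetRef must reference a scalable resource, not an Ingress
scaleTargetRef:
  apiVersion: apps/v1
  kind: Deployment
  name: django   # the Deployment serving the ingress traffic
```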

I am not sure whether I have to export the http_requests metric from the pods manually, and if so, how do I do that? The docs all read like "copy and paste and everything will just work", but that is not the case. Please be as detailed as possible if you can; I am really new to this topic. Thanks a lot.
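To the last question: yes, each pod has to expose the metric itself on an HTTP endpoint that Prometheus scrapes (in practice you would use a client library such as prometheus_client, plus scrape annotations on the pod). As an illustrative sketch of what a /metrics endpoint must return, here is a standard-library-only version; all names in it are made up for the example:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUEST_COUNT = 0  # incremented on every handled request


def render_metrics(count):
    # Prometheus text exposition format: HELP/TYPE comments plus samples
    return (
        "# HELP http_requests_total Total HTTP requests handled.\n"
        "# TYPE http_requests_total counter\n"
        "http_requests_total {}\n".format(count)
    )


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        global REQUEST_COUNT
        REQUEST_COUNT += 1
        if self.path == "/metrics":
            body = render_metrics(REQUEST_COUNT).encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; version=0.0.4")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(200)
            self.end_headers()


# To serve on port 8000:
#   HTTPServer(("", 8000), Handler).serve_forever()
```

The adapter rule above then turns the http_requests_total counter into the http_requests_per_second rate that the custom metrics API serves.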

【Comments】:

  • I am facing a very similar problem. I will keep working on it and update you if I get anywhere.
  • Thanks a lot. I will post anything new I find as well.
  • Someone at the office told me that custom metrics are empty by default and kube won't actually read them until the metrics endpoint has been hit on all the pods. I haven't confirmed this, but I thought I'd share it in case you find it useful.
  • The shared storage where all these custom configs live ran out of space, which kept this feature from working for us. Our fix was to clear it out. I wish I could give more detail, but another team handled it for us.
  • Thanks, I'll give it a try and see what happens.

Tags: kubernetes prometheus-operator


【Answer 1】:

In my case this was caused by a wrong Prometheus endpoint. I found it by raising the adapter's log level to 6 and seeing its Prometheus queries fail with 404 errors in the logs.
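For anyone who wants to reproduce this diagnosis, one way (deployment name assumed from the Helm release in the question) is to add the klog verbosity flag to the adapter and watch its logs:

```shell
# Add --v=6 to the adapter container's args so each Prometheus query is logged
kubectl edit deploy my-release-prometheus-adapter

# Then watch the logs for failing queries / 404 responses
kubectl logs deploy/my-release-prometheus-adapter -f
```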

https://github.com/kubernetes-sigs/prometheus-adapter/blob/master/README.md#why-isnt-my-metric-showing-up

【Discussion】:
