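【Posted】: 2020-05-04 17:38:15
【Problem description】:
I created a Kubernetes cluster from two Azure Ubuntu VMs and am trying to monitor it. To do that, I deployed a node-exporter DaemonSet, Heapster, Prometheus, and Grafana, and configured node-exporter as a target in the Prometheus-rules file. However, I get the error Get http://master-ip:30002/metrics: context deadline exceeded. I have also increased the scrape_interval and scrape_timeout values in the Prometheus-rules file.
Below are the Prometheus-rules file and the manifests for the node-exporter DaemonSet and Service.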
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: node-exporter
  name: node-exporter
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      containers:
      - args:
        - --web.listen-address=<master-IP>:30002
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --path.rootfs=/host/root
        - --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
        - --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$
        image: quay.io/prometheus/node-exporter:v0.18.1
        name: node-exporter
        resources:
          limits:
            cpu: 250m
            memory: 180Mi
          requests:
            cpu: 102m
            memory: 180Mi
        volumeMounts:
        - mountPath: /host/proc
          name: proc
          readOnly: false
        - mountPath: /host/sys
          name: sys
          readOnly: false
        - mountPath: /host/root
          mountPropagation: HostToContainer
          name: root
          readOnly: true
      - args:
        - --logtostderr
        - --secure-listen-address=[$(IP)]:9100
        - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
        - --upstream=http://<master-IP>:30002/
        env:
        - name: IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        image: quay.io/coreos/kube-rbac-proxy:v0.4.1
        name: kube-rbac-proxy
        ports:
        - containerPort: 9100
          hostPort: 9100
          name: https
        resources:
          limits:
            cpu: 20m
            memory: 40Mi
          requests:
            cpu: 10m
            memory: 20Mi
      hostNetwork: true
      hostPID: true
      nodeSelector:
        kubernetes.io/os: linux
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: node-exporter
      tolerations:
      - operator: Exists
      volumes:
      - hostPath:
          path: /proc
        name: proc
      - hostPath:
          path: /sys
        name: sys
      - hostPath:
          path: /
        name: root
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: node-exporter
  name: node-exporter
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - name: https
    port: 9100
    targetPort: https
    nodePort: 30002
  selector:
    app: node-exporter
---prometheus-config-map.yaml-----
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-server-conf
  labels:
    name: prometheus-server-conf
  namespace: default
data:
  prometheus.yml: |-
    global:
      scrape_interval: 5m
      evaluation_interval: 3m
    scrape_configs:
      - job_name: 'node'
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        static_configs:
        - targets: ['<master-IP>:30002']
      - job_name: 'kubernetes-apiservers'
        kubernetes_sd_configs:
        - role: endpoints
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
        - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
          action: keep
          regex: default;kubernetes;https
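Can the node-exporter DaemonSet be exposed through a NodePort Service? If the answer is no, how should it be configured as a target in the Prometheus-rules file? Can anyone help me understand this scenario? Links to any suggested reading are also welcome.
【Discussion】:
-
Are there any errors in the DaemonSet's logs?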
-
Hi @Arghya Sadhu. I checked the logs of the node-exporter DaemonSet. My DaemonSet actually consists of two containers. The first container (node-exporter) logs
Listening on localhost:9100" source="node_exporter.go:170". The second container logs Listening securely on [10.0.0.4]:9100. So how do I configure the node-exporter DaemonSet in the Prometheus-rules file?
-
Currently, this is the error I see on the Prometheus targets page: `Get localhost:9100/metrics: dial tcp 127.0.0.1:9100: connect: connection refused`
-
Thanks @jt97. It worked for me.
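The suggestion that resolved the issue is not preserved in this thread. As a hedged sketch only, one common way to scrape a node-exporter DaemonSet that sits behind kube-rbac-proxy is to let Prometheus discover the Service's endpoints rather than pointing a static target at a NodePort or at localhost. The job below is a fragment meant to sit under scrape_configs. It assumes the layout the logs above suggest (node-exporter listening on localhost:9100, kube-rbac-proxy serving HTTPS on the host IP at port 9100), the node-exporter Service in kube-system from the manifest, and a Prometheus service account that is allowed to list endpoints and to GET /metrics. The job name and the insecure_skip_verify setting are illustrative assumptions, not taken from the thread.

```yaml
# Hypothetical job; add under scrape_configs in prometheus.yml.
- job_name: 'node-exporter'
  scheme: https
  tls_config:
    # Assumption: kube-rbac-proxy serves a self-signed certificate here.
    insecure_skip_verify: true
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  # Keep only the "https" port of the node-exporter Service in kube-system.
  - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
    action: keep
    regex: kube-system;node-exporter;https
```

Because the pods run with hostNetwork: true, the discovered endpoints would be the node IPs on port 9100, so each node shows up as its own target.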
Tags: kubernetes prometheus-node-exporter