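[Posted]: 2020-10-15 13:50:22
[Question]:
I have a mongo db replica set running on a kubernetes cluster (on AWS EKS), say cluster-1. It runs in VPC-1, which has the CIDR 192.174.0.0/16.
I have another cluster in a separate VPC, say VPC-2, where I will run some applications on top of the mongo cluster. This VPC's CIDR range is 192.176.0.0/16. All VPC peering and security group ingress/egress rules are working fine, and I am able to ping the cluster nodes across the two VPCs.
I am using a NodePort type service and a StatefulSet for the mongo cluster: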
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
  labels:
    name: mongodb
spec:
  selector:
    role: mongo
  type: NodePort
  ports:
    - port: 26017
      targetPort: 27017
      nodePort: 30017
Here are the nodes and pods in the mongo cluster, cluster-1:
ubuntu@ip-192-174-5-253:/st_config/kubeobj$ kubectl get nodes -o wide
NAME                                             STATUS   ROLES    AGE   VERSION              INTERNAL-IP       EXTERNAL-IP      OS-IMAGE         KERNEL-VERSION                  CONTAINER-RUNTIME
ip-192-174-187-133.ap-south-1.compute.internal   Ready    <none>   19h   v1.16.8-eks-e16311   192.174.187.133   13.232.195.39    Amazon Linux 2   4.14.181-140.257.amzn2.x86_64   docker://19.3.6
ip-192-174-23-229.ap-south-1.compute.internal    Ready    <none>   19h   v1.16.8-eks-e16311   192.174.23.229    13.234.111.139   Amazon Linux 2   4.14.181-140.257.amzn2.x86_64   docker://19.3.6
ubuntu@ip-192-174-5-253:/st_config/kubeobj$
ubuntu@ip-192-174-5-253:/st_config/kubeobj$ kubectl get pods -o wide
NAME       READY   STATUS    RESTARTS   AGE   IP                NODE                                             NOMINATED NODE   READINESS GATES
mongod-0   1/1     Running   0          45m   192.174.8.10      ip-192-174-23-229.ap-south-1.compute.internal    <none>           <none>
mongod-1   1/1     Running   0          44m   192.174.133.136   ip-192-174-187-133.ap-south-1.compute.internal   <none>           <none>
ubuntu@ip-192-174-5-253:/st_config/kubeobj$
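If I try to connect using one specific node address, or both node addresses, kubernetes appears to load-balance or round-robin the connections, so the same node address lands on the primary one time and on a secondary the next: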
ubuntu@ip-192-176-42-206:~$ mongo mongodb://192.174.23.229:30017
MongoDB shell version v3.6.3
connecting to: mongodb://192.174.23.229:30017
MongoDB server version: 3.4.24
WARNING: shell and server versions do not match
test_rs0:PRIMARY>
ubuntu@ip-192-176-42-206:~$ mongo mongodb://192.174.23.229:30017
MongoDB shell version v3.6.3
connecting to: mongodb://192.174.23.229:30017
MongoDB server version: 3.4.24
WARNING: shell and server versions do not match
test_rs0:SECONDARY>
ubuntu@ip-192-176-42-206:~$ mongo mongodb://192.174.23.229:30017,192.174.187.133:30017
MongoDB shell version v3.6.3
connecting to: mongodb://192.174.23.229:30017,192.174.187.133:30017
MongoDB server version: 3.4.24
WARNING: shell and server versions do not match
test_rs0:PRIMARY>
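I want to take advantage of the replica set features. So when I use the connection string mongodb://192.174.23.229:30017,192.174.187.133:30017/?replicaSet=test_rs0, the shell actually picks up the FQDNs of the pods, which do not resolve from the nodes/pods of the cluster in VPC-2.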
ubuntu@ip-192-176-42-206:~$ mongo mongodb://192.174.23.229:30017,192.174.187.133:30017/?replicaSet=test_rs0
MongoDB shell version v3.6.3
connecting to: mongodb://192.174.23.229:30017,192.174.187.133:30017/?replicaSet=test_rs0
2020-06-23T15:59:07.407+0000 I NETWORK [thread1] Starting new replica set monitor for test_rs0/192.174.23.229:30017,192.174.187.133:30017
2020-06-23T15:59:07.409+0000 I NETWORK [ReplicaSetMonitor-TaskExecutor-0] Successfully connected to 192.174.23.229:30017 (1 connections now open to 192.174.23.229:30017 with a 5 second timeout)
2020-06-23T15:59:07.409+0000 I NETWORK [thread1] Successfully connected to 192.174.187.133:30017 (1 connections now open to 192.174.187.133:30017 with a 5 second timeout)
2020-06-23T15:59:07.410+0000 I NETWORK [thread1] changing hosts to test_rs0/mongod-0.mongodb-service.default.svc.cluster.local:27017,mongod-1.mongodb-service.default.svc.cluster.local:27017 from test_rs0/192.174.187.133:30017,192.174.23.229:30017
2020-06-23T15:59:07.415+0000 I NETWORK [thread1] getaddrinfo("mongod-1.mongodb-service.default.svc.cluster.local") failed: Name or service not known
2020-06-23T15:59:07.415+0000 I NETWORK [ReplicaSetMonitor-TaskExecutor-0] getaddrinfo("mongod-0.mongodb-service.default.svc.cluster.local") failed: Name or service not known
2020-06-23T15:59:07.917+0000 I NETWORK [thread1] getaddrinfo("mongod-0.mongodb-service.default.svc.cluster.local") failed: Name or service not known
2020-06-23T15:59:07.918+0000 I NETWORK [thread1] getaddrinfo("mongod-1.mongodb-service.default.svc.cluster.local") failed: Name or service not known
2020-06-23T15:59:07.918+0000 W NETWORK [thread1] Unable to reach primary for set test_rs0
2020-06-23T15:59:07.918+0000 I NETWORK [thread1] Cannot reach any nodes for set test_rs0. Please check network connectivity and the status of the set. This has happened for 1 checks in a row.
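To double-check that this is purely a name-resolution problem and not a connectivity one, the lookup can be reproduced outside the mongo shell. Below is a minimal sketch, assuming Python 3 is available on the VPC-2 node. The helper name `unresolvable_hosts` and its `resolver` parameter are made up for illustration; the member names are the ones from the "changing hosts to" log line above.

```python
import socket
from typing import Callable, Iterable, List


def unresolvable_hosts(
    members: Iterable[str],
    resolver: Callable[..., object] = socket.getaddrinfo,
) -> List[str]:
    """Return the "host:port" members whose host name cannot be resolved.

    Assumes every member is written as "host:port", as in the mongo log.
    """
    failed = []
    for member in members:
        host, _, port = member.rpartition(":")
        try:
            # Same kind of lookup the mongo shell reports as getaddrinfo(...).
            resolver(host, int(port))
        except socket.gaierror:
            failed.append(member)
    return failed


if __name__ == "__main__":
    # Members advertised by the replica set, copied from the log above.
    advertised = [
        "mongod-0.mongodb-service.default.svc.cluster.local:27017",
        "mongod-1.mongodb-service.default.svc.cluster.local:27017",
    ]
    for member in unresolvable_hosts(advertised):
        print("cannot resolve:", member)
```

Run from a VPC-2 node, I would expect this to list both mongod-* names, matching the getaddrinfo failures in the log, since those names only exist in the cluster DNS of cluster-1.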
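Do I need some additional DNS service so that these names can be resolved on the VPC-2 nodes? What would be the best approach?
Also, how can I use a connection string based on the service name, e.g. mongodb://mongodb-service.default.svc.cluster.local:/?replicaSet=test_rs0, from any node in VPC-2? It works from any pod in VPC-1. But I need it to work from the pods of the cluster in VPC-2, so that I don't have to specify particular pod/node IPs in the connection string. All my kubernetes objects are in the default namespace.
Would really appreciate some help here. **Please note: I am not using helm**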
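For clarity, every connection string in this question has the same shape: a comma-separated seed list followed by an optional replicaSet option. Only the seed list differs between the node-IP form and the service-name form. A small sketch of that shape, where the function name `build_mongo_uri` is hypothetical and not part of any driver:

```python
from typing import Iterable, Optional


def build_mongo_uri(seeds: Iterable[str], replica_set: Optional[str] = None) -> str:
    """Join "host:port" seeds into a mongodb:// URI, optionally naming the replica set."""
    uri = "mongodb://" + ",".join(seeds) + "/"
    if replica_set:
        uri += "?replicaSet=" + replica_set
    return uri


if __name__ == "__main__":
    # Node IPs and NodePort taken from the cluster-1 output earlier in this post.
    print(build_mongo_uri(["192.174.23.229:30017", "192.174.187.133:30017"], "test_rs0"))
    # prints mongodb://192.174.23.229:30017,192.174.187.133:30017/?replicaSet=test_rs0
```

What I am after is being able to pass a service-based name as the seed here instead of the node IPs.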
[Comments]:
- @SureshVishnoi could you take a look and suggest something?
Tags: mongodb kubernetes