【发布时间】:2020-10-23 07:07:15
【问题描述】:
我在 Dataflow 作业中使用 Google Maps Geocode API (https://github.com/googlemaps/google-maps-services-java)。我的 DoFn 在设置时准备 GeoApiContext。流程元素函数是这样完成的:
public void processElement(ProcessContext c) {
String address = c.element().get("Address").toString();
String id = c.element().get("Id").toString();
Gson gson = new GsonBuilder().create();
try {
GeocodingResult[] results = GeocodingApi.newRequest(this.geocodeContext).address(address).language("pt-BR").components(ComponentFilter.country("BR")).await();
if(results.length == 0) {
TableRow outputRow = new TableRow();
outputRow.set("Id", id);
c.output(outputRow);
} else {
for(GeocodingResult r : results) {
TableRow outputRow = convertTableRow(gson.toJson(r).toString());
outputRow.set("Id", id);
c.output(outputRow);
}
}
} catch(ApiException e) {
LOGGER.error("ApiException on address: {}", address, e);
} catch(InterruptedException e) {
LOGGER.error("InterruptedException on address: {}", address, e);
} catch(IOException e) {
LOGGER.error("IOException on address: {}", address, e);
}
}
此代码在本地运行良好,但在部署到数据流时会引发网络错误:
exception: "java.net.ConnectException: Failed to connect to maps.googleapis.com/2607:f8b0:4001:c05:0:0:0:5f:443
at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:265)
at okhttp3.internal.connection.RealConnection.connect(RealConnection.java:183)
at okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.java:224)
at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.java:108)
at okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.java:88)
at okhttp3.internal.connection.Transmitter.newExchange(Transmitter.java:169)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:41)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:94)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:88)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:142)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:117)
at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:229)
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:172)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.net.ConnectException: Network is unreachable (connect failed)
at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242)
at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224)
at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:403)
at java.base/java.net.Socket.connect(Socket.java:591)
at okhttp3.internal.platform.Platform.connectSocket(Platform.java:130)
at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.java:263)
... 22 more
我已确保生成的 VM 可以访问互联网,我什至可以从容器内部 ping maps.googleapis.com 端点:
USER@test-geocode-07020834-qmrj-harness-3k2l ~ $ docker container ls
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b2fd123138aa 3a1cb7aedd54 "/opt/google/dataflo…" 6 minutes ago Up 5 minutes k8s_healthchecker_dataflow-test-geocode-07020834-qmrj-harness-3k2l_default_5648e9815f2ca5beea8b0eb945e12d1f_0
086e36c3dd23 4127911f4769 "/opt/google/dataflo…" 6 minutes ago Up 5 minutes k8s_vmmonitor_dataflow-test-geocode-07020834-qmrj-harness-3k2l_default_5648e9815f2ca5beea8b0eb945e12d1f_0
2890fa415af5 664bd8972b23 "/opt/google/dataflo…" 6 minutes ago Up 6 minutes k8s_shuffle_dataflow-test-geocode-07020834-qmrj-harness-3k2l_default_5648e9815f2ca5beea8b0eb945e12d1f_0
eea757bf6be7 gcr.io/cloud-dataflow/v1beta3/beam-java11-batch "/opt/google/dataflo…" 6 minutes ago Up 6 minutes k8s_java-batch_dataflow-test-geocode-07020834-qmrj-harness-3k2l_default_5648e9815f2ca5beea8b0eb945e12d1f_0
b636784118f5 k8s.gcr.io/pause:3.1 "/pause" 6 minutes ago Up 6 minutes k8s_POD_dataflow-test-geocode-07020834-qmrj-harness-3k2l_default_5648e9815f2ca5beea8b0eb945e12d1f_0
lucas@test-geocode-07020834-qmrj-harness-3k2l ~ $ docker exec -it eea /bin/sh
# ping maps.googleapis.com
PING maps.googleapis.com (172.217.214.95) 56(84) bytes of data.
64 bytes from 172.217.214.95: icmp_seq=1 ttl=115 time=1.08 ms
64 bytes from 172.217.214.95: icmp_seq=2 ttl=115 time=1.28 ms
64 bytes from 172.217.214.95: icmp_seq=3 ttl=115 time=1.15 ms
64 bytes from 172.217.214.95: icmp_seq=4 ttl=115 time=1.41 ms
^C
--- maps.googleapis.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 1.089/1.235/1.414/0.131 ms
#
关于版本,我使用的是最新的 beam 版本 (2.22.0) 和最新的 google maps 版本 (0.14.0)。
不知道还有什么可以看的,感谢任何帮助。
更新
问题似乎是请求是使用 ipv6 地址完成的。但是,GCE 机器似乎不支持 ipv6,并且调用只是失败而没有退回到 ipv4。
考虑到这一点,似乎没有办法解决这个问题:
- 无法使用 Dataflow 将 JVM 配置为首选 ipv4 地址(忽略 JVM 标志)
- 也无法自定义 GCE 机器(因为使用了基础 Dataflow 映像)
- 该库似乎没有打开任何配置 ipv4 或 ipv6 的选项
谢谢
【问题讨论】:
-
您是否指定了任何network 参数来提交作业?您是否为 Dataflow 工作人员分配了外部 IP 地址?
-
是的,它正确指向我的 VPC 及其子网。这些机器有外部 IP,但也配置了 Cloud NAT(不知道这是否会以某种方式干扰)。无论如何,如前所述,我可以 ssh 进入机器并检查互联网连接是否正常,因此它似乎不太可能只是网络或防火墙配置。
-
您能否分享最新更新中给出的发现结果、撰写答案并帮助进一步的贡献者进行研究?
-
真的很抱歉,但我不明白您所说的共享发现结果究竟是什么意思。如果您可以进一步详细说明,我很乐意用其他任何需要修改问题。
-
显然,您正在尝试通过 GCE 实例当前不支持的 IPv6 协议连接到
maps.googleapis.com。你会考虑切换到 IPv4 吗?
标签: java google-cloud-platform okhttp dataflow google-geocoding-api