【Question Title】: Apache Beam - BigQuery streaming insert showing RuntimeException: ManagedChannel allocation site
【Posted】: 2021-08-19 11:18:42
【Question Description】:

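I am running a streaming Apache Beam pipeline on Google Dataflow. It reads data from Kafka and writes it to BigQuery using streaming inserts.

However, the BigQuery streaming insert step emits a large number of warnings like this one: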

    java.lang.RuntimeException: ManagedChannel allocation site
        at io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference.<init> (ManagedChannelOrphanWrapper.java:93)
        at io.grpc.internal.ManagedChannelOrphanWrapper.<init> (ManagedChannelOrphanWrapper.java:53)
        at io.grpc.internal.ManagedChannelOrphanWrapper.<init> (ManagedChannelOrphanWrapper.java:44)
        at io.grpc.internal.ManagedChannelImplBuilder.build (ManagedChannelImplBuilder.java:612)
        at io.grpc.internal.AbstractManagedChannelImplBuilder.build (AbstractManagedChannelImplBuilder.java:261)
        at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.createSingleChannel (InstantiatingGrpcChannelProvider.java:340)
        at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.access$1600 (InstantiatingGrpcChannelProvider.java:73)
        at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider$1.createSingleChannel (InstantiatingGrpcChannelProvider.java:214)
        at com.google.api.gax.grpc.ChannelPool.create (ChannelPool.java:72)
        at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.createChannel (InstantiatingGrpcChannelProvider.java:221)
        at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.getTransportChannel (InstantiatingGrpcChannelProvider.java:204)
        at com.google.api.gax.rpc.ClientContext.create (ClientContext.java:169)
        at com.google.cloud.bigquery.storage.v1beta2.stub.GrpcBigQueryWriteStub.create (GrpcBigQueryWriteStub.java:138)
        at com.google.cloud.bigquery.storage.v1beta2.stub.BigQueryWriteStubSettings.createStub (BigQueryWriteStubSettings.java:145)
        at com.google.cloud.bigquery.storage.v1beta2.BigQueryWriteClient.<init> (BigQueryWriteClient.java:128)
        at com.google.cloud.bigquery.storage.v1beta2.BigQueryWriteClient.create (BigQueryWriteClient.java:109)
        at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl.newBigQueryWriteClient (BigQueryServicesImpl.java:1255)
        at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl.access$800 (BigQueryServicesImpl.java:135)
        at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.<init> (BigQueryServicesImpl.java:521)
        at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.<init> (BigQueryServicesImpl.java:449)
        at org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl.getDatasetService (BigQueryServicesImpl.java:169)
        at org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite.flushRows (BatchedStreamingWrite.java:374)
        at org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite.access$800 (BatchedStreamingWrite.java:69)
        at org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite$BatchAndInsertElements.finishBundle (BatchedStreamingWrite.java:271)
        at org.apache.beam.sdk.io.gcp.bigquery.BatchedStreamingWrite$BatchAndInsertElements$DoFnInvoker.invokeFinishBundle (Unknown Source)
        at org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SimpleDoFnRunner.finishBundle (SimpleDoFnRunner.java:242)
        at org.apache.beam.runners.dataflow.worker.SimpleParDoFn.finishBundle (SimpleParDoFn.java:432)
        at org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.finish (ParDoOperation.java:56)
        at org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute (MapTaskExecutor.java:103)
        at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process (StreamingDataflowWorker.java:1430)
        at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1100 (StreamingDataflowWorker.java:165)
        at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$7.run (StreamingDataflowWorker.java:1109)
        at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:624)
        at java.lang.Thread.run (Thread.java:748)

I am using Apache Beam Java SDK 2.29.0.

Any clue as to what is causing this?

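【Comments】:

  • This looks like a bug in Beam or one of its dependencies rather than a user error. For example, here is a similar bug report on GitHub. Reporting it as a bug on the Apache Beam Jira may help, especially if you have instructions for reproducing the issue.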

Tags: google-bigquery google-cloud-dataflow apache-beam


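【Answer 1】:

I ran into the same problem with an Apache Beam pipeline that reads from Pub/Sub and streams into BigQuery. I was able to "solve" it by downgrading to version 2.28.0 of the Apache Beam Java SDK. The problem appears to have been introduced in SDK version 2.29.0 and is still present in 2.30.0.

【Comments】: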

  • This solved the problem for us as well. For reference, this ticket says the issue will be fixed in release 2.31.0.
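
The version range mentioned in this answer and its comment can be written down as a small check. This is only a sketch: `BeamVersionCheck` is a hypothetical helper, not part of Apache Beam, and its bounds (first seen in 2.29.0, expected fix in 2.31.0) are taken from the reports in this thread rather than from official release notes.

```java
// Hypothetical helper, not part of Apache Beam. The bounds below are
// assumptions based on the reports in this thread.
final class BeamVersionCheck {
    private static final int[] FIRST_AFFECTED = {2, 29, 0};
    private static final int[] FIRST_FIXED = {2, 31, 0};

    // Parses "major.minor.patch", ignoring a qualifier such as "-SNAPSHOT".
    static int[] parse(String version) {
        String core = version.split("-", 2)[0];
        String[] parts = core.split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // Lexicographic comparison of two three-part versions.
    static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    // True when FIRST_AFFECTED <= version < FIRST_FIXED.
    static boolean isAffected(String version) {
        int[] v = parse(version);
        return compare(v, FIRST_AFFECTED) >= 0 && compare(v, FIRST_FIXED) < 0;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"2.28.0", "2.29.0", "2.30.0", "2.31.0"}) {
            System.out.println(v + " affected: " + isAffected(v));
        }
    }
}
```

With these assumed bounds, 2.28.0 and 2.31.0 fall outside the range while 2.29.0 and 2.30.0 fall inside it, which matches the downgrade workaround described above.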

【Answer 2】:

A fix for this issue has just been merged into Beam and will hopefully ship in the next Beam release.

【Comments】:
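
For context on the warning itself: gRPC logs the "ManagedChannel allocation site" trace when a channel is garbage collected without having been shut down, and the trace shows where that channel was created. In the question's stack trace that is a `BigQueryWriteClient` created via `getDatasetService` from `flushRows`. The sketch below illustrates the general leak pattern and the usual remedy with a hypothetical stand-in class; `FakeWriteClient` is not a real Beam or BigQuery type, and this is not a copy of the actual Beam fix.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ChannelLeakSketch {
    // Counts stand-in clients that were created but not yet closed.
    static final AtomicInteger OPEN = new AtomicInteger();

    // Hypothetical stand-in for a client that owns a gRPC channel.
    static final class FakeWriteClient implements AutoCloseable {
        FakeWriteClient() {
            OPEN.incrementAndGet();
        }

        void insert(String row) {
            // No-op: a real client would send the row over its channel.
        }

        @Override
        public void close() {
            OPEN.decrementAndGet();
        }
    }

    // Leaky pattern: a new client per flush that is never closed.
    static void flushLeaky(String row) {
        FakeWriteClient client = new FakeWriteClient();
        client.insert(row);
    }

    // Remedy: try-with-resources closes the client when the flush is done.
    static void flushClosed(String row) {
        try (FakeWriteClient client = new FakeWriteClient()) {
            client.insert(row);
        }
    }

    public static void main(String[] args) {
        flushClosed("a");
        System.out.println("open after flushClosed: " + OPEN.get());
        flushLeaky("b");
        System.out.println("open after flushLeaky: " + OPEN.get());
    }
}
```

Because the leaked object is only reported once it is garbage collected, warnings of this kind tend to show up in bursts and, as noted elsewhere in this thread, do not necessarily stop the pipeline from writing data.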

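【Answer 3】:

I have the same problem. I am writing to BigQuery using batch mode (I also tried streaming). I "solved" it by downgrading to 2.28.0.

Any version equal to or greater than 2.29.0 produces the same error, although my pipeline can still write data to BigQuery.

【Comments】: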
