【Question title】: NettyBlockTransferService does not respect spark.blockManager.port configuration
【Posted】: 2020-03-26 22:39:14
【Question】:

I am running Spark 2.4.4 on YARN. The Spark configuration on the NodeManagers looks like this:

spark-defaults.conf:

spark.driver.port=38429
spark.blockManager.port=35430
spark.driver.blockManager.port=44349

When the Spark driver and executors are created, they pick up the driver port setting (38429) but not the blockManager (35430) or driver.blockManager (44349) settings. The blockManager ports are assigned randomly.

Driver:

14:23:40 INFO spark.SparkContext: Running Spark version 2.4.4
14:23:40 INFO util.Utils: Successfully started service 'sparkDriver' on port **38429**.
14:23:41 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 38171.
14:23:41 INFO netty.NettyBlockTransferService: Server created on driverhost:**38171**

Executor:

14:23:44 INFO client.TransportClientFactory: Successfully created connection to driverhost:**38429** after 73 ms (0 ms spent in bootstraps)
14:23:45 INFO executor.Executor: Starting executor ID 1 on host ...
14:23:45 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34914.
14:23:45 INFO netty.NettyBlockTransferService: Server created on executorhost:**34914**

I came across a Jira bug that describes this problem, but it was raised against Spark 2.4.0 and closed 12 months ago: https://issues.apache.org/jira/browse/SPARK-27139

Looking at the Spark code on GitHub, I don't see anything obviously wrong:

https://github.com/apache/spark/blob/branch-2.4/core/src/main/scala/org/apache/spark/SparkEnv.scala

val blockManagerPort = if (isDriver) {
  conf.get(DRIVER_BLOCK_MANAGER_PORT)
} else {
  conf.get(BLOCK_MANAGER_PORT)
}

val blockTransferService =
  new NettyBlockTransferService(conf, securityManager, bindAddress, advertiseAddress,
    blockManagerPort, numUsableCores)

https://github.com/apache/spark/blob/branch-2.4/core/src/main/scala/org/apache/spark/internal/config/package.scala

private[spark] val BLOCK_MANAGER_PORT = ConfigBuilder("spark.blockManager.port")
  .doc("Port to use for the block manager when a more specific setting is not provided.")
  .intConf
  .createWithDefault(0)

private[spark] val DRIVER_BLOCK_MANAGER_PORT = ConfigBuilder("spark.driver.blockManager.port")
  .doc("Port to use for the block manager on the driver.")
  .fallbackConf(BLOCK_MANAGER_PORT)
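
The lookup in the two excerpts above can be modelled in a few lines. This is only a minimal sketch, not Spark's actual API: `PortFallbackSketch`, `resolveBlockManagerPort`, and the plain `Map` standing in for the configuration seen by a single JVM are hypothetical names introduced for illustration.

```scala
// Simplified model of the port lookup quoted above; NOT Spark's real API.
object PortFallbackSketch {
  // `settings` is a hypothetical stand-in for the configuration one JVM actually sees.
  def resolveBlockManagerPort(settings: Map[String, String], isDriver: Boolean): Int = {
    // spark.blockManager.port is declared with a default of 0.
    val genericPort =
      settings.get("spark.blockManager.port").map(_.trim.toInt).getOrElse(0)
    if (isDriver) {
      // The driver-specific key falls back to the generic key when it is unset.
      settings.get("spark.driver.blockManager.port").map(_.trim.toInt).getOrElse(genericPort)
    } else {
      genericPort
    }
  }
}
```

With the three settings from the question present, this model yields 44349 for the driver and 35430 for an executor. With an empty map it yields 0 for both, and a port of 0 conventionally means "bind to any free port". The random ports in the logs would therefore be consistent with the settings never reaching the driver and executor processes, rather than with the lookup itself being wrong.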

Can anyone tell me why my NettyBlockTransferService ports are assigned randomly instead of using 35430 or 44349?

【Comments】:

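  • Can you try passing these settings on the spark-submit command line?
  • That seems to have worked, thanks. It still looks like a bug to me, unless something is wrong with my configuration, but I can work around it this way: INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 44349. INFO netty.NettyBlockTransferService: Server created on driverhost:44349
  • Also, I'm not sure why this happens, but the Spark documentation suggests separating keys and values in spark-defaults.conf with whitespace rather than an equals sign. spark.apache.org/docs/latest/…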
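
On the separator question raised in the last comment, a quick check may help. The sketch below assumes that spark-defaults.conf is read in Java properties format (via `java.util.Properties`). That is an assumption about Spark's loader, not something confirmed in this thread. `SeparatorSketch` and `parse` are hypothetical names.

```scala
import java.io.StringReader
import java.util.Properties

// Checks how java.util.Properties treats "=" versus whitespace separators.
// Assumption: spark-defaults.conf is parsed in this properties format.
object SeparatorSketch {
  def parse(text: String): Map[String, String] = {
    val props = new Properties()
    props.load(new StringReader(text))
    var result = Map.empty[String, String]
    val keys = props.stringPropertyNames().iterator()
    while (keys.hasNext) {
      val key = keys.next()
      result += key -> props.getProperty(key).trim
    }
    result
  }
}
```

Under that assumption, `spark.blockManager.port=35430` and `spark.blockManager.port 35430` parse to the same key and value, so the equals signs in the question's file would not by themselves explain the random ports.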

Tags: apache-spark hadoop hadoop-yarn


【Solution 1】:

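The problem here is that this configuration was set on the YARN NodeManagers. It needs to be set on the client (that is, the process that submits the Spark application), not on the cluster itself.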
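
One way to supply the settings from the client is to pass each of them to spark-submit with `--conf key=value`, as suggested in the comments. The following is a minimal sketch that only assembles such a command line. `SubmitArgsSketch` and `confArgs` are hypothetical helpers, and `com.example.MyApp` and `my-app.jar` are placeholders that do not come from the question.

```scala
// Builds a spark-submit command line that carries the port settings from the client.
object SubmitArgsSketch {
  // Renders each setting as a `--conf key=value` pair of arguments.
  def confArgs(settings: Seq[(String, String)]): Seq[String] =
    settings.flatMap { case (key, value) => Seq("--conf", s"$key=$value") }

  def main(args: Array[String]): Unit = {
    val ports = Seq(
      "spark.driver.port" -> "38429",
      "spark.blockManager.port" -> "35430",
      "spark.driver.blockManager.port" -> "44349"
    )
    // com.example.MyApp and my-app.jar are hypothetical placeholders.
    val command = Seq("spark-submit", "--master", "yarn") ++ confArgs(ports) ++
      Seq("--class", "com.example.MyApp", "my-app.jar")
    println(command.mkString(" "))
  }
}
```

Placing the same three keys in the spark-defaults.conf used by the submitting machine should have an equivalent effect, since that file is read by the client rather than by the NodeManagers.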
