【Question title】: Kafka: Too many open files
【Posted】: 2021-05-22 08:41:20
【Question】:

Has anyone run into a similar problem with Kafka? I am getting this error: Too many open files. I don't know why. Here are some of the logs:

[2018-08-27 10:07:26,268] ERROR Error while deleting the clean shutdown file in dir /home/weihu/kafka/kafka/logs (kafka.server.LogD)
java.nio.file.FileSystemException: /home/weihu/kafka/kafka/logs/BC_20180821_1_LOCATION-87/leader-epoch-checkpoint: Too many open fis
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
        at java.nio.file.Files.newByteChannel(Files.java:361)
        at java.nio.file.Files.createFile(Files.java:632)
        at kafka.server.checkpoints.CheckpointFile.<init>(CheckpointFile.scala:45)
        at kafka.server.checkpoints.LeaderEpochCheckpointFile.<init>(LeaderEpochCheckpointFile.scala:62)
        at kafka.log.Log.initializeLeaderEpochCache(Log.scala:278)
        at kafka.log.Log.<init>(Log.scala:211)
        at kafka.log.Log$.apply(Log.scala:1748)
        at kafka.log.LogManager.loadLog(LogManager.scala:265)
        at kafka.log.LogManager.$anonfun$loadLogs$12(LogManager.scala:335)
        at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:62)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
[2018-08-27 10:07:26,268] ERROR Error while deleting the clean shutdown file in dir /home/weihu/kafka/kafka/logs (kafka.server.LogD)
java.nio.file.FileSystemException: /home/weihu/kafka/kafka/logs/BC_20180822_PARSE-136/leader-epoch-checkpoint: Too many open files
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
        at java.nio.file.Files.newByteChannel(Files.java:361)
        at java.nio.file.Files.createFile(Files.java:632)
        at kafka.server.checkpoints.CheckpointFile.<init>(CheckpointFile.scala:45)
        at kafka.server.checkpoints.LeaderEpochCheckpointFile.<init>(LeaderEpochCheckpointFile.scala:62)
        at kafka.log.Log.initializeLeaderEpochCache(Log.scala:278)
        at kafka.log.Log.<init>(Log.scala:211)
        at kafka.log.Log$.apply(Log.scala:1748)
        at kafka.log.LogManager.loadLog(LogManager.scala:265)
        at kafka.log.LogManager.$anonfun$loadLogs$12(LogManager.scala:335)
        at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:62)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
[2018-08-27 10:07:26,269] ERROR Error while deleting the clean shutdown file in dir /home/weihu/kafka/kafka/logs (kafka.server.LogD)
java.nio.file.FileSystemException: /home/weihu/kafka/kafka/logs/BC_20180813_1_STATISTICS-402/leader-epoch-checkpoint: Too many opens
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
        at java.nio.file.Files.newByteChannel(Files.java:361)
        at java.nio.file.Files.createFile(Files.java:632)
        at kafka.server.checkpoints.CheckpointFile.<init>(CheckpointFile.scala:45)
        at kafka.server.checkpoints.LeaderEpochCheckpointFile.<init>(LeaderEpochCheckpointFile.scala:62)
        at kafka.log.Log.initializeLeaderEpochCache(Log.scala:278)
        at kafka.log.Log.<init>(Log.scala:211)
        at kafka.log.Log$.apply(Log.scala:1748)
        at kafka.log.LogManager.loadLog(LogManager.scala:265)
        at kafka.log.LogManager.$anonfun$loadLogs$12(LogManager.scala:335)
        at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:62)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

【Comments】:

  • This should be moved to the Server Fault community.

Tags: apache-kafka


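【Answer 1】:

In Kafka, every topic is (optionally) split into many partitions. For each partition, the broker maintains several files (for the index and the actual data).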

kafka-topics --zookeeper localhost:2181 --describe --topic topic_name

will give you the number of partitions for the topic topic_name. The default number of partitions per topic, num.partitions, is defined in /etc/kafka/server.properties.
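To check the default mentioned above without opening the file by hand, something like the following may help. It is only a sketch: the function name get_num_partitions is made up for illustration, and the default path is simply the one quoted in this answer, which may differ on your installation.

```shell
#!/usr/bin/env bash
# Print the value of num.partitions from a Kafka broker properties file.
# get_num_partitions is a hypothetical helper; the default path is the one
# named in the answer and may not match your installation.
get_num_partitions() {
  local props=${1:-/etc/kafka/server.properties}
  # Commented-out lines do not match; if the key appears more than once,
  # keep the last occurrence.
  sed -n 's/^[[:space:]]*num\.partitions[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$props" | tail -n 1
}

# Usage: get_num_partitions /etc/kafka/server.properties
```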

If a broker hosts many partitions, and a given partition has many log segment files, the total number of open files can become very large.
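A back-of-the-envelope estimate makes the scale concrete. The sketch below assumes roughly three files per log segment (.log, .index, .timeindex); that figure and the function name estimate_open_files are illustrative assumptions, and network sockets and other descriptors are not counted.

```shell
#!/usr/bin/env bash
# Rough estimate of the files a broker keeps for its partitions.
# files_per_segment defaults to 3 (.log, .index, .timeindex); this is an
# assumption for illustration, not an exact figure for every Kafka version.
estimate_open_files() {
  local partitions=$1 segments_per_partition=$2 files_per_segment=${3:-3}
  echo $(( partitions * segments_per_partition * files_per_segment ))
}

estimate_open_files 1000 10   # prints 30000
```

With numbers like these, a limit of 1024 (a common default soft limit) is exhausted long before all logs are loaded.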

You can see the current file descriptor limit by running

ulimit -n

You can also use lsof to check the number of open files:

lsof | wc -l
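Note that lsof without arguments lists open files for every process it can see, while the limit applies per process. On Linux, a per-process count can be read from /proc, as in the sketch below. The helper names count_fds and soft_nofile_limit are made up for illustration, and looking up the broker by its main class kafka.Kafka is an assumption about how the broker was started.

```shell
#!/usr/bin/env bash
# Count the file descriptors one process currently holds (Linux /proc only).
count_fds() {
  local pid=$1
  ls "/proc/$pid/fd" | wc -l
}

# Print the soft limit on open files that applies to that process.
soft_nofile_limit() {
  local pid=$1
  awk '/^Max open files/ {print $4}' "/proc/$pid/limits"
}

# Usage (assumes the broker runs the kafka.Kafka main class; you may need
# to run this as the broker's user or as root):
#   pid=$(pgrep -f 'kafka\.Kafka' | head -n 1)
#   echo "$(count_fds "$pid") of $(soft_nofile_limit "$pid") descriptors in use"
```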

To fix this, you need to raise the limit on open file descriptors:

ulimit -n <noOfFiles>

or somehow reduce the number of open files (for example, by reducing the number of partitions per topic).
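Keep in mind that ulimit distinguishes a soft limit from a hard limit, and an unprivileged user cannot raise the soft limit above the hard one. A minimal sketch of raising the soft limit while respecting that cap is shown below; raise_nofile is a hypothetical helper, and the change only affects the current shell and the processes started from it.

```shell
#!/usr/bin/env bash
# Set the soft open-files limit for this shell, capped at the hard limit.
# raise_nofile is a hypothetical helper name used for illustration.
raise_nofile() {
  local target=$1
  local hard
  hard=$(ulimit -Hn)
  # An unprivileged process cannot go above the hard limit, so cap the target.
  if [ "$hard" != "unlimited" ] && [ "$target" -gt "$hard" ]; then
    target=$hard
  fi
  ulimit -Sn "$target"
}

# Usage: raise the limit, then start the broker from the same shell.
#   raise_nofile 65536
#   ulimit -Sn    # shows the soft limit now in effect
```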

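【Comments】:

  • Thank you very much, this solved my problem perfectly. It took several attempts to settle on the right number of open files, though.
  • You might mention unix.stackexchange.com/questions/8945/…, since raising the ulimit is subject to system limits.
【Answer 2】: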

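On Linux distributions that use systemd (such as RHEL and CentOS), you need to add the limit settings to the second block, [Service], of the systemd service file. Changing /etc/security/limits.conf alone is not enough.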

[Unit]
Requires=network.target remote-fs.target
After=network.target remote-fs.target

[Service]
Type=simple
User=kafka
LimitAS=infinity
LimitRSS=infinity
LimitCORE=infinity
LimitNOFILE=65536
ExecStart=/home/kafka/kafka/bin/zookeeper-server-start.sh /home/kafka/kafka/config/zookeeper.properties
ExecStop=/home/kafka/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal

[Install]
WantedBy=multi-user.target
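If you would rather not edit the unit file itself, the same setting can go into a systemd drop-in. The sketch below generates one; write_nofile_dropin is a hypothetical helper, and the unit name kafka.service is an assumption (the unit shown above starts ZooKeeper, so adjust the name to whichever service hits the limit).

```shell
#!/usr/bin/env bash
# Write a systemd drop-in that sets LimitNOFILE for one unit.
# dropin_dir is normally /etc/systemd/system/<unit>.d (needs root).
write_nofile_dropin() {
  local dropin_dir=$1 limit=$2
  mkdir -p "$dropin_dir"
  printf '[Service]\nLimitNOFILE=%s\n' "$limit" > "$dropin_dir/override.conf"
}

# Example (assumed unit name kafka.service; run as root):
#   write_nofile_dropin /etc/systemd/system/kafka.service.d 65536
#   systemctl daemon-reload
#   systemctl restart kafka.service
#   systemctl show kafka.service -p LimitNOFILE
```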

【Comments】:

  • I added LimitNOFILE to my service, kafka.service, and it works fine.