Title: DSE Node Decommissions Itself
Posted: 2014-06-25 08:35:49
Question:

We are currently in a situation where a DSE node decided to decommission itself. It appears the node first hit a Too many open files error and then decided it could remove itself from the ring because the disk is FULL. Setting aside the philosophical question of whether a node should ever remove itself, the disk was only about a quarter full.

Here are the relevant entries from the log file:

ERROR [pool-1-thread-1] 2014-06-20 01:53:19,957 DiskHealthChecker.java (line 62)  Error  in finding disk space for directory /raid0/cassandra/data
java.io.IOException: Cannot run program "df": error=24, Too many open files
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1041)
        at java.lang.Runtime.exec(Runtime.java:617)
        at java.lang.Runtime.exec(Runtime.java:485)
        at org.apache.commons.io.FileSystemUtils.openProcess(FileSystemUtils.java:535)
        at org.apache.commons.io.FileSystemUtils.performCommand(FileSystemUtils.java:482)
        at org.apache.commons.io.FileSystemUtils.freeSpaceUnix(FileSystemUtils.java:396)
        at org.apache.commons.io.FileSystemUtils.freeSpaceOS(FileSystemUtils.java:266)
        at org.apache.commons.io.FileSystemUtils.freeSpaceKb(FileSystemUtils.java:200)
        at org.apache.commons.io.FileSystemUtils.freeSpaceKb(FileSystemUtils.java:171)
        at com.datastax.bdp.util.DiskHealthChecker.checkDiskSpace(DiskHealthChecker.java:52)
        at com.datastax.bdp.util.DiskHealthChecker.checkDiskSpace(DiskHealthChecker.java:71)
        at com.datastax.bdp.util.DiskHealthChecker.checkDiskSpace(DiskHealthChecker.java:71)
        at com.datastax.bdp.util.DiskHealthChecker.checkDiskSpace(DiskHealthChecker.java:71)
        at com.datastax.bdp.util.DiskHealthChecker.checkDiskSpace(DiskHealthChecker.java:71)
        at com.datastax.bdp.util.DiskHealthChecker.checkDiskSpace(DiskHealthChecker.java:71)
        at com.datastax.bdp.util.DiskHealthChecker.checkDiskSpace(DiskHealthChecker.java:71)
        at com.datastax.bdp.util.DiskHealthChecker.access$000(DiskHealthChecker.java:18)
        at com.datastax.bdp.util.DiskHealthChecker$DiskHealthCheckTask.run(DiskHealthChecker.java:104)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: error=24, Too many open files
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:135)
        at java.lang.ProcessImpl.start(ProcessImpl.java:130)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1022)
        ... 24 more
 INFO [pool-1-thread-1] 2014-06-20 01:53:19,959 DiskHealthChecker.java (line 82) Removing this node from the ring for the disk is close to FULL
 INFO [pool-1-thread-1] 2014-06-20 01:53:19,996 StorageService.java (line 947) LEAVING: sleeping 30000 ms for pending range setup
ERROR [ReadStage:30] 2014-06-20 01:53:22,058 CassandraDaemon.java (line 191) Exception in thread Thread[ReadStage:30,5,main]
java.lang.RuntimeException: java.lang.RuntimeException: java.io.FileNotFoundException: /raid0/cassandra/data/linkcurrent_search/content_items/linkcurrent_search-content_items-ic-1803-Data.db (Too many open files)
        at org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:64)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: /raid0/cassandra/data/linkcurrent_search/content_items/linkcurrent_search-content_items-ic-1803-Data.db (Too many open files)
        at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:58)
        at org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:1213)
        at org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:66)
        at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1017)
        at org.apache.cassandra.db.RowIteratorFactory.getIterator(RowIteratorFactory.java:72)
        at org.apache.cassandra.db.ColumnFamilyStore.getSequentialIterator(ColumnFamilyStore.java:1432)
        at org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1484)
        at org.apache.cassandra.service.RangeSliceVerbHandler.executeLocally(RangeSliceVerbHandler.java:46)
        at org.apache.cassandra.service.RangeSliceVerbHandler.doVerb(RangeSliceVerbHandler.java:58)
        ... 4 more
Caused by: java.io.FileNotFoundException: /raid0/cassandra/data/linkcurrent_search/content_items/linkcurrent_search-content_items-ic-1803-Data.db (Too many open files)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
        at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:67)
        at org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:75)
        at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:54)
        ... 12 more
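The `error=24, Too many open files` at the top of the trace is the underlying failure; the decommission is a side effect of the disk health check being unable to fork `df`. A quick way to confirm file-descriptor exhaustion on Linux is to compare a process's open descriptor count against its `nofile` limit. A minimal sketch (it inspects the current shell for illustration; in practice substitute the DSE/Cassandra PID, e.g. from `pgrep -f CassandraDaemon`):

```shell
# Compare a process's open file-descriptor count against its soft nofile limit.
# Using the current shell's PID ($$) as a stand-in for the Cassandra process.
pid=$$
fd_count=$(ls "/proc/$pid/fd" | wc -l)
fd_limit=$(ulimit -n)
echo "open fds: $fd_count / soft limit: $fd_limit"
```

If the count is at or near the limit, raising the limit for the Cassandra user (and restarting the service) addresses the root cause independently of the health-check behavior discussed below.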

Comments:

    Tags: cassandra datastax-enterprise


    Solution 1:

    Thanks for catching this. We will disable this feature and leave it to other disk-monitoring tools to alert administrators when the disk is close to full, so that administrators can take action themselves.

    Discussion:

    • As of DSE 4.0, you can set health_check_interval: 0 to disable it.
    Solution 2:

    If you haven't already, you may want to set

    health_check_interval: 0

    in your dse.yaml file to disable this check for now.
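For reference, the setting would appear in dse.yaml roughly as below. Only the `health_check_interval` key comes from this answer; the comment describes the behavior reported in the question, and the exact placement within the file may differ between DSE versions:

```yaml
# dse.yaml (DSE 4.0+): setting the interval to 0 disables the periodic
# disk health check that can remove the node from the ring when it
# believes the disk is close to full.
health_check_interval: 0
```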

    Discussion:

    • This option only seems to be available in DSE 4.
    • As of DSE 4.0, you can set health_check_interval: 0 to disable it.