【Question title】: Resetting heartbeat timestamps because of huge system clock jump
【Posted】: 2021-09-05 04:42:15
【Question description】:

I am running a Hazelcast application, but after leaving my machine in sleep mode (or logged out) for a while, I get the following errors.

2021-06-21 14:41:07.854  INFO 8288 --- [cached.thread-7] c.h.i.c.impl.ClusterHeartbeatManager     : [192.168.181.51]:5701 [APP] [4.2] System clock apparently jumped from 2021-06-21 14:10:28.569 to 2021-06-21 14:41:07.832 since last heartbeat (+1834263 ms)
2021-06-21 14:41:07.855  INFO 8288 --- [cached.thread-9] c.h.i.server.tcp.TcpServerConnection     : [192.168.181.51]:5701 [APP] [4.2] Connection[id=1, /127.0.0.1:5701->/127.0.0.1:5702, qualifier=null, endpoint=[127.0.0.1]:5702, alive=false, connectionType=JVM, planeIndex=-1] closed. Reason: Client heartbeat is timed out, closing connection to Connection[id=1, /127.0.0.1:5701->/127.0.0.1:5702, qualifier=null, endpoint=[127.0.0.1]:5702, alive=true, connectionType=JVM, planeIndex=-1]. Now: 2021-06-21 14:41:07.833. LastTimePacketReceived: 2021-06-21 14:10:29.314
2021-06-21 14:41:07.915  WARN 8288 --- [cached.thread-7] c.h.i.c.impl.ClusterHeartbeatManager     : [192.168.181.51]:5701 [APP] [4.2] Resetting heartbeat timestamps because of huge system clock jump! Clock-Jump: 1834263 ms, Heartbeat-Timeout: 60000 ms
2021-06-21 14:41:08.208  WARN 8288 --- [onMonitorThread] c.h.s.i.o.impl.InvocationMonitor         : [192.168.181.51]:5701 [APP] [4.2] MonitorInvocationsTask delayed 1836451 ms
2021-06-21 14:41:08.213  WARN 8288 --- [onMonitorThread] c.h.s.i.o.impl.InvocationMonitor         : [192.168.181.51]:5701 [APP] [4.2] BroadcastOperationControlTask delayed 1834623 ms
2021-06-21 14:41:08.539  INFO 8288 --- [cached.thread-9] c.h.i.server.tcp.TcpServerConnection     : [192.168.181.51]:5701 [APP] [4.2] Connection[id=2, /127.0.0.1:5701->/127.0.0.1:5703, qualifier=null, endpoint=[127.0.0.1]:5703, alive=false, connectionType=JVM, planeIndex=-1] closed. Reason: Client heartbeat is timed out, closing connection to Connection[id=2, /127.0.0.1:5701->/127.0.0.1:5703, qualifier=null, endpoint=[127.0.0.1]:5703, alive=true, connectionType=JVM, planeIndex=-1]. Now: 2021-06-21 14:41:08.539. LastTimePacketReceived: 2021-06-21 14:10:29.949
2021-06-21 14:41:08.551  WARN 8288 --- [ached.thread-36] c.h.i.cluster.impl.MulticastService      : [192.168.181.51]:5701 [APP] [4.2] Sending multicast datagram failed. Exception message saying the operation is not permitted usually means the underlying OS is not able to send packets at a given pace. It can be caused by starting several hazelcast members in parallel when the members send their join message nearly at the same time.

java.net.NoRouteToHostException: No route to host: Datagram send failed
        at java.net.TwoStacksPlainDatagramSocketImpl.send(Native Method) ~[na:1.8.0_251]
        at java.net.DatagramSocket.send(Unknown Source) ~[na:1.8.0_251]
        at com.hazelcast.internal.cluster.impl.MulticastService.send(MulticastService.java:291) ~[hazelcast-all-4.2.jar!/:4.2]
        at com.hazelcast.internal.cluster.impl.MulticastJoiner.searchForOtherClusters(MulticastJoiner.java:113) [hazelcast-all-4.2.jar!/:4.2]
        at com.hazelcast.internal.cluster.impl.SplitBrainHandler.searchForOtherClusters(SplitBrainHandler.java:75) [hazelcast-all-4.2.jar!/:4.2]
        at com.hazelcast.internal.cluster.impl.SplitBrainHandler.run(SplitBrainHandler.java:42) [hazelcast-all-4.2.jar!/:4.2]
        at com.hazelcast.spi.impl.executionservice.impl.DelegateAndSkipOnConcurrentExecutionDecorator$DelegateDecorator.run(DelegateAndSkipOnConcurrentExecutionDecorator.java:77) [hazelcast-all-4.2.jar!/:4.2]
        at com.hazelcast.internal.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:217) [hazelcast-all-4.2.jar!/:4.2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [na:1.8.0_251]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [na:1.8.0_251]
        at java.lang.Thread.run(Unknown Source) [na:1.8.0_251]
        at com.hazelcast.internal.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:76) [hazelcast-all-4.2.jar!/:4.2]
        at com.hazelcast.internal.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:102) [hazelcast-all-4.2.jar!/:4.2]

My client configuration is as follows:

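import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.client.config.ClientConnectionStrategyConfig;
import com.hazelcast.client.config.ConnectionRetryConfig;
import com.hazelcast.core.HazelcastInstance;

ClientConfig clientConfig = new ClientConfig();
clientConfig.setClusterName("abc");
clientConfig.getNetworkConfig().addAddress("localhost");
clientConfig.getNetworkConfig().setSmartRouting(true);
// Local (client-side) ports the client may use for its outgoing connections
clientConfig.getNetworkConfig().addOutboundPortDefinition("5701-5720");

// Retry behaviour used when the client connects or reconnects to the cluster
ClientConnectionStrategyConfig connectionStrategyConfig = clientConfig.getConnectionStrategyConfig();
ConnectionRetryConfig connectionRetryConfig = connectionStrategyConfig.getConnectionRetryConfig();
connectionRetryConfig.setInitialBackoffMillis(1000)
                     .setMaxBackoffMillis(60000)
                     .setMultiplier(2)
                     .setClusterConnectTimeoutMillis(1000) // overall time budget for connecting: 1 s
                     .setJitter(0.2);

// newHazelcastClient returns a HazelcastInstance, not a HazelcastClient
HazelcastInstance hc = HazelcastClient.newHazelcastClient(clientConfig);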

Please tell me what I have misconfigured, or why this is happening.
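To make the retry settings above easier to reason about, here is a minimal sketch of exponential backoff bounded by an overall time budget. It is a simplified model and not Hazelcast's actual implementation: the class `BackoffSketch` and the method `waits` are hypothetical names, jitter is left out so the result is deterministic, and connection attempts are assumed to take no time.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, deterministic model of exponential backoff with an overall timeout.
// NOT Hazelcast code: names are hypothetical and jitter is intentionally omitted.
final class BackoffSketch {

    // Returns the waits (in ms) between attempts until the time budget is used up.
    static List<Long> waits(long initialMs, long maxMs, double multiplier, long timeoutMs) {
        if (initialMs <= 0 || maxMs <= 0 || multiplier < 1.0) {
            throw new IllegalArgumentException("backoff values must be positive, multiplier >= 1");
        }
        List<Long> result = new ArrayList<>();
        long elapsed = 0;
        long backoff = initialMs;
        while (elapsed < timeoutMs) {
            // Never wait past the end of the overall budget.
            long wait = Math.min(backoff, timeoutMs - elapsed);
            result.add(wait);
            elapsed += wait;
            // Grow the backoff, capped at the configured maximum.
            backoff = Math.min((long) (backoff * multiplier), maxMs);
        }
        return result;
    }

    public static void main(String[] args) {
        // Values from the question: 1 s initial, 60 s max, multiplier 2, 1 s budget.
        System.out.println(waits(1000, 60000, 2.0, 1000));
        // Same backoff with a hypothetical 5 minute budget, for comparison.
        System.out.println(waits(1000, 60000, 2.0, 300000));
    }
}
```

Under this model, a 1000 ms budget is exhausted after a single 1000 ms wait, so the 60000 ms maximum backoff is never reached. A client that has been asleep for half an hour would therefore have very little time to find the cluster again before giving up.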

【Comments】:

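  • The stack trace is unrelated to the clock-jump warning. A clock jump is expected after the computer wakes from sleep. The other problem is most likely due to how your OS (re)initializes network interfaces after sleep. Which OS is it?
  • @JaromirHamala, it's Windows. How can we prevent it from stopping?
  • I also don't understand: if the server and client are already connected, and they should keep running in the background when the machine is logged out or asleep, how can they get disconnected, since the processes are still running, right? Or am I misunderstanding something?
  • There are different levels of sleep, depending on the exact OS version, your hardware capabilities, and so on. Whatever your exact setup, the whole point of sleep is to save power. That means the CPU is usually in an ultra-low-performance mode, processes are often frozen, etc. When the system running the Hazelcast client is asleep, from the Hazelcast server's point of view the client looks frozen and never responds, so disconnecting the client is not unexpected.
  • This is Windows 10, and I'm somewhat confused. I have a Spring Boot application that remote clients use, but when the client machine goes into logout/sleep mode this whole feature stops working, while the rest of the application works fine. Is there any way to prevent this? After it happens we have to restart the whole server, which is not good for us. Please help.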
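The clock-jump behaviour described in the comments can be sketched in a few lines. This is an illustrative model only, not Hazelcast's `ClusterHeartbeatManager` code: the class `ClockJumpSketch` and its methods are hypothetical. The 60000 ms timeout is taken from the log above, and the 5000 ms heartbeat interval is an assumption that happens to match the arithmetic in the log (1839263 ms of wall-clock time reported as a +1834263 ms jump).

```java
// Illustrative model of clock-jump detection in a heartbeat loop.
// NOT Hazelcast code: class and method names are hypothetical.
final class ClockJumpSketch {
    private final long heartbeatIntervalMs;
    private final long heartbeatTimeoutMs;
    private long lastTickMs;
    private long lastPeerHeartbeatMs;

    ClockJumpSketch(long heartbeatIntervalMs, long heartbeatTimeoutMs, long nowMs) {
        this.heartbeatIntervalMs = heartbeatIntervalMs;
        this.heartbeatTimeoutMs = heartbeatTimeoutMs;
        this.lastTickMs = nowMs;
        this.lastPeerHeartbeatMs = nowMs;
    }

    // Called once per heartbeat interval with the current wall-clock time.
    // Returns the detected jump in ms, or 0 if it is smaller than the timeout.
    long onTick(long nowMs) {
        // Time that passed beyond the one interval we expected to sleep.
        long jump = nowMs - lastTickMs - heartbeatIntervalMs;
        lastTickMs = nowMs;
        if (Math.abs(jump) >= heartbeatTimeoutMs) {
            // This process was frozen (e.g. the machine slept), so the silence
            // says nothing about the peer: give it a fresh timestamp.
            lastPeerHeartbeatMs = nowMs;
            return jump;
        }
        return 0;
    }

    // Records a heartbeat received from the peer.
    void onPeerHeartbeat(long nowMs) {
        lastPeerHeartbeatMs = nowMs;
    }

    // True if the peer has been silent for longer than the timeout.
    boolean peerTimedOut(long nowMs) {
        return nowMs - lastPeerHeartbeatMs > heartbeatTimeoutMs;
    }

    public static void main(String[] args) {
        ClockJumpSketch sketch = new ClockJumpSketch(5000, 60000, 0);
        long afterSleep = 1839263; // wall-clock gap taken from the log above
        System.out.println("jump = " + sketch.onTick(afterSleep) + " ms");
        System.out.println("peer timed out = " + sketch.peerTimedOut(afterSleep));
    }
}
```

The reset only protects the side that was asleep from wrongly suspecting its peers. The other side kept running and saw no packets for half an hour, which is consistent with the "Client heartbeat is timed out" lines in the log.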

Tags: java hazelcast hazelcast-imap


【Solution 1】:

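This is a common problem for socket-based applications. Ideally, you disable sleep/power-saving mode. You can try the SystemParametersInfo API: SystemParametersInfo(SPI_SETPOWEROFFACTIVE, 0, NULL, 0);

However, doing this from within the application is usually considered bad behavior: you should instead disable the power-off feature while the application is being installed, and thereby ask the user for permission.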
