【Question title】: How to reliably create and detect a thread deadlock
【Posted】: 2023-03-08 01:16:02
【Question】:

I have a method in a utility class that is supposed to detect the presence of a deadlock at runtime:

/**
 * Returns a list of thread IDs that are in a deadlock
 * @return the IDs or <code>null</code> if there is no
 * deadlock in the system
 */
public static String[] getDeadlockedThreads() {
    ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
    long[] vals = threadBean.findDeadlockedThreads();
    if (vals == null){
        return null;
    }
    String[] ret = new String[vals.length];
    for (int i = 0; i < ret.length; i++){
        ret[i] = Long.toString(vals[i]);
    }
    return ret;
}
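
When a test like the one below fails, bare thread IDs say little about what each thread was doing. As a minimal sketch (the class name `DeadlockReport` and the method `describeDeadlockedThreads` are made up for illustration and are not part of the original utility class), the same `ThreadMXBean` can be asked for a `ThreadInfo` per ID, which exposes the thread name, its state and the lock it is waiting for:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class DeadlockReport {

    /**
     * Hypothetical helper: returns one descriptive line per deadlocked thread,
     * or an empty list if the JVM reports no deadlock.
     */
    public static List<String> describeDeadlockedThreads() {
        ThreadMXBean threadBean = ManagementFactory.getThreadMXBean();
        long[] ids = threadBean.findDeadlockedThreads();
        List<String> lines = new ArrayList<>();
        if (ids == null) {
            return lines; // no deadlock reported at this moment
        }
        for (ThreadInfo info : threadBean.getThreadInfo(ids)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            lines.add(info.getThreadName()
                    + " [" + info.getThreadState() + "]"
                    + " waiting for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(describeDeadlockedThreads());
    }
}
```

Logging these lines just before the assertions would show whether a thread is actually blocked on a monitor or still sitting in a timed wait.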

I created a JUnit test for this functionality. It works fine on Windows, but on a Linux system it fails in 8 out of 10 runs. This is my test code:

/**
 * Tests the correct functionality of the get deadlock info functionality
 * 
 * @throws Exception Will be thrown if there was an error
 *             while performing the test
 */
public void testGetDeadlockInformation() throws Exception {
    assertNull("check non-existence of deadlock", ThreadUtils.getDeadlockedThreads());

    final String monitor1 = "Monitor1";
    final String monitor2 = "Monitor2";

    Thread[] retThreads = createDeadlock(monitor1, monitor2, this);

    String[] res = ThreadUtils.getDeadlockedThreads();
    assertNotNull("check existence of returned deadlock info", res);
    assertEquals("check length of deadlock array", 2, res.length);

    retThreads[0].interrupt();
    retThreads[0].interrupt();
    Thread.sleep(100);

    res = ThreadUtils.getDeadlockedThreads();
    assertNotNull("check existence of returned deadlock info", res);
    assertEquals("check length of deadlock array", 2, res.length);
}

/**
 * Creates a deadlock
 * 
 * @param monitor1 monitor 1 that will be used for synchronization
 * @param monitor2 monitor 2 that will be used for synchronization
 * @param waitMonitor The monitor to be used for internal synchronization
 * @return The threads that should be deadlocked
 * @throws InterruptedException Will be thrown if there was an error
 *             while setting up the deadlock
 */
public static Thread[] createDeadlock(final String monitor1, final String monitor2, Object waitMonitor) throws InterruptedException {
    DeadlockThread dt1 = new DeadlockThread(monitor1, monitor2, waitMonitor);
    DeadlockThread dt2 = new DeadlockThread(monitor2, monitor1, waitMonitor);
    DeadlockThread[] retThreads = new DeadlockThread[] {
            dt1,
            dt2,
    };

    synchronized (waitMonitor) {
        dt1.start();
        waitMonitor.wait(1000);
        dt2.start();
        waitMonitor.wait(1000);
    }
    synchronized (monitor1) {
        synchronized (monitor2) {
            monitor1.notifyAll();
            monitor2.notifyAll();
        }
    }
    Thread.sleep(4000);
    return retThreads;
}

private static class DeadlockThread extends Thread {
    private String monitor1;
    private String monitor2;
    private Object waitMonitor;

    public DeadlockThread(String monitor1, String monitor2, Object waitMonitor) {
        this.monitor1 = monitor1;
        this.monitor2 = monitor2;
        this.waitMonitor = waitMonitor;
        setDaemon(true);
        setName("DeadlockThread for monitor " + monitor1 + " and " + monitor2);
    }

    @Override
    public void run() {
        System.out.println(getName() + ": Running");
        synchronized (monitor1) {
            System.out.println(getName() + ": Got lock for monitor '" + monitor1 + "'");
            synchronized (waitMonitor) {
                waitMonitor.notifyAll();
            }
            try {
                System.out.println(getName() + ": Waiting to get lock on '" + monitor2 + "'");
                monitor1.wait(5000);
                System.out.println(getName() + ": Try to get lock on '" + monitor2 + "'");
                synchronized (monitor2) {
                    monitor2.wait(5000);
                }
                System.out.println(getName() + ": Got lock on '" + monitor2 + "', finished");
            } catch (Exception e) {
                // waiting
            }
        }
    }
}

This is the output when running the test case:

DeadlockThread for monitor Monitor1 and Monitor2: Running
DeadlockThread for monitor Monitor1 and Monitor2: Got lock for monitor 'Monitor1'
DeadlockThread for monitor Monitor1 and Monitor2: Waiting to get lock on 'Monitor2'
DeadlockThread for monitor Monitor2 and Monitor1: Running
DeadlockThread for monitor Monitor2 and Monitor1: Got lock for monitor 'Monitor2'
DeadlockThread for monitor Monitor2 and Monitor1: Waiting to get lock on 'Monitor1'
DeadlockThread for monitor Monitor1 and Monitor2: Try to get lock on 'Monitor2'
DeadlockThread for monitor Monitor2 and Monitor1: Try to get lock on 'Monitor1'

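According to the output there should be a deadlock, so either the way I try to detect the deadlock is wrong, or I am missing something else that does not work the way I expect. But in that case the test should fail every time, not just most of the time.

When the test is run on Windows, the output is the same.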

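【Question discussion】:

  • Try increasing Thread.sleep(4000) and see if that helps.
  • @NyamiouTheGaleanthrope The 4000 is already the result of that attempt. It helped on MacOS (after increasing it to 1000), and I assume it could be extended further, but I consider that a "last resort" and would prefer a better solution if there is one.
  • Just a guess: it might have to do with monitor1 and monitor2 being Strings.
  • @AndrewS That shouldn't be a problem. Synchronization happens on the references that are passed in, so interned and non-interned references cannot get mixed up. If that were the case I would expect different output, because both threads would report that they got the lock on the second monitor, and that does not happen.

Tags: java junit cross-platform deadlock mxbean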


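【Solution 1】:

Just a guess. Your use of Thread.sleep() looks very suspicious. Try using some form of explicit communication to determine that both threads are ready to deadlock.

Untested: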

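   import java.util.concurrent.CountDownLatch;

   private Thread[] createDeadlock() throws InterruptedException {
      Thread[] deadLocked = new Thread [2];
      // gate: neither thread continues until both hold their first monitor
      CountDownLatch gate = new CountDownLatch( 2 );
      // ready: tells the caller that both threads are about to block
      CountDownLatch ready = new CountDownLatch( 2 );
      Object monitor1 = new Object();
      Object monitor2 = new Object();
      Runnable r1 = () -> {
         synchronized( monitor1 ) {
            try {
               gate.countDown();
               gate.await();
               ready.countDown();
               synchronized( monitor2 ) {
                  // not reached once deadlocked; wait on the monitor actually held
                  monitor2.wait();
               }
            } catch( InterruptedException ex ) {
               // exit
            }
         }
      };
      Runnable r2 = () -> {
         synchronized( monitor2 ) {
            try {
               gate.countDown();
               gate.await();
               ready.countDown();
               synchronized( monitor1 ) {
                  // not reached once deadlocked; wait on the monitor actually held
                  monitor1.wait();
               }
            } catch( InterruptedException ex ) {
               // exit
            }
         }
      };

      deadLocked[0] = new Thread( r1 );
      deadLocked[1] = new Thread( r2 );
      deadLocked[0].start();
      deadLocked[1].start();
      // returns shortly before the threads block, so the caller may still need to poll
      ready.await();
      return deadLocked;
   }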
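
Even with latches there is a short window between `ready.countDown()` and the moment both threads are really blocked on the second monitor, so a single call to `findDeadlockedThreads()` right after `ready.await()` can still come back `null`. One way to avoid a fixed `Thread.sleep()` is to poll with a timeout. The following is only a sketch under that assumption; the class name `DeadlockAwait` and its methods are hypothetical and not taken from the question or the answer:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class DeadlockAwait {

    /** Polls until a deadlock is reported or the timeout expires; returns null on timeout. */
    public static long[] awaitDeadlock(long timeoutMillis) throws InterruptedException {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            long[] ids = bean.findDeadlockedThreads();
            if (ids != null || System.nanoTime() - deadline >= 0) {
                return ids;
            }
            Thread.sleep(10); // short pause between polls
        }
    }

    /** Starts two daemon threads that take two monitors in opposite order. */
    public static Thread[] startDeadlock() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread t1 = new Thread(() -> lockInOrder(a, b, bothHoldFirst), "deadlock-1");
        Thread t2 = new Thread(() -> lockInOrder(b, a, bothHoldFirst), "deadlock-2");
        t1.setDaemon(true); // daemon threads do not keep the JVM alive
        t2.setDaemon(true);
        t1.start();
        t2.start();
        return new Thread[] {t1, t2};
    }

    private static void lockInOrder(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await(); // 'first' stays locked while waiting on the latch
            } catch (InterruptedException e) {
                return;
            }
            synchronized (second) {
                // not reached: the other thread holds 'second' and waits for 'first'
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        startDeadlock();
        long[] ids = awaitDeadlock(5000);
        System.out.println(ids == null ? "no deadlock detected" : ids.length + " threads deadlocked");
    }
}
```

In a test, `assertNotNull(DeadlockAwait.awaitDeadlock(5000))` would then replace the combination of a fixed sleep and a one-shot check. The timeout only bounds the failure case; when the deadlock forms quickly, the poll returns quickly.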

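【Discussion】:

  • Thanks for your answer. The final sleep happens after the deadlock should already be in place, to make sure ThreadMXBean has a chance to notice it. What you suggest already happens in the run method of the deadlock threads: they wait on each monitor, and the createDeadlock method calls notifyAll on them.
  • You do realize that if you wait() you release the lock, right?
  • Yes, but that is already covered here. Judging by the output there should be a deadlock, because neither thread prints the second "Got lock on [monitorname]" line, which is what I would expect if that part of the code were at fault.
  • But I don't think it is reliable. waitMonitor really does not do anything. You hope that a thread gets scheduled and runs, but you cannot be sure. So you have to use something more explicit to make sure that a thread (both threads) has reached a certain point in the code.
  • waitMonitor is used by the threads to tell the calling method that they are ready to deadlock (they call waitMonitor.notifyAll after locking the first monitor).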
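
The remark about wait() releasing the lock can be checked with a small experiment. The sketch below (class and method names are hypothetical) lets one thread call a timed `wait()` on a monitor it holds and then has the caller enter the same monitor. As far as the `ThreadMXBean` documentation describes it, deadlock detection covers threads blocked on entering or re-entering a monitor, so a thread that is still inside a timed wait would presumably not be reported yet, even though the console output of the test looks the same:

```java
import java.util.concurrent.CountDownLatch;

public class WaitReleasesMonitor {

    /**
     * Returns true if the caller could enter the monitor well before the
     * other thread's 10 second timed wait would have expired.
     */
    public static boolean monitorFreeDuringWait() throws InterruptedException {
        Object monitor = new Object();
        CountDownLatch holding = new CountDownLatch(1);
        Thread waiter = new Thread(() -> {
            synchronized (monitor) {
                holding.countDown();
                try {
                    monitor.wait(10_000); // releases 'monitor' until notified or timed out
                } catch (InterruptedException e) {
                    // exit
                }
            }
        }, "waiter");
        waiter.setDaemon(true);
        waiter.start();

        holding.await(); // the waiter now owns the monitor
        long start = System.nanoTime();
        synchronized (monitor) { // can only be entered once wait() has released it
            monitor.notifyAll(); // wake the waiter so it can finish
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        waiter.join(5000);
        return elapsedMillis < 5000 && !waiter.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("monitor free during wait: " + monitorFreeDuringWait());
    }
}
```

Applied to the question's DeadlockThread, this suggests one possible interleaving worth ruling out: if one thread acquires the second monitor before the other has re-acquired it, that thread ends up in `monitor2.wait(5000)` rather than blocked, and the 4000 ms sleep in createDeadlock ends before that wait times out.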