Understanding the Spring Shutdown Hook Deadlock

The Spring Shutdown Hook Deadlock

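On one occasion a project hit an error, yet Spring did not stop normally. The cause turned out to be a deadlock triggered by Spring's shutdown handling.

A certain framework had code similar to the following embedded in it: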

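    import org.springframework.context.ApplicationListener;
    import org.springframework.context.event.ContextRefreshedEvent;
    import org.springframework.stereotype.Component;

    @Component
    public class ShutDownHookTest implements ApplicationListener<ContextRefreshedEvent> {
        // Placeholder flag: the original snippet does not show how it is set.
        private boolean onException;

        @Override
        public void onApplicationEvent(ContextRefreshedEvent event) {
            if (onException) {
                System.out.println("test shutdown hook deadlock");
                // Runs synchronously on the thread that is still inside refresh().
                System.exit(0);
            }
        }
    }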

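The intent is to make sure the application exits, via System.exit, once an error has occurred.

The listener does not use asynchronous events, so System.exit ran on the main thread. Even so, the Spring Boot server kept running.

The application also looked healthy: ours is a Dubbo-based service system, and the service kept working normally in the test environment.

This is clearly not the expected behavior. Normally a System.exit call triggers the JVM shutdown hooks, through which Spring notices the exit and runs its shutdown handling. First, look at how Spring registers its shutdown hook: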

    public abstract class AbstractApplicationContext extends DefaultResourceLoader
            implements ConfigurableApplicationContext {

        @Override
        public void registerShutdownHook() {
            if (this.shutdownHook == null) {
                // No shutdown hook registered yet.
                this.shutdownHook = new Thread(SHUTDOWN_HOOK_THREAD_NAME) {
                    @Override
                    public void run() {
                        // Key point: the hook acquires the startupShutdownMonitor monitor lock here
                        synchronized (startupShutdownMonitor) {
                            doClose();
                        }
                    }
                };
                Runtime.getRuntime().addShutdownHook(this.shutdownHook);
            }
        }

        protected void doClose() {
            // Check whether an actual close attempt is necessary...
            if (this.active.get() && this.closed.compareAndSet(false, true)) {
                if (logger.isDebugEnabled()) {
                    logger.debug("Closing " + this);
                }
                if (!NativeDetector.inNativeImage()) {
                    LiveBeansView.unregisterApplicationContext(this);
                }
                try {
                    // Publish shutdown event.
                    publishEvent(new ContextClosedEvent(this));
                }
                catch (Throwable ex) {
                    logger.warn("Exception thrown from ApplicationListener handling ContextClosedEvent", ex);
                }
                // Stop all Lifecycle beans, to avoid delays during individual destruction.
                if (this.lifecycleProcessor != null) {
                    try {
                        this.lifecycleProcessor.onClose();
                    }
                    catch (Throwable ex) {
                        logger.warn("Exception thrown from LifecycleProcessor on context close", ex);
                    }
                }
                // Destroy all cached singletons in the context's BeanFactory.
                destroyBeans();
                // Close the state of this context itself.
                closeBeanFactory();
                // Let subclasses do some final clean-up if they wish...
                onClose();
                // Reset local application listeners to pre-refresh state.
                if (this.earlyApplicationListeners != null) {
                    this.applicationListeners.clear();
                    this.applicationListeners.addAll(this.earlyApplicationListeners);
                }
                // Switch to inactive.
                this.active.set(false);
            }
        }
    }

In other words, Spring creates a new thread and registers it with the JVM as a shutdown hook.
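For readers less familiar with JVM shutdown hooks, here is a minimal sketch of the same registration mechanism outside Spring. It only uses Runtime.addShutdownHook and Runtime.removeShutdownHook from the JDK; the class name ShutdownHookRegistration, the helper registerAndRemove and the thread name DemoShutdownHook are made up for illustration.

```java
final class ShutdownHookRegistration {

    /** Registers a hook, removes it again, and returns what removeShutdownHook reported. */
    static boolean registerAndRemove() {
        // A shutdown hook is a Thread that has been created but not started.
        Thread hook = new Thread(
                () -> System.out.println("cleaning up before the JVM exits"),
                "DemoShutdownHook"); // hypothetical thread name
        Runtime.getRuntime().addShutdownHook(hook);
        // removeShutdownHook returns true if the hook had been registered.
        return Runtime.getRuntime().removeShutdownHook(hook);
    }

    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(new Thread(
                () -> System.out.println("hook running on " + Thread.currentThread().getName()),
                "DemoShutdownHook"));
        System.out.println("main is about to return");
        // Once main returns (or System.exit is called), the JVM starts the hook thread.
    }
}
```

The JVM, not the application, starts the hook thread during shutdown, which is why the thread that calls System.exit ends up waiting for it.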

The key point is that before closing, the hook must acquire the monitor lock of startupShutdownMonitor. This lock looks familiar: Spring acquires the same lock in refresh().

    public abstract class AbstractApplicationContext extends DefaultResourceLoader
            implements ConfigurableApplicationContext {

        @Override
        public void refresh() throws BeansException, IllegalStateException {
            synchronized (this.startupShutdownMonitor) {
                StartupStep contextRefresh = this.applicationStartup.start("spring.context.refresh");
                // Prepare this context for refreshing.
                prepareRefresh();
                // Tell the subclass to refresh the internal bean factory.
                ConfigurableListableBeanFactory beanFactory = obtainFreshBeanFactory();
                // Prepare the bean factory for use in this context.
                prepareBeanFactory(beanFactory);
                // ...
            }
        }
    }

At this point our guess was a deadlock on startupShutdownMonitor.

Take a thread dump with jstack to check:


    "SpringContextShutdownHook" #18 prio=5 os_prio=0 tid=0x0000000024e00800 nid=0x407c waiting for monitor entry [0x000000002921f000]
       java.lang.Thread.State: BLOCKED (on object monitor)
        at org.springframework.context.support.AbstractApplicationContext$1.run(AbstractApplicationContext.java:991)
        - waiting to lock <0x00000006c494f430> (a java.lang.Object)

    "main" #1 prio=5 os_prio=0 tid=0x0000000002de4000 nid=0x1ff4 in Object.wait() [0x0000000002dde000]
       java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000006c4a43118> (a org.springframework.context.support.AbstractApplicationContext$1)
        at java.lang.Thread.join(Thread.java:1252)
        - locked <0x00000006c4a43118> (a org.springframework.context.support.AbstractApplicationContext$1)
        at java.lang.Thread.join(Thread.java:1326)
        at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:107)
        at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
        at java.lang.Shutdown.runHooks(Shutdown.java:123)
        at java.lang.Shutdown.sequence(Shutdown.java:167)
        at java.lang.Shutdown.exit(Shutdown.java:212)
        - locked <0x00000006c4845128> (a java.lang.Class for java.lang.Shutdown)
        at java.lang.Runtime.exit(Runtime.java:109)
        at java.lang.System.exit(System.java:971)
        at io.seata.server.ShutDownHookTest.onApplicationEvent(ShutDownHookTest.java:12)
        at io.seata.server.ShutDownHookTest.onApplicationEvent(ShutDownHookTest.java:7)
        at org.springframework.context.event.SimpleApplicationEventMulticaster.doInvokeListener(SimpleApplicationEventMulticaster.java:176)
        at org.springframework.context.event.SimpleApplicationEventMulticaster.invokeListener(SimpleApplicationEventMulticaster.java:169)
        at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:143)
        at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:421)
        at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:378)
        at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:938)
        at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:586)
        - locked <0x00000006c494f430> (a java.lang.Object)
        at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:144)
        at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:771)
        at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:763)

At first glance jstack does not report a deadlock (tools such as jvisualvm and jconsole do not either), but the thread stacks show:

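  • The main thread acquired the startupShutdownMonitor lock <0x00000006c494f430> first
  • The SpringContextShutdownHook thread is waiting for the startupShutdownMonitor lock
  • The main thread called Thread.join and is waiting on <0x00000006c4a43118>, the hook thread object, for that thread to finish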
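A likely reason the tools stay silent is that main is parked in Object.wait() inside Thread.join rather than blocked on a monitor, so there is no cycle of lock owners for a detector to find. The sketch below reproduces the same wait pattern without Spring and asks the JVM's own detector through ThreadMXBean.findMonitorDeadlockedThreads. The class name JoinWaitDetectionDemo and the thread names DemoMain and DemoShutdownHook are made up, and the demo interrupts the waiting thread at the end so that it terminates, which the real System.exit path would not allow.

```java
import java.lang.management.ManagementFactory;

final class JoinWaitDetectionDemo {

    /** Returns true if both threads are stuck and the JVM detector still reports no deadlock. */
    static boolean detectorMissesJoinWait() {
        final Object monitor = new Object(); // stands in for startupShutdownMonitor
        final Thread hook = new Thread(() -> {
            synchronized (monitor) {
                // doClose() would run here
            }
        }, "DemoShutdownHook");
        final Thread starter = new Thread(() -> {
            synchronized (monitor) {          // like refresh(): this thread owns the monitor
                hook.start();
                try {
                    hook.join();              // unbounded wait, as in runHooks()
                } catch (InterruptedException e) {
                    // The demo interrupts this thread to end the stalemate.
                }
            }
        }, "DemoMain");
        try {
            starter.start();
            // Poll until both threads have reached the stuck state (at most 5 seconds).
            long deadline = System.nanoTime() + 5_000_000_000L;
            while ((hook.getState() != Thread.State.BLOCKED
                    || starter.getState() != Thread.State.WAITING)
                    && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            Thread.State hookState = hook.getState();
            Thread.State starterState = starter.getState();
            // null means the detector found no cycle of threads blocked on monitors.
            long[] deadlocked = ManagementFactory.getThreadMXBean().findMonitorDeadlockedThreads();

            starter.interrupt();              // release the stalemate so the demo can finish
            starter.join();
            hook.join();

            System.out.println("hook state: " + hookState + ", starter state: " + starterState);
            System.out.println("detector reported: "
                    + (deadlocked == null ? "no deadlock" : deadlocked.length + " threads"));
            return hookState == Thread.State.BLOCKED
                    && starterState == Thread.State.WAITING
                    && deadlocked == null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        detectorMissesJoinWait();
    }
}
```

The two recorded states match the dump above: the hook is BLOCKED on the monitor while the thread that started it is WAITING in join.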

The root cause is that the main thread is blocked inside System.exit. Tracing further down the call chain shows that it is blocked in ApplicationShutdownHooks:

    class ApplicationShutdownHooks {
        /* Iterates over all application hooks creating a new thread for each
         * to run in. Hooks are run concurrently and this method waits for
         * them to finish.
         */
        static void runHooks() {
            Collection<Thread> threads;
            synchronized(ApplicationShutdownHooks.class) {
                threads = hooks.keySet();
                hooks = null;
            }
            for (Thread hook : threads) {
                hook.start();
            }
            for (Thread hook : threads) {
                while (true) {
                    try {
                        // Wait for the shutdown hook thread to finish
                        hook.join();
                        break;
                    } catch (InterruptedException ignored) {
                    }
                }
            }
        }
    }

Summary

The full deadlock sequence:

  • main thread: acquires the startupShutdownMonitor monitor lock when Spring's refresh() starts
  • main thread: calls System.exit before refresh() has completed
  • SpringContextShutdownHook thread: starts running and waits to acquire the startupShutdownMonitor monitor lock
  • main thread: calls Thread.join and waits for the SpringContextShutdownHook thread to finish

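So System.exit must not be called while Spring has not finished refresh(), at least not synchronously on the thread that is running refresh() and therefore holding startupShutdownMonitor.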
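One way to respect this rule, offered here as an assumption rather than as the fix the original framework adopted, is to issue the exit from a separate thread so that the refreshing thread can leave refresh() and release the monitor. In the listener that would look roughly like `new Thread(() -> System.exit(0)).start()`. The sketch below compares the two variants without Spring: simulatedExit is a stand-in for System.exit that starts the hook and waits for it with a bounded join so that the demo terminates, and all class, method and thread names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class AsyncExitSketch {

    /**
     * Simulates refresh() plus an exit request and returns true if the
     * simulated shutdown hook finished while the exit was waiting for it.
     */
    static boolean shutdownCompletes(boolean exitOnSeparateThread) {
        final Object monitor = new Object();   // stands in for startupShutdownMonitor
        final AtomicBoolean hookFinished = new AtomicBoolean(false);
        final Thread hook = new Thread(() -> {
            synchronized (monitor) {
                // doClose() would run here
            }
        }, "DemoShutdownHook");

        // Stand-in for System.exit: start the hook, then wait for it (bounded here).
        Runnable simulatedExit = () -> {
            hook.start();
            try {
                hook.join(1000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            hookFinished.set(!hook.isAlive());
        };
        Thread exitThread = new Thread(simulatedExit, "DemoExit");

        try {
            synchronized (monitor) {           // "refresh" is in progress
                if (exitOnSeparateThread) {
                    exitThread.start();        // returns immediately
                } else {
                    simulatedExit.run();       // waits while still holding the monitor
                }
            }                                  // "refresh" done, monitor released
            if (exitOnSeparateThread) {
                exitThread.join();
            }
            hook.join();                       // clean up before returning
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return hookFinished.get();
    }

    public static void main(String[] args) {
        System.out.println("exit on refreshing thread completes: " + shutdownCompletes(false));
        System.out.println("exit on separate thread completes: " + shutdownCompletes(true));
    }
}
```

With the real System.exit the synchronous variant never returns, because the join in runHooks has no timeout. The asynchronous variant still relies on refresh() finishing, so it only moves the wait to the exit thread instead of removing it.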

