Understanding the Spring Shutdown Hook Deadlock
The Spring shutdown hook deadlock
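On one occasion a project hit an exception but Spring did not stop properly. The cause turned out to be a deadlock triggered by Spring's shutdown hook.
A framework we use had a piece of code embedded in it that looked roughly like this: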
- @Component
- public class ShutDownHookTest implements ApplicationListener<ContextRefreshedEvent> {
- @Override
- public void onApplicationEvent(ContextRefreshedEvent event) {
- if (onException) {
- System.out.println("test shutdown hook deadlock");
- System.exit(0);
- }
- }
- }
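The intent is to make sure the application exits once an exception has occurred, by calling System.exit.
The listener is not asynchronous, so System.exit ran on the main thread. Yet afterwards the Spring Boot server was still running.
Nothing looked wrong from the outside either: ours is a Dubbo-based service system, and the services in the test environment kept working normally.
This clearly defies expectations. Normally Spring notices a System.exit call, because it registers a JVM shutdown hook that performs the context shutdown. Let's first look at how Spring registers its shutdown hook: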
- public abstract class AbstractApplicationContext extends DefaultResourceLoader
- implements ConfigurableApplicationContext {
- @Override
- public void registerShutdownHook() {
- if (this.shutdownHook == null) {
- // No shutdown hook registered yet.
- this.shutdownHook = new Thread(SHUTDOWN_HOOK_THREAD_NAME) {
- @Override
- public void run() {
- // Key point: the hook acquires the monitor lock on startupShutdownMonitor here
- synchronized (startupShutdownMonitor) {
- doClose();
- }
- }
- };
- Runtime.getRuntime().addShutdownHook(this.shutdownHook);
- }
- }
- protected void doClose() {
- // Check whether an actual close attempt is necessary...
- if (this.active.get() && this.closed.compareAndSet(false, true)) {
- if (logger.isDebugEnabled()) {
- logger.debug("Closing " + this);
- }
- if (!NativeDetector.inNativeImage()) {
- LiveBeansView.unregisterApplicationContext(this);
- }
- try {
- // Publish shutdown event.
- publishEvent(new ContextClosedEvent(this));
- }
- catch (Throwable ex) {
- logger.warn("Exception thrown from ApplicationListener handling ContextClosedEvent", ex);
- }
- // Stop all Lifecycle beans, to avoid delays during individual destruction.
- if (this.lifecycleProcessor != null) {
- try {
- this.lifecycleProcessor.onClose();
- }
- catch (Throwable ex) {
- logger.warn("Exception thrown from LifecycleProcessor on context close", ex);
- }
- }
- // Destroy all cached singletons in the context's BeanFactory.
- destroyBeans();
- // Close the state of this context itself.
- closeBeanFactory();
- // Let subclasses do some final clean-up if they wish...
- onClose();
- // Reset local application listeners to pre-refresh state.
- if (this.earlyApplicationListeners != null) {
- this.applicationListeners.clear();
- this.applicationListeners.addAll(this.earlyApplicationListeners);
- }
- // Switch to inactive.
- this.active.set(false);
- }
- }
- }
In other words, Spring creates a new thread and registers it as a JVM shutdown hook.
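For readers less familiar with the mechanism, here is a minimal sketch of a JVM shutdown hook in plain Java, without Spring. The class name ShutdownHookBasics and the thread name DemoShutdownHook are made up for illustration; only Runtime.addShutdownHook and Runtime.removeShutdownHook are real JDK calls.

```java
// Illustrative only: class and thread names are hypothetical, not taken from Spring.
final class ShutdownHookBasics {

    /** Builds an unstarted thread; the JVM starts it itself once shutdown begins. */
    static Thread newHook() {
        return new Thread(() -> System.out.println("hook running"), "DemoShutdownHook");
    }

    /** Registers a hook and removes it again; returns true if the JVM had it registered. */
    static boolean registerAndRemove() {
        Thread hook = newHook();
        Runtime.getRuntime().addShutdownHook(hook);
        return Runtime.getRuntime().removeShutdownHook(hook);
    }

    public static void main(String[] args) {
        Runtime.getRuntime().addShutdownHook(newHook());
        System.out.println("main done");
        // When main returns (or System.exit is called), the JVM begins shutdown and starts the hook.
    }
}
```

Running main prints "main done" followed by "hook running". The hook is an ordinary thread that the JVM starts at shutdown, which is why it can block on a lock like any other thread.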
The key point is that the hook must acquire the monitor lock on startupShutdownMonitor before closing. This lock looks familiar: Spring acquires the same lock in refresh().
- public abstract class AbstractApplicationContext extends DefaultResourceLoader
- implements ConfigurableApplicationContext {
- @Override
- public void refresh() throws BeansException, IllegalStateException {
- synchronized (this.startupShutdownMonitor) {
- StartupStep contextRefresh = this.applicationStartup.start("spring.context.refresh");
- // Prepare this context for refreshing.
- prepareRefresh();
- // Tell the subclass to refresh the internal bean factory.
- ConfigurableListableBeanFactory beanFactory = obtainFreshBeanFactory();
- // Prepare the bean factory for use in this context.
- prepareBeanFactory(beanFactory);
- ……
- }
- }
- }
At this point our guess is that the threads are deadlocked on startupShutdownMonitor.
Let's take a thread dump with jstack:
"SpringContextShutdownHook" #18 prio=5 os_prio=0 tid=0x0000000024e00800 nid=0x407c waiting for monitor entry [0x000000002921f000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.springframework.context.support.AbstractApplicationContext$1.run(AbstractApplicationContext.java:991)
- waiting to lock <0x00000006c494f430> (a java.lang.Object)
"main" #1 prio=5 os_prio=0 tid=0x0000000002de4000 nid=0x1ff4 in Object.wait() [0x0000000002dde000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000006c4a43118> (a org.springframework.context.support.AbstractApplicationContext$1)
at java.lang.Thread.join(Thread.java:1252)
- locked <0x00000006c4a43118> (a org.springframework.context.support.AbstractApplicationContext$1)
at java.lang.Thread.join(Thread.java:1326)
at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:107)
at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
at java.lang.Shutdown.runHooks(Shutdown.java:123)
at java.lang.Shutdown.sequence(Shutdown.java:167)
at java.lang.Shutdown.exit(Shutdown.java:212)
- locked <0x00000006c4845128> (a java.lang.Class for java.lang.Shutdown)
at java.lang.Runtime.exit(Runtime.java:109)
at java.lang.System.exit(System.java:971)
at io.seata.server.ShutDownHookTest.onApplicationEvent(ShutDownHookTest.java:12)
at io.seata.server.ShutDownHookTest.onApplicationEvent(ShutDownHookTest.java:7)
at org.springframework.context.event.SimpleApplicationEventMulticaster.doInvokeListener(SimpleApplicationEventMulticaster.java:176)
at org.springframework.context.event.SimpleApplicationEventMulticaster.invokeListener(SimpleApplicationEventMulticaster.java:169)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:143)
at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:421)
at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:378)
at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:938)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:586)
- locked <0x00000006c494f430> (a java.lang.Object)
at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:144)
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:771)
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:763)
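At first glance jstack does not report a deadlock (tools such as jvisualvm and jconsole do not either). That is expected: the main thread is parked in Object.wait() inside Thread.join rather than blocked on a monitor, and the JVM's built-in deadlock detection only looks for cycles of threads waiting to acquire monitors or ownable synchronizers. The thread stacks nevertheless show: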
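- The main thread acquired the startupShutdownMonitor lock <0x00000006c494f430> first
- The SpringContextShutdownHook thread is waiting for the startupShutdownMonitor lock
- The main thread called Thread.join and is waiting on <0x00000006c4a43118>, which is the SpringContextShutdownHook thread object, until that thread terminates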
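The three observations above can be reproduced without Spring. The following is a sketch under stated assumptions: the monitor object, the DemoRefresher thread (playing the main thread) and the DemoShutdownHook thread are stand-ins invented for this example, and the refresher is interrupted at the end only so that the demo can terminate. It records the two thread states and asks the JVM's programmatic deadlock detector, ThreadMXBean.findDeadlockedThreads, whether it sees a deadlock.

```java
// Illustrative simulation: DemoRefresher plays the main thread, DemoShutdownHook plays Spring's hook.
final class JoinDeadlockProbe {

    /** Reproduces the stuck state, records what the JVM sees, then unblocks the threads. */
    static String probe() {
        final Object monitor = new Object(); // stands in for startupShutdownMonitor

        final Thread hook = new Thread(() -> {
            synchronized (monitor) {
                // stands in for doClose()
            }
        }, "DemoShutdownHook");

        final Thread refresher = new Thread(() -> {
            synchronized (monitor) {   // like refresh()
                hook.start();          // like ApplicationShutdownHooks.runHooks()
                try {
                    hook.join();       // cannot return: the hook needs the monitor held here
                } catch (InterruptedException e) {
                    // probe() interrupts this thread so that the demo can terminate
                }
            }
        }, "DemoRefresher");

        refresher.start();
        try {
            // Wait (at most 5 s) until both threads are stuck.
            long deadline = System.nanoTime() + 5_000_000_000L;
            while ((hook.getState() != Thread.State.BLOCKED
                    || refresher.getState() != Thread.State.WAITING)
                    && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            Thread.State hookState = hook.getState();
            Thread.State refresherState = refresher.getState();

            // Ask the JVM whether it considers either of the two threads deadlocked.
            long[] ids = java.lang.management.ManagementFactory
                    .getThreadMXBean().findDeadlockedThreads();
            boolean detected = false;
            if (ids != null) {
                for (long id : ids) {
                    detected |= id == hook.getId() || id == refresher.getId();
                }
            }

            refresher.interrupt(); // break the cycle: join() throws, the monitor is released
            refresher.join();
            hook.join();
            return "hook=" + hookState + " refresher=" + refresherState + " detected=" + detected;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```

The hook is BLOCKED on the monitor and the refresher is WAITING in join, matching the jstack output, yet the detector reports nothing because a thread in Object.wait() is not part of a lock-acquisition cycle.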
The root cause is that the main thread is stuck inside System.exit. Following the call chain down, it turns out to be blocked here in ApplicationShutdownHooks:
- class ApplicationShutdownHooks {
- /* Iterates over all application hooks creating a new thread for each
- * to run in. Hooks are run concurrently and this method waits for
- * them to finish.
- */
- static void runHooks() {
- Collection<Thread> threads;
- synchronized(ApplicationShutdownHooks.class) {
- threads = hooks.keySet();
- hooks = null;
- }
- for (Thread hook : threads) {
- hook.start();
- }
- for (Thread hook : threads) {
- while (true) {
- try {
- // Wait for the shutdown hook thread to finish
- hook.join();
- break;
- } catch (InterruptedException ignored) {
- }
- }
- }
- }
- }
Summary
The full deadlock sequence:
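- main thread: when Spring's refresh starts, it acquires the startupShutdownMonitor monitor lock
- main thread: before refresh has finished, System.exit is called
- SpringContextShutdownHook thread: starts running and waits to acquire the startupShutdownMonitor monitor lock
- main thread: calls Thread.join and waits for the SpringContextShutdownHook thread to finish, while still holding the lock that thread needs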
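So System.exit must not be triggered while Spring has not finished refresh, at least not from the thread that is running refresh and therefore holds startupShutdownMonitor.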
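One possible way around this, which the original code does not use, is to hand the exit off to a separate thread so that the refreshing thread can leave refresh and release the monitor. The sketch below simulates that idea in plain Java; the monitor, the DemoExit thread and the DemoShutdownHook thread are hypothetical stand-ins, and no real System.exit is called. In an actual listener this would roughly correspond to starting a new thread whose only job is to call System.exit, and whether that suits a given application is a separate question.

```java
// Illustrative simulation: the calling thread plays the refreshing thread; all names are hypothetical.
final class DeferredExitSketch {

    /** Returns true if the simulated hook finished once the monitor was released. */
    static boolean hookCompletes() {
        final Object monitor = new Object(); // stands in for startupShutdownMonitor
        final java.util.concurrent.atomic.AtomicBoolean closed =
                new java.util.concurrent.atomic.AtomicBoolean(false);

        final Thread hook = new Thread(() -> {
            synchronized (monitor) {
                closed.set(true);            // stands in for doClose()
            }
        }, "DemoShutdownHook");

        // Stands in for a helper thread that calls System.exit: it starts the hook and joins it.
        final Thread exiter = new Thread(() -> {
            hook.start();
            try {
                hook.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "DemoExit");

        synchronized (monitor) {             // like refresh()
            exiter.start();                  // hand the exit off instead of calling it inline
            // the rest of refresh would run here while the hook waits for the monitor
        }                                    // refresh returns and the monitor is released

        try {
            exiter.join(5_000);              // bounded wait so the sketch cannot hang
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return closed.get() && !exiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("hook completed: " + hookCompletes());
    }
}
```

Because the thread that joins the hook no longer holds the monitor, the wait is not circular: the hook only has to wait until the refreshing thread leaves its synchronized block.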