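A while ago we hit a small deadlock in production. After rolling out a new build, the service took abnormally long to start. The log just hung there with no errors, and a quick look at the thread dump showed that the SkyWalking agent thread and the main thread had deadlocked. Some brief notes below.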
Background: JDK 21, Spring Boot 3.x, SkyWalking 9.4.0
Found one Java-level deadlock:
=============================
"main":
waiting to lock monitor 0x00007fd8b588d4e0 (object 0x000000008b300030, a java.lang.Object),
which is held by "SkywalkingAgent-5-JVMService-produce-0"
"SkywalkingAgent-5-JVMService-produce-0":
waiting to lock monitor 0x00007fd7dc023e70 (object 0x0000000087d68b68, a java.util.jar.JarFile),
which is held by "main"
Java stack information for the threads listed above:
===================================================
"main":
at jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(java.base@21.0.8/BuiltinClassLoader.java:651)
- waiting to lock <0x000000008b300030> (a java.lang.Object)
at jdk.internal.loader.BuiltinClassLoader.loadClass(java.base@21.0.8/BuiltinClassLoader.java:639)
at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(java.base@21.0.8/ClassLoaders.java:188)
at java.lang.ClassLoader.loadClass(java.base@21.0.8/ClassLoader.java:526)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$LazyTypeDescription$RecordComponentToken.<init>(TypePool.java:6117)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$TypeExtractor$RecordComponentExtractor.visitEnd(TypePool.java:8946)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.jar.asm.ClassReader.readRecordComponent(ClassReader.java:1053)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.jar.asm.ClassReader.accept(ClassReader.java:732)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.jar.asm.ClassReader.accept(ClassReader.java:425)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default.parse(TypePool.java:880)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default.doDescribe(TypePool.java:864)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$WithLazyResolution.access$001(TypePool.java:944)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$WithLazyResolution.doResolve(TypePool.java:1042)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$WithLazyResolution$LazyTypeDescription.delegate(TypePool.java:1111)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.description.type.TypeDescription$AbstractBase$OfSimpleType$WithDelegation.getModifiers(TypeDescription.java:8535)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.SWDescriptionStrategy$SWTypeDescriptionWrapper.getModifiers(SWDescriptionStrategy.java:353)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ModifierMatcher.doMatch(ModifierMatcher.java:60)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ModifierMatcher.doMatch(ModifierMatcher.java:27)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ElementMatcher$Junction$ForNonNullValues.matches(ElementMatcher.java:249)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ElementMatcher$Junction$Disjunction.matches(ElementMatcher.java:214)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$RawMatcher$ForElementMatchers.matches(AgentBuilder.java:1971)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.doTransform(AgentBuilder.java:12425)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.transform(AgentBuilder.java:12385)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.access$1800(AgentBuilder.java:12094)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer$Java9CapableVmDispatcher.run(AgentBuilder.java:12872)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer$Java9CapableVmDispatcher.run(AgentBuilder.java:12804)
at java.security.AccessController.executePrivileged(java.base@21.0.8/AccessController.java:778)
at java.security.AccessController.doPrivileged(java.base@21.0.8/AccessController.java:400)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.doPrivileged(AgentBuilder.java)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.transform(AgentBuilder.java:12328)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer$ByteBuddy$ModuleSupport.transform(Unknown Source)
at sun.instrument.TransformerManager.transform(java.instrument@21.0.8/TransformerManager.java:188)
at sun.instrument.InstrumentationImpl.transform(java.instrument@21.0.8/InstrumentationImpl.java:610)
at jdk.internal.misc.ThreadTracker.begin(java.base@21.0.8/ThreadTracker.java:70)
at java.util.jar.JarFile.beginInit(java.base@21.0.8/JarFile.java:1055)
at java.util.jar.JarFile.ensureInitialization(java.base@21.0.8/JarFile.java:1069)
- locked <0x0000000087d68b68> (a java.util.jar.JarFile)
at java.util.jar.JavaUtilJarAccessImpl.ensureInitialization(java.base@21.0.8/JavaUtilJarAccessImpl.java:42)
at jdk.internal.loader.URLClassPath$JarLoader$2.getManifest(java.base@21.0.8/URLClassPath.java:852)
at jdk.internal.loader.BuiltinClassLoader.defineClass(java.base@21.0.8/BuiltinClassLoader.java:848)
at jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(java.base@21.0.8/BuiltinClassLoader.java:760)
at jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(java.base@21.0.8/BuiltinClassLoader.java:681)
- locked <0x000000008b300d60> (a java.lang.Object)
at jdk.internal.loader.BuiltinClassLoader.loadClass(java.base@21.0.8/BuiltinClassLoader.java:639)
at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(java.base@21.0.8/ClassLoaders.java:188)
at java.lang.ClassLoader.loadClass(java.base@21.0.8/ClassLoader.java:526)
at java.lang.Class.forName0(java.base@21.0.8/Native Method)
at java.lang.Class.forName(java.base@21.0.8/Class.java:534)
at java.lang.Class.forName(java.base@21.0.8/Class.java:513)
at sun.launcher.LauncherHelper.loadMainClass(java.base@21.0.8/LauncherHelper.java:811)
at sun.launcher.LauncherHelper.checkAndLoadMain(java.base@21.0.8/LauncherHelper.java:706)
"SkywalkingAgent-5-JVMService-produce-0":
at java.util.zip.ZipFile.getEntry(java.base@21.0.8/ZipFile.java:337)
- waiting to lock <0x0000000087d68b68> (a java.util.jar.JarFile)
at java.util.jar.JarFile.getEntry(java.base@21.0.8/JarFile.java:517)
at java.util.jar.JarFile.getJarEntry(java.base@21.0.8/JarFile.java:472)
at jdk.internal.loader.URLClassPath$JarLoader.getResource(java.base@21.0.8/URLClassPath.java:923)
at jdk.internal.loader.URLClassPath.getResource(java.base@21.0.8/URLClassPath.java:316)
at jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(java.base@21.0.8/BuiltinClassLoader.java:757)
at jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(java.base@21.0.8/BuiltinClassLoader.java:681)
- locked <0x000000008b300030> (a java.lang.Object)
at jdk.internal.loader.BuiltinClassLoader.loadClass(java.base@21.0.8/BuiltinClassLoader.java:639)
at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(java.base@21.0.8/ClassLoaders.java:188)
at java.lang.ClassLoader.loadClass(java.base@21.0.8/ClassLoader.java:526)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$LazyTypeDescription$RecordComponentToken.<init>(TypePool.java:6117)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$TypeExtractor$RecordComponentExtractor.visitEnd(TypePool.java:8946)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.jar.asm.ClassReader.readRecordComponent(ClassReader.java:1053)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.jar.asm.ClassReader.accept(ClassReader.java:732)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.jar.asm.ClassReader.accept(ClassReader.java:425)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default.parse(TypePool.java:880)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default.doDescribe(TypePool.java:864)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$WithLazyResolution.access$001(TypePool.java:944)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$WithLazyResolution.doResolve(TypePool.java:1042)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.pool.TypePool$Default$WithLazyResolution$LazyTypeDescription.delegate(TypePool.java:1111)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.description.type.TypeDescription$AbstractBase$OfSimpleType$WithDelegation.getModifiers(TypeDescription.java:8535)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.SWDescriptionStrategy$SWTypeDescriptionWrapper.getModifiers(SWDescriptionStrategy.java:353)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ModifierMatcher.doMatch(ModifierMatcher.java:60)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ModifierMatcher.doMatch(ModifierMatcher.java:27)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ElementMatcher$Junction$ForNonNullValues.matches(ElementMatcher.java:249)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.matcher.ElementMatcher$Junction$Disjunction.matches(ElementMatcher.java:214)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$RawMatcher$ForElementMatchers.matches(AgentBuilder.java:1971)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.doTransform(AgentBuilder.java:12425)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.transform(AgentBuilder.java:12385)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.access$1800(AgentBuilder.java:12094)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer$Java9CapableVmDispatcher.run(AgentBuilder.java:12872)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer$Java9CapableVmDispatcher.run(AgentBuilder.java:12804)
at java.security.AccessController.executePrivileged(java.base@21.0.8/AccessController.java:778)
at java.security.AccessController.doPrivileged(java.base@21.0.8/AccessController.java:400)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.doPrivileged(AgentBuilder.java)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer.transform(AgentBuilder.java:12328)
at org.apache.skywalking.apm.dependencies.net.bytebuddy.agent.builder.AgentBuilder$Default$ExecutingTransformer$ByteBuddy$ModuleSupport.transform(Unknown Source)
at sun.instrument.TransformerManager.transform(java.instrument@21.0.8/TransformerManager.java:188)
at sun.instrument.InstrumentationImpl.transform(java.instrument@21.0.8/InstrumentationImpl.java:610)
at java.nio.DirectByteBuffer.<init>(java.base@21.0.8/DirectByteBuffer.java:128)
at java.nio.ByteBuffer.allocateDirect(java.base@21.0.8/ByteBuffer.java:360)
at sun.nio.ch.Util.getTemporaryDirectBuffer(java.base@21.0.8/Util.java:242)
at sun.nio.ch.IOUtil.read(java.base@21.0.8/IOUtil.java:303)
at sun.nio.ch.IOUtil.read(java.base@21.0.8/IOUtil.java:283)
at sun.nio.ch.FileChannelImpl.read(java.base@21.0.8/FileChannelImpl.java:234)
- locked <0x000000008b32ed08> (a java.lang.Object)
at sun.nio.ch.ChannelInputStream.read(java.base@21.0.8/ChannelInputStream.java:74)
at sun.nio.ch.ChannelInputStream.read(java.base@21.0.8/ChannelInputStream.java:103)
- locked <0x000000008b32ee78> (a sun.nio.ch.ChannelInputStream)
at sun.nio.cs.StreamDecoder.readBytes(java.base@21.0.8/StreamDecoder.java:350)
at sun.nio.cs.StreamDecoder.implRead(java.base@21.0.8/StreamDecoder.java:393)
at sun.nio.cs.StreamDecoder.lockedRead(java.base@21.0.8/StreamDecoder.java:217)
at sun.nio.cs.StreamDecoder.read(java.base@21.0.8/StreamDecoder.java:171)
at java.io.InputStreamReader.read(java.base@21.0.8/InputStreamReader.java:188)
at java.io.BufferedReader.fill(java.base@21.0.8/BufferedReader.java:160)
at java.io.BufferedReader.implReadLine(java.base@21.0.8/BufferedReader.java:370)
at java.io.BufferedReader.readLine(java.base@21.0.8/BufferedReader.java:347)
at java.io.BufferedReader.readLine(java.base@21.0.8/BufferedReader.java:436)
at java.nio.file.Files.readAllLines(java.base@21.0.8/Files.java:3395)
at java.nio.file.Files.readAllLines(java.base@21.0.8/Files.java:3433)
at jdk.internal.platform.CgroupUtil.lambda$readAllLinesPrivileged$2(java.base@21.0.8/CgroupUtil.java:83)
at jdk.internal.platform.CgroupUtil$$Lambda/0x00007fd8441f9ab0.run(java.base@21.0.8/Unknown Source)
at java.security.AccessController.executePrivileged(java.base@21.0.8/AccessController.java:809)
at java.security.AccessController.doPrivileged(java.base@21.0.8/AccessController.java:571)
at jdk.internal.platform.CgroupUtil.readAllLinesPrivileged(java.base@21.0.8/CgroupUtil.java:84)
at jdk.internal.platform.CgroupSubsystemFactory.determineType(java.base@21.0.8/CgroupSubsystemFactory.java:143)
at jdk.internal.platform.CgroupSubsystemFactory.create(java.base@21.0.8/CgroupSubsystemFactory.java:85)
at jdk.internal.platform.CgroupMetrics.getInstance(java.base@21.0.8/CgroupMetrics.java:193)
at jdk.internal.platform.SystemMetrics.instance(java.base@21.0.8/SystemMetrics.java:29)
at jdk.internal.platform.Metrics.systemMetrics(java.base@21.0.8/Metrics.java:58)
at jdk.internal.platform.Container.metrics(java.base@21.0.8/Container.java:43)
at com.sun.management.internal.OperatingSystemImpl.<init>(jdk.management@21.0.8/OperatingSystemImpl.java:175)
at com.sun.management.internal.PlatformMBeanProviderImpl.getOperatingSystemMXBean(jdk.management@21.0.8/PlatformMBeanProviderImpl.java:280)
- locked <0x000000008b335430> (a java.lang.Class for com.sun.management.internal.PlatformMBeanProviderImpl)
at com.sun.management.internal.PlatformMBeanProviderImpl$3.nameToMBeanMap(jdk.management@21.0.8/PlatformMBeanProviderImpl.java:199)
at sun.management.spi.PlatformMBeanProvider$PlatformComponent.getMBeans(java.management@21.0.8/PlatformMBeanProvider.java:195)
at java.lang.management.ManagementFactory.getPlatformMXBean(java.management@21.0.8/ManagementFactory.java:691)
at java.lang.management.ManagementFactory.getOperatingSystemMXBean(java.management@21.0.8/ManagementFactory.java:391)
at org.apache.skywalking.apm.agent.core.os.ProcessorUtil.getNumberOfProcessors(ProcessorUtil.java:25)
at org.apache.skywalking.apm.agent.core.jvm.cpu.CPUProvider.<init>(CPUProvider.java:31)
at org.apache.skywalking.apm.agent.core.jvm.cpu.CPUProvider.<clinit>(CPUProvider.java:27)
at org.apache.skywalking.apm.agent.core.jvm.JVMService.run(JVMService.java:101)
at org.apache.skywalking.apm.util.RunnableWithExceptionProtection.run(RunnableWithExceptionProtection.java:33)
at java.util.concurrent.Executors$RunnableAdapter.call(java.base@21.0.8/Executors.java:572)
at java.util.concurrent.FutureTask.runAndReset(java.base@21.0.8/FutureTask.java:358)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(java.base@21.0.8/ScheduledThreadPoolExecutor.java:305)
at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@21.0.8/ThreadPoolExecutor.java:1144)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@21.0.8/ThreadPoolExecutor.java:642)
at java.lang.Thread.runWith(java.base@21.0.8/Thread.java:1596)
at java.lang.Thread.run(java.base@21.0.8/Thread.java:1583)
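Found 1 deadlock.

Roughly, this is what happens according to the dump:

Thread A (the main thread, still loading the application's main class at startup):
- It is loading a class, and inside JarFile.ensureInitialization it holds the monitor of the JarFile (0x0000000087d68b68).
- A class is loaded for the first time inside ThreadTracker.begin, which goes through the SkyWalking (ByteBuddy) class transformer. While matching, the transformer calls loadClass on the AppClassLoader and blocks waiting for the class-loading lock 0x000000008b300030.

Thread B (the SkyWalking JVMService thread):
- It runs the static initializer of CPUProvider.
- That calls ManagementFactory.getOperatingSystemMXBean().
- On JDK 21 this call reads cgroup information (CgroupSubsystemFactory). Allocating the direct buffer for that read triggers a first-time class load, which also goes through the SkyWalking transformer.
- The transformer calls loadClass on the AppClassLoader, takes the class-loading lock 0x000000008b300030, and then blocks in JarFile.getEntry waiting for the JarFile monitor held by main.

Each thread holds the lock the other one needs, in opposite order, so they deadlock.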
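The lock-order inversion in the dump can be reproduced in miniature. The following is only a sketch under stated assumptions: the two plain `Object` locks and the thread names `demo-main` and `demo-agent` are hypothetical stand-ins for the JarFile monitor and the class-loading lock, and no SkyWalking or class-loader code is involved. It also shows how `ThreadMXBean.findDeadlockedThreads()` reports the same kind of cycle that the thread dump prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class LockOrderDeadlockDemo {

    // Starts a daemon thread that takes `first`, waits until the other
    // worker also holds its first lock, and then tries to take `second`.
    static Thread startWorker(String name, Object first, Object second,
                              CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    // Not reached: the other worker holds `second`.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit despite the deadlock
        t.start();
        return t;
    }

    // Creates the deadlock and polls the JVM until it is detected.
    // Returns the number of deadlocked threads, or 0 on timeout.
    static int countDeadlockedThreads(long timeoutMillis) throws InterruptedException {
        Object jarFileLock = new Object();      // hypothetical: plays the JarFile monitor
        Object classLoadingLock = new Object(); // hypothetical: plays the class-loading lock
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Opposite acquisition order, as in the dump.
        startWorker("demo-main", jarFileLock, classLoadingLock, bothHoldFirst);
        startWorker("demo-agent", classLoadingLock, jarFileLock, bothHoldFirst);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = mx.findDeadlockedThreads(); // null when no deadlock exists
            if (ids != null) {
                for (ThreadInfo info : mx.getThreadInfo(ids)) {
                    if (info != null) {
                        System.out.println(info.getThreadName() + " waits for "
                                + info.getLockName() + " held by " + info.getLockOwnerName());
                    }
                }
                return ids.length;
            }
            Thread.sleep(50);
        }
        return 0;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads: " + countDeadlockedThreads(5000));
    }
}
```

The latch only makes the interleaving deterministic for the demo. In the real incident the same interleaving depends on timing, which fits the observation that it does not happen on every startup.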
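This is probably caused by the JarFile initialization path during class loading getting longer in JDK 21, or something along those lines. It seems to show up in roughly 1 out of 10 startups?
I went looking through the issues, and it turns out the maintainers don't plan to fix it = =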

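A possible workaround is to configure the agent to start with a delay, which should mitigate it. It doesn't happen often though, so I'm leaving it alone for now ƪ(˘⌣˘)ʃ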
