JDK21 serviceability_jvmti_j9_0_FAILED serviceability/jvmti/RedefineClasses/RedefineRunningMethods.java Segmentation error vmState=0x00040000 #18260

Closed
JasonFengJ9 opened this issue Oct 10, 2023 · 15 comments

Comments

@JasonFengJ9
Member

JasonFengJ9 commented Oct 10, 2023

Failure link

From an internal build(rhel9-aarch64-1):

java version "21-beta" 2023-09-19
IBM Semeru Runtime Certified Edition 21+35-202310071505 (build 21-beta+35-202310071505)
Eclipse OpenJ9 VM 21+35-202310071505 (build master-4e1d1c690, JRE 21 Linux aarch64-64-Bit Compressed References 20231007_7 (JIT enabled, AOT enabled)
OpenJ9   - 4e1d1c690
OMR      - 83cb59850
JCL      - 14ae7a4f3 based on jdk-21+35)

Rerun in Grinder - Change TARGET to run only the failed test targets.

Optional info

Failure output (captured from console output)

[2023-10-07T17:25:36.702Z] variation: Mode150
[2023-10-07T17:25:36.702Z] JVM_OPTIONS:  -XX:+UseCompressedOops 

[2023-10-07T17:30:41.152Z] TEST: serviceability/jvmti/RedefineClasses/RedefineRunningMethods.java

[2023-10-07T17:30:41.153Z] STDERR:
[2023-10-07T17:30:41.153Z] JVMJ9VM007W Command-line option unrecognised: -Xlog:redefine+class+iklass+add=trace,redefine+class+iklass+purge=trace,class+loader+data=debug,safepoint+cleanup,gc+phases=debug:rt.log
[2023-10-07T17:30:41.153Z] Unhandled exception
[2023-10-07T17:30:41.153Z] Type=Segmentation error vmState=0x00040000
[2023-10-07T17:30:41.153Z] J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
[2023-10-07T17:30:41.153Z] Handler1=0000FFFFAA8331F4 Handler2=0000FFFFAAAC7AF0 InaccessibleAddress=0000000000000010
[2023-10-07T17:30:41.153Z] R0=0000000000000000 R1=0000000000000040 R2=0000000000000000 R3=0000000000000000
[2023-10-07T17:30:41.153Z] R4=0000FFFFAA1EF1F0 R5=0000FFFFAAAD84E0 R6=0000000000000000 R7=000000000321E680
[2023-10-07T17:30:41.153Z] R8=0000000000000018 R9=00228649347891B2 R10=00FFFFFFFFFFFFFF R11=0000B0EDE2DE09B1
[2023-10-07T17:30:41.153Z] R12=000000007FFFFFFF R13=000000007FFFFFFF R14=98AC3510AEC6FEFA R15=0000FFFF2000D520
[2023-10-07T17:30:41.153Z] R16=0000FFFFAAB20308 R17=0000FFFFAB1B1744 R18=0000000000000000 R19=0000000000000000
[2023-10-07T17:30:41.153Z] R20=0000FFFFA408A2B0 R21=0000FFFF885631A0 R22=0000000000000040 R23=000000000000076C
[2023-10-07T17:30:41.153Z] R24=0000FFFFA42A1A58 R25=0000000000000000 R26=0000FFFFA4089F90 R27=0000FFFFA40618D8
[2023-10-07T17:30:41.153Z] R28=0000FFFFA4064038 R29=0000FFFF885630B0 R30=0000FFFFAA212278 R31=0000FFFF885630B0
[2023-10-07T17:30:41.153Z] PC=0000FFFFAA1F9CF4 SP=0000FFFF885630B0 PSTATE=0000000080000000
[2023-10-07T17:30:41.153Z] V0 40fd0122e147ae14 (f: 3779571200.000000, d: 1.188022e+05)
[2023-10-07T17:30:41.153Z] V1 408f400000000000 (f: 0.000000, d: 1.000000e+03)
[2023-10-07T17:30:41.153Z] V2 41cdcd6500000000 (f: 0.000000, d: 1.000000e+09)
[2023-10-07T17:30:41.153Z] V3 000000f000000000 (f: 0.000000, d: 5.092790e-312)
[2023-10-07T17:30:41.153Z] V4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:30:41.153Z] V5 efffa72d00b82900 (f: 12069120.000000, d: -3.071369e+231)
[2023-10-07T17:30:41.153Z] V6 00000002000a0000 (f: 655360.000000, d: 4.244315e-314)
[2023-10-07T17:30:41.153Z] V7 0013000200040000 (f: 262144.000000, d: 2.642279e-308)
[2023-10-07T17:30:41.153Z] V8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:30:41.153Z] V9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:30:41.153Z] V10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:30:41.153Z] V11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:30:41.153Z] V12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:30:41.153Z] V13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:30:41.153Z] V14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:30:41.153Z] V15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:30:41.153Z] V16 0001002d00000033 (f: 51.000000, d: 1.391626e-309)
[2023-10-07T17:30:41.153Z] V17 0000000000000800 (f: 2048.000000, d: 1.011846e-320)
[2023-10-07T17:30:41.153Z] V18 4000000000000000 (f: 0.000000, d: 2.000000e+00)
[2023-10-07T17:30:41.153Z] V19 3f9eb851eb851eb8 (f: 3951369984.000000, d: 3.000000e-02)
[2023-10-07T17:30:41.153Z] V20 3fb1eb851eb851ec (f: 515396064.000000, d: 7.000000e-02)
[2023-10-07T17:30:41.153Z] V21 0000000000000008 (f: 8.000000, d: 3.952525e-323)
[2023-10-07T17:30:41.153Z] V22 3f0000003f800000 (f: 1065353216.000000, d: 3.051759e-05)
[2023-10-07T17:30:41.153Z] V23 3fc999999999999a (f: 2576980480.000000, d: 2.000000e-01)
[2023-10-07T17:30:41.153Z] V24 3fd6666666666666 (f: 1717986944.000000, d: 3.500000e-01)
[2023-10-07T17:30:41.153Z] V25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:30:41.153Z] V26 000000000000000a (f: 10.000000, d: 4.940656e-323)
[2023-10-07T17:30:41.153Z] V27 0000000000000001 (f: 1.000000, d: 4.940656e-324)
[2023-10-07T17:30:41.153Z] V28 000000000000000a (f: 10.000000, d: 4.940656e-323)
[2023-10-07T17:30:41.153Z] V29 0000000000000300 (f: 768.000000, d: 3.794424e-321)
[2023-10-07T17:30:41.153Z] V30 000000003f400000 (f: 1061158912.000000, d: 5.242822e-315)
[2023-10-07T17:30:41.154Z] V31 000000003f400000 (f: 1061158912.000000, d: 5.242822e-315)
[2023-10-07T17:30:41.154Z] Module=/home/jenkins/workspace/Test_openjdk21_j9_extended.openjdk_aarch64_linux/openjdkbinary/j2sdk-image/lib/default/libj9vrb29.so
[2023-10-07T17:30:41.154Z] Module_base_address=0000FFFFAA1BE000
[2023-10-07T17:30:41.154Z] Target=2_90_20231007_7 (Linux 5.14.0-284.30.1.el9_2.aarch64)
[2023-10-07T17:30:41.154Z] CPU=aarch64 (8 logical CPUs) (0x1db1a8000 RAM)
[2023-10-07T17:30:41.154Z] ----------- Stack Backtrace -----------
[2023-10-07T17:30:41.154Z] _ZN21MM_VerboseHandlerJava13getThreadNameEPcmP12OMR_VMThread+0x24 (0x0000FFFFAA1F9CF4 [libj9vrb29.so+0x3bcf4])
[2023-10-07T17:30:41.154Z] _ZN23MM_VerboseHandlerOutput20handleExclusiveStartEPP15J9HookInterfacemPv+0x338 (0x0000FFFFAA212278 [libj9vrb29.so+0x54278])
[2023-10-07T17:30:41.154Z] J9HookDispatch+0x15c (0x0000FFFFAAA4F61C [libj9hookable29.so+0x161c])
[2023-10-07T17:30:41.154Z] _ZN18MM_EnvironmentBase28reportExclusiveAccessAcquireEv+0xec (0x0000FFFFA930494C [libj9gc29.so+0x10494c])
[2023-10-07T17:30:41.154Z] _ZN18MM_EnvironmentBase24acquireExclusiveVMAccessEv+0x3c (0x0000FFFFA930503C [libj9gc29.so+0x10503c])
[2023-10-07T17:30:41.154Z] _ZN18MM_EnvironmentBase29acquireExclusiveVMAccessForGCEP12MM_Collectorbb+0x120 (0x0000FFFFA93051C0 [libj9gc29.so+0x1051c0])
[2023-10-07T17:30:41.154Z] _ZN17MM_MemorySubSpace20systemGarbageCollectEP18MM_EnvironmentBasej+0xe8 (0x0000FFFFA931E398 [libj9gc29.so+0x11e398])
[2023-10-07T17:30:41.154Z] j9gc_modron_global_collect_with_overrides+0x84 (0x0000FFFFA922EE94 [libj9gc29.so+0x2ee94])
[2023-10-07T17:30:41.154Z] redefineClassesCommon.constprop.0+0x284 (0x0000FFFFA94EF418 [libj9jvmti29.so+0x8418])
[2023-10-07T17:30:41.154Z] jvmtiRedefineClasses+0x148 (0x0000FFFFA94F1C58 [libj9jvmti29.so+0xac58])
[2023-10-07T17:30:41.154Z] redefineClasses+0x488 (0x0000FFFFAA1A2668 [libinstrument.so+0x5668])
[2023-10-07T17:30:41.154Z] ffi_call_SYSV+0x50 (0x0000FFFFAA9B9610 [libj9vm29.so+0x1b9610])
[2023-10-07T17:30:41.154Z] ffi_call_int+0x204 (0x0000FFFFAA9B8BC4 [libj9vm29.so+0x1b8bc4])
[2023-10-07T17:30:41.154Z] bytecodeLoopCompressed+0xd450 (0x0000FFFFAA898750 [libj9vm29.so+0x98750])
[2023-10-07T17:30:41.154Z] c_cInterpreter+0x54 (0x0000FFFFAA948D38 [libj9vm29.so+0x148d38])
[2023-10-07T17:30:41.154Z] ---------------------------------------
[2023-10-07T17:30:41.154Z] JVMDUMP039I Processing dump event "gpf", detail "" at 2023/10/07 13:30:34 - please wait.

[2023-10-07T17:30:41.154Z] TEST RESULT: Failed. Unexpected exit from test [exit code: 255]
[2023-10-07T17:30:41.154Z] --------------------------------------------------
[2023-10-07T17:36:14.493Z] TEST: serviceability/jvmti/vthread/RedefineClasses/RedefineRunningMethods.java

[2023-10-07T17:36:14.494Z] STDERR:
[2023-10-07T17:36:14.494Z] JVMJ9VM007W Command-line option unrecognised: -Xlog:redefine+class+iklass+add=trace,redefine+class+iklass+purge=trace,class+loader+data=debug,safepoint+cleanup,gc+phases=debug:rt.log
[2023-10-07T17:36:14.494Z] Unhandled exception
[2023-10-07T17:36:14.494Z] Type=Segmentation error vmState=0x00040000
[2023-10-07T17:36:14.494Z] J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
[2023-10-07T17:36:14.494Z] Handler1=0000FFFF978331F4 Handler2=0000FFFF97AF8AF0 InaccessibleAddress=0000000000000010
[2023-10-07T17:36:14.494Z] R0=0000000000000000 R1=0000000000000040 R2=0000000000000000 R3=0000000000000000
[2023-10-07T17:36:14.494Z] R4=0000FFFF96F5D1F0 R5=0000FFFF97B094E0 R6=0000000000000000 R7=000000000322E660
[2023-10-07T17:36:14.494Z] R8=0000000000000018 R9=000FB8777E400C8F R10=00FFFFFFFFFFFFFF R11=0000B0FD3C9A0031
[2023-10-07T17:36:14.494Z] R12=000000007FFFFFFF R13=000000007FFFFFFF R14=056F2F51FA3A313A R15=0000FFFF3C000928
[2023-10-07T17:36:14.494Z] R16=0000FFFF97B51308 R17=0000FFFF98210744 R18=0000000000000000 R19=0000000000000000
[2023-10-07T17:36:14.494Z] R20=0000FFFF90077A40 R21=0000FFFF954251A0 R22=0000000000000040 R23=0000000000000680
[2023-10-07T17:36:14.494Z] R24=0000FFFF9025D398 R25=0000000000000000 R26=0000FFFF90077720 R27=0000FFFF90050668
[2023-10-07T17:36:14.494Z] R28=0000FFFF90052DC8 R29=0000FFFF954250B0 R30=0000FFFF96F80278 R31=0000FFFF954250B0
[2023-10-07T17:36:14.494Z] PC=0000FFFF96F67CF4 SP=0000FFFF954250B0 PSTATE=0000000080000000
[2023-10-07T17:36:14.494Z] V0 40f967762d0e5604 (f: 755914240.000000, d: 1.040554e+05)
[2023-10-07T17:36:14.494Z] V1 408f400000000000 (f: 0.000000, d: 1.000000e+03)
[2023-10-07T17:36:14.494Z] V2 41cdcd6500000000 (f: 0.000000, d: 1.000000e+09)
[2023-10-07T17:36:14.494Z] V3 000000f000000000 (f: 0.000000, d: 5.092790e-312)
[2023-10-07T17:36:14.494Z] V4 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:36:14.494Z] V5 b61b120d00b21d00 (f: 11672832.000000, d: -4.630599e-48)
[2023-10-07T17:36:14.494Z] V6 2300b360042300b2 (f: 69402800.000000, d: 4.382544e-140)
[2023-10-07T17:36:14.494Z] V7 0000002e00020000 (f: 131072.000000, d: 9.761187e-313)
[2023-10-07T17:36:14.494Z] V8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:36:14.494Z] V9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:36:14.494Z] V10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:36:14.494Z] V11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:36:14.494Z] V12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:36:14.494Z] V13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:36:14.494Z] V14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:36:14.494Z] V15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:36:14.494Z] V16 0802000400000031 (f: 49.000000, d: 4.259002e-270)
[2023-10-07T17:36:14.494Z] V17 0000000000000800 (f: 2048.000000, d: 1.011846e-320)
[2023-10-07T17:36:14.494Z] V18 4000000000000000 (f: 0.000000, d: 2.000000e+00)
[2023-10-07T17:36:14.494Z] V19 3f9eb851eb851eb8 (f: 3951369984.000000, d: 3.000000e-02)
[2023-10-07T17:36:14.494Z] V20 3fb1eb851eb851ec (f: 515396064.000000, d: 7.000000e-02)
[2023-10-07T17:36:14.494Z] V21 0000000000000008 (f: 8.000000, d: 3.952525e-323)
[2023-10-07T17:36:14.494Z] V22 3f0000003f800000 (f: 1065353216.000000, d: 3.051759e-05)
[2023-10-07T17:36:14.494Z] V23 3fc999999999999a (f: 2576980480.000000, d: 2.000000e-01)
[2023-10-07T17:36:14.494Z] V24 3fd6666666666666 (f: 1717986944.000000, d: 3.500000e-01)
[2023-10-07T17:36:14.494Z] V25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:36:14.494Z] V26 000000000000000a (f: 10.000000, d: 4.940656e-323)
[2023-10-07T17:36:14.494Z] V27 0000000000000001 (f: 1.000000, d: 4.940656e-324)
[2023-10-07T17:36:14.494Z] V28 000000000000000a (f: 10.000000, d: 4.940656e-323)
[2023-10-07T17:36:14.494Z] V29 0000000000000300 (f: 768.000000, d: 3.794424e-321)
[2023-10-07T17:36:14.494Z] V30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:36:14.494Z] V31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-10-07T17:36:14.494Z] Module=/home/jenkins/workspace/Test_openjdk21_j9_extended.openjdk_aarch64_linux/openjdkbinary/j2sdk-image/lib/default/libj9vrb29.so
[2023-10-07T17:36:14.494Z] Module_base_address=0000FFFF96F2C000
[2023-10-07T17:36:14.494Z] Target=2_90_20231007_7 (Linux 5.14.0-284.30.1.el9_2.aarch64)
[2023-10-07T17:36:14.494Z] CPU=aarch64 (8 logical CPUs) (0x1db1a8000 RAM)
[2023-10-07T17:36:14.494Z] ----------- Stack Backtrace -----------
[2023-10-07T17:36:14.494Z] _ZN21MM_VerboseHandlerJava13getThreadNameEPcmP12OMR_VMThread+0x24 (0x0000FFFF96F67CF4 [libj9vrb29.so+0x3bcf4])
[2023-10-07T17:36:14.494Z] _ZN23MM_VerboseHandlerOutput20handleExclusiveStartEPP15J9HookInterfacemPv+0x338 (0x0000FFFF96F80278 [libj9vrb29.so+0x54278])
[2023-10-07T17:36:14.494Z] J9HookDispatch+0x15c (0x0000FFFF97E4461C [libj9hookable29.so+0x161c])
[2023-10-07T17:36:14.494Z] _ZN18MM_EnvironmentBase28reportExclusiveAccessAcquireEv+0xec (0x0000FFFF9710494C [libj9gc29.so+0x10494c])
[2023-10-07T17:36:14.494Z] _ZN18MM_EnvironmentBase24acquireExclusiveVMAccessEv+0x3c (0x0000FFFF9710503C [libj9gc29.so+0x10503c])
[2023-10-07T17:36:14.494Z] _ZN18MM_EnvironmentBase29acquireExclusiveVMAccessForGCEP12MM_Collectorbb+0x120 (0x0000FFFF971051C0 [libj9gc29.so+0x1051c0])
[2023-10-07T17:36:14.494Z] _ZN17MM_MemorySubSpace20systemGarbageCollectEP18MM_EnvironmentBasej+0xe8 (0x0000FFFF9711E398 [libj9gc29.so+0x11e398])
[2023-10-07T17:36:14.494Z] j9gc_modron_global_collect_with_overrides+0x84 (0x0000FFFF9702EE94 [libj9gc29.so+0x2ee94])
[2023-10-07T17:36:14.494Z] redefineClassesCommon.constprop.0+0x284 (0x0000FFFF96E1B418 [libj9jvmti29.so+0x8418])
[2023-10-07T17:36:14.494Z] jvmtiRedefineClasses+0x148 (0x0000FFFF96E1DC58 [libj9jvmti29.so+0xac58])
[2023-10-07T17:36:14.494Z] redefineClasses+0x488 (0x0000FFFF95892668 [libinstrument.so+0x5668])
[2023-10-07T17:36:14.494Z] ffi_call_SYSV+0x50 (0x0000FFFF979B9610 [libj9vm29.so+0x1b9610])
[2023-10-07T17:36:14.494Z] ffi_call_int+0x204 (0x0000FFFF979B8BC4 [libj9vm29.so+0x1b8bc4])
[2023-10-07T17:36:14.494Z] bytecodeLoopCompressed+0xd450 (0x0000FFFF97898750 [libj9vm29.so+0x98750])
[2023-10-07T17:36:14.494Z] c_cInterpreter+0x54 (0x0000FFFF97948D38 [libj9vm29.so+0x148d38])
[2023-10-07T17:36:14.494Z] ---------------------------------------
[2023-10-07T17:36:14.494Z] JVMDUMP039I Processing dump event "gpf", detail "" at 2023/10/07 13:36:04 - please wait.

[2023-10-07T17:36:14.495Z] TEST RESULT: Failed. Unexpected exit from test [exit code: 255]
[2023-10-07T17:36:14.495Z] --------------------------------------------------
[2023-10-07T17:38:02.645Z] Test results: passed: 151; failed: 2
[2023-10-07T17:38:14.729Z] Report written to /home/jenkins/workspace/Test_openjdk21_j9_extended.openjdk_aarch64_linux/jvmtest/openjdk/report/html/report.html
[2023-10-07T17:38:14.729Z] Results written to /home/jenkins/workspace/Test_openjdk21_j9_extended.openjdk_aarch64_linux/aqa-tests/TKG/output_16966994515821/serviceability_jvmti_j9_0/work
[2023-10-07T17:38:14.729Z] Error: Some tests failed or other problems occurred.
[2023-10-07T17:38:14.729Z] -----------------------------------
[2023-10-07T17:38:14.729Z] serviceability_jvmti_j9_0_FAILED

50x internal grinder - all failed

Also seen on JDK21 aarch64_mac, JDK21 ppc64_aix, and JDK21 ppc64le_linux.
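
As a quick sanity check on the two backtraces above, the faulting PC minus the reported Module_base_address should equal the libj9vrb29.so offset printed in the top frame. The sketch below does that arithmetic with values copied from the two crash reports; the helper name moduleOffset is made up for illustration and is not part of OpenJ9.

```cpp
#include <cstdint>

// Hypothetical helper: offset of an address inside a loaded module.
constexpr std::uint64_t moduleOffset(std::uint64_t pc, std::uint64_t moduleBase)
{
    return pc - moduleBase;
}

// First crash: PC=0000FFFFAA1F9CF4, Module_base_address=0000FFFFAA1BE000.
static_assert(moduleOffset(0x0000FFFFAA1F9CF4ULL, 0x0000FFFFAA1BE000ULL) == 0x3bcf4ULL,
              "matches libj9vrb29.so+0x3bcf4 in the first backtrace");

// Second crash: PC=0000FFFF96F67CF4, Module_base_address=0000FFFF96F2C000.
static_assert(moduleOffset(0x0000FFFF96F67CF4ULL, 0x0000FFFF96F2C000ULL) == 0x3bcf4ULL,
              "matches libj9vrb29.so+0x3bcf4 in the second backtrace");
```

Both runs fault at the same offset inside getThreadName, which is consistent with a single deterministic failure rather than random memory corruption.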

@tajila
Contributor

tajila commented Oct 11, 2023

Might be a GC issue. @dmitripivkine, can you please take a quick look?

@tajila
Contributor

tajila commented Oct 12, 2023

@gacholio Please take a look; this may be caused by #18236.

@gacholio
Contributor

The invocation of the GC is caused by #18236 but the crash is not caused by the change - it's in the GC verbose code.

gacholio added comp:gc and removed comp:vm labels Oct 19, 2023
@dmitripivkine
Contributor

dmitripivkine commented Oct 19, 2023

> The invocation of the GC is caused by #18236 but the crash is not caused by the change - it's in the GC verbose code.

Subscribed to J9HOOK_MM_PRIVATE_EXCLUSIVE_ACCESS_ACQUIRE hook
calls verboseHandlerExclusiveStart()
calls MM_VerboseHandlerOutput::handleExclusiveStart()
calls getThreadName() with event->lastResponder as an OMR thread... and it is NULL

The verbose code crashed while attempting to read _language_vmthread from it.

@dmitripivkine
Contributor

The NULL in event->lastResponder for the J9HOOK_MM_PRIVATE_EXCLUSIVE_ACCESS_ACQUIRE hook event comes from omrVM->exclusiveVMAccessStats.lastResponder.
The only place it is set is in updateExclusiveVMAccessStats(), and there are two ways it can end up NULL there:

vm->omrVM->exclusiveVMAccessStats.lastResponder = (NULL == currentThread ? NULL : currentThread->omrVMThread);
  • currentThread->omrVMThread is NULL - this does not seem to be the case, as the "current" crashing thread !j9vmthread 0x1fc500 has !omr_vmthread 0x00000000001FCFF0 set
  • updateExclusiveVMAccessStats() has been called with first parameter currentThread set to NULL
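
A minimal sketch of those two cases, assuming simplified stand-in types: the Sketch* structs and sketchUpdateStats below are hypothetical and only mirror the shape of the assignment quoted above, not the real OpenJ9 declarations.

```cpp
// Hypothetical, simplified stand-ins for the VM structures.
struct SketchOmrThread { int id; };
struct SketchVmThread { SketchOmrThread *omrVMThread; };
struct SketchStats { SketchOmrThread *lastResponder; };

// Mirrors the quoted assignment: a NULL currentThread or a NULL
// omrVMThread both leave lastResponder NULL.
void sketchUpdateStats(SketchStats &stats, SketchVmThread *currentThread)
{
    stats.lastResponder =
        (nullptr == currentThread) ? nullptr : currentThread->omrVMThread;
}
```

Either path yields the same NULL in the stats, so the value alone cannot tell the two cases apart; that has to come from inspecting the thread in the core, as done above.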

@dmitripivkine
Contributor

> !omr_exclusivevmaccessstats 0x0000FFFFA4033C38
OMR_ExclusiveVMAccessStats at 0xffffa4033c38 {
  Fields for OMR_ExclusiveVMAccessStats:
	0x0: U64 startTime = 0x000154BE9E6F3B42 (374652655319874)
	0x8: U64 endTime = 0x000154BE9E7066E6 (374652655396582)
	0x10: U64 totalResponseTime = 0x0000000000000000 (0)
	0x18: struct OMR_VMThread* requester = !omr_vmthread 0x0000000000000000
	0x20: struct OMR_VMThread* lastResponder = !omr_vmthread 0x0000000000000000 <------
	0x28: UDATA haltedThreads = 0x0000000000000000 (0)
}
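
For readers following the offsets in the dump, here is a sketch of a struct with the same field order and sizes, assuming a 64-bit target. SketchExclusiveStats and sketchElapsed are hypothetical names; the layout is inferred only from the debugger output above, not copied from the OMR headers.

```cpp
#include <cstddef>
#include <cstdint>

struct SketchOmrThread; // opaque stand-in for OMR_VMThread

// Field order and widths taken from the debugger output above.
struct SketchExclusiveStats {
    std::uint64_t startTime;          // 0x0
    std::uint64_t endTime;            // 0x8
    std::uint64_t totalResponseTime;  // 0x10
    SketchOmrThread *requester;       // 0x18
    SketchOmrThread *lastResponder;   // 0x20
    std::uintptr_t haltedThreads;     // 0x28
};

// These checks assume 8-byte pointers, as on the aarch64 build in the report.
static_assert(sizeof(void *) == 8, "sketch assumes a 64-bit target");
static_assert(offsetof(SketchExclusiveStats, requester) == 0x18, "requester offset");
static_assert(offsetof(SketchExclusiveStats, lastResponder) == 0x20, "lastResponder offset");
static_assert(offsetof(SketchExclusiveStats, haltedThreads) == 0x28, "haltedThreads offset");

// Difference between the two timestamps, in whatever unit they are recorded.
std::uint64_t sketchElapsed(const SketchExclusiveStats &stats)
{
    return stats.endTime - stats.startTime;
}
```

With the values from the dump, the start and end timestamps are populated while requester, lastResponder, haltedThreads, and totalResponseTime are all zero.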

@dmitripivkine
Contributor

@gacholio Would you please help me understand how the acquireExclusiveVMAccess code did not set omrVM->exclusiveVMAccessStats.lastResponder?

@dmitripivkine
Contributor

My understanding is that in this case updateExclusiveVMAccessStats() was somehow not called from internalAcquireVMAccessNoMutexWithMask(). Is that correct?

@gacholio
Contributor

I bet we have safepoint exclusive, which doesn't update those stats. @dmitripivkine can you verify that in the core?

I'll need to think about how to address this. Prior to #18236 this situation could not occur.

@dmitripivkine
Contributor

dmitripivkine commented Oct 19, 2023

> I bet we have safepoint exclusive, which doesn't update those stats. @dmitripivkine can you verify that in the core?
>
> I'll need to think about how to address this. Prior to #18236 this situation could not occur.

Are you talking about these?
vmThread->safePointCount = 0x0000000000000001
vm->exclusiveAccessState = 0x0000000000000000
vmThread->omrVMThread->exclusiveCount = 0x0000000000000002

Should I check something else?

@dmitripivkine
Contributor

I have uploaded the core to Bluebird: /team/dmitri/18260/aqa-tests/TKG/output_16966994515821/serviceability_jvmti_j9_0/work/serviceability/jvmti/RedefineClasses/RedefineRunningMethods/core.20231007.133034.3825565.0001.dmp

@gacholio
Contributor

The lastResponder field is initialized to the exclusive requesting thread except in the metronome or acquireExclusiveVMAccessFromExternalThread cases, so the field can validly be NULL if no threads require halting (e.g. they are already halted or out in a JNI native).

@dmitripivkine The GC should probably be updated to handle the NULL case.

The field is NULL in this case because safepoint does not call the stat initializing code. I'll add that in right away as a workaround and continue to look into adding the proper tracking code to safepoint, but that will need to be done very carefully (as with anything dealing with exclusive).

gacholio added a commit to gacholio/openj9 that referenced this issue Oct 20, 2023
Set a valid default requestor and lastResponder in the exclusive stats
when acquiring safepoint exclusive.

Fixes: eclipse-openj9#18260

Signed-off-by: Graham Chapman <[email protected]>
@gacholio
Contributor

@dmitripivkine What is the GC doing in this case for metronome? There weren't a mass of metronome failures when #18236 was merged, so it's presumably working.

@dmitripivkine
Contributor

> @dmitripivkine What is the GC doing in this case for metronome? There weren't a mass of metronome failures when #18236 was merged, so it's presumably working.

It looks like metronome uses the same code, so I think the failure is just harder to reproduce there.

@gacholio
Contributor

This needs to be fixed in the GC (the fields can validly be NULL even before #18236) - the quick VM fix I had suggested won't work properly.
