DaaLoadTest ppc64le vmState=0x00000000 _ZN3OMR9CodeCache17replaceTrampolineEP20TR_OpaqueMethodBlockPvS3_S3_b+0xe4 #20263

Closed
pshipton opened this issue Sep 30, 2024 · 24 comments · Fixed by eclipse-omr/omr#7487
Labels
arch:power comp:jit os:linux segfault Issues that describe segfaults / JVM crashes test failure

Comments

@pshipton
Member

pshipton commented Sep 30, 2024

Grinder for #20258 exposed some crashes.
DaaLoadTest_daa1_special_5m_8 -Xgcpolicy:gencon -Xshareclasses -Xjit -Xnocompressedrefs

https://openj9-jenkins.osuosl.org/job/Grinder_iteration_2/493 (2)
https://openj9-jenkins.osuosl.org/job/Grinder_iteration_1/538 (1)

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Grinder_iteration_1/538/system_test_output.tar.gz

10:51:55  DLT stderr Unhandled exception
10:51:55  DLT stderr Type=Segmentation error vmState=0x00000000
10:51:55  DLT stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
10:51:55  DLT stderr Handler1=00007FFF844405E0 Handler2=00007FFF84B68880
10:51:55  DLT stderr R0=00007FFF7F79A340 R1=00007FFDFBEDB300 R2=00007FFF7FB8EA00 R3=00007FFE736EE1E0
10:51:55  DLT stderr R4=00007FFDE402F6F8 R5=0000000000000000 R6=0000000000000000 R7=00007FFF801E4200
10:51:55  DLT stderr R8=00007FFF80101250 R9=00007FFE6000A650 R10=00007FFE736EE200 R11=0000000000000000
10:51:55  DLT stderr R12=00007FFE735CEC34 R13=00007FFDFBEE68E0 R14=00007FFDE00080B0 R15=00007FFF8083D300
10:51:55  DLT stderr R16=00007FFF49462308 R17=00007FFF4A331998 R18=00007FFF494606E8 R19=00007FFF494606E8
10:51:55  DLT stderr R20=00007FFF4A67CBE8 R21=00007FFF49460730 R22=00007FFF4A67D250 R23=0000000048000001
10:51:55  DLT stderr R24=4BF7F9A948000001 R25=00007FFF80101250 R26=00007FFE735CEC34 R27=0000000000000000
10:51:55  DLT stderr R28=00007FFF801E4200 R29=0000000000000000 R30=00007FFE6000A650 R31=0000000000000000
10:51:55  DLT stderr NIP=00007FFF7F79A3E4 MSR=800000000280D033 ORIG_GPR3=00007FFF7F799234 CTR=00007FFF7F1D49E0
10:51:55  DLT stderr LINK=00007FFF7F79A3E0 XER=0000000000000000 CCR=0000000040084822 SOFTE=0000000000000001
10:51:55  DLT stderr TRAP=0000000000000300 DAR=0000000000000020 dsisr=0000000042000000 RESULT=0000000000000000
10:51:55  DLT stderr FPR0=00007fff8076efc0 (f: 2155278336.000000, d: 6.953250e-310)
10:51:55  DLT stderr FPR1=405c110ea0000000 (f: 2684354560.000000, d: 1.122665e+02)
10:51:55  DLT stderr FPR2=43b7c54880b0a574 (f: 2159060224.000000, d: 1.712855e+18)
10:51:55  DLT stderr FPR3=43e0000000000000 (f: 0.000000, d: 9.223372e+18)
10:51:55  DLT stderr FPR4=4317c54880b0a574 (f: 2159060224.000000, d: 1.672710e+15)
10:51:55  DLT stderr FPR5=3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
10:51:55  DLT stderr FPR6=bfd57bf7808caade (f: 2156702464.000000, d: -3.356913e-01)
10:51:55  DLT stderr FPR7=bff0000000000000 (f: 0.000000, d: -1.000000e+00)
10:51:55  DLT stderr FPR8=703d746c75736572 (f: 1970496896.000000, d: 4.572908e+232)
10:51:55  DLT stderr FPR9=706f2e74656e3d73 (f: 1701723520.000000, d: 3.872783e+233)
10:51:55  DLT stderr FPR10=0000000000000000 (f: 0.000000, d: 0.000000e+00)
10:51:55  DLT stderr FPR11=0000000000000000 (f: 0.000000, d: 0.000000e+00)
10:51:55  DLT stderr FPR12=00007ffde0008c19 (f: 3758132224.000000, d: 6.952905e-310)
10:51:55  DLT stderr FPR13=3fc7c54880b0a574 (f: 2159060224.000000, d: 1.857081e-01)
10:51:55  DLT stderr FPR14=4016d3288f70bd1d (f: 2406530304.000000, d: 5.706209e+00)
10:51:55  DLT stderr FPR15=3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
10:51:55  DLT stderr FPR16=fff0000000000000 (f: 0.000000, d: -inf)
10:51:55  DLT stderr FPR17=7ff0000000000000 (f: 0.000000, d: inf)
10:51:55  DLT stderr FPR18=3ca0000000000000 (f: 0.000000, d: 1.110223e-16)
10:51:55  DLT stderr FPR19=bebbbd41c5d26bf1 (f: 3318901760.000000, d: -1.653390e-06)
10:51:55  DLT stderr FPR20=3e66376972bea4d0 (f: 1925096704.000000, d: 4.138137e-08)
10:51:55  DLT stderr FPR21=be7bc8cec5241181 (f: 3307475456.000000, d: -1.035050e-07)
10:51:55  DLT stderr FPR22=be7bc8cec5400000 (f: 3309305856.000000, d: -1.035050e-07)
10:51:55  DLT stderr FPR23=bfd2d5b26c4fb800 (f: 1817163776.000000, d: -2.942930e-01)
10:51:55  DLT stderr FPR24=3ff0000000000000 (f: 0.000000, d: 1.000000e+00)
10:51:55  DLT stderr FPR25=3ff0000000000000 (f: 0.000000, d: 1.000000e+00)
10:51:55  DLT stderr FPR26=3fd050f800000000 (f: 0.000000, d: 2.549419e-01)
10:51:55  DLT stderr FPR27=bfa425d6db9798a8 (f: 3684145408.000000, d: -3.935119e-02)
10:51:55  DLT stderr FPR28=3ff0000000000000 (f: 0.000000, d: 1.000000e+00)
10:51:55  DLT stderr FPR29=3fe0000000000000 (f: 0.000000, d: 5.000000e-01)
10:51:55  DLT stderr FPR30=4000000000000000 (f: 0.000000, d: 2.000000e+00)
10:51:55  DLT stderr FPR31=0000000000000000 (f: 0.000000, d: 0.000000e+00)
10:51:55  DLT stderr Module=/home/jenkins/workspace/Grinder_iteration_1/jdkbinary/j2sdk-image/lib/default/libj9jit29.so
10:51:55  DLT stderr Module_base_address=00007FFF7EC00000
10:51:55  DLT stderr Target=2_90_20240928_122 (Linux 5.14.0-503.el9.ppc64le)
10:51:55  DLT stderr CPU=ppc64le (8 logical CPUs) (0x3b9630000 RAM)
10:51:55  DLT stderr ----------- Stack Backtrace -----------
10:51:55  DLT stderr _ZN3OMR9CodeCache17replaceTrampolineEP20TR_OpaqueMethodBlockPvS3_S3_b+0xe4 (0x00007FFF7F79A3E4 [libj9jit29.so+0xb9a3e4])
10:51:55  DLT stderr _ZN3OMR16CodeCacheManager17replaceTrampolineEP20TR_OpaqueMethodBlockPvS3_S3_S3_b+0x94 (0x00007FFF7F79CE44 [libj9jit29.so+0xb9ce44])
10:51:55  DLT stderr mcc_replaceTrampoline+0x58 (0x00007FFF7F17F7E8 [libj9jit29.so+0x57f7e8])
10:51:55  DLT stderr _Z15ppcCodePatchingPvS_S_S_S_S_+0x310 (0x00007FFF7F1D4CF0 [libj9jit29.so+0x5d4cf0])
10:51:55  DLT stderr _ZN3OMR9CodeCache14patchCallPointEP20TR_OpaqueMethodBlockPvS3_S3_+0x100 (0x00007FFF7F799710 [libj9jit29.so+0xb99710])
10:51:55  DLT stderr mcc_callPointPatching_unwrapper+0x50 (0x00007FFF7F17F640 [libj9jit29.so+0x57f640])
10:51:55  DLT stderr old_slow_jitCallCFunction+0x54 (0x00007FFF7F852744 [libj9jit29.so+0xc52744])
10:51:55  DLT stderr  (0x00007FFF7F8897A0 [libj9jit29.so+0xc897a0])
10:51:55  DLT stderr runJavaThread+0x240 (0x00007FFF84416C90 [libj9vm29.so+0x16c90])
10:51:55  DLT stderr javaProtectedThreadProc+0x148 (0x00007FFF844AF718 [libj9vm29.so+0xaf718])
10:51:55  DLT stderr omrsig_protect+0x3e4 (0x00007FFF84B69D34 [libj9prt29.so+0x39d34])
10:51:55  DLT stderr javaThreadProc+0x60 (0x00007FFF844AAD80 [libj9vm29.so+0xaad80])
10:51:55  DLT stderr thread_wrapper+0x190 (0x00007FFF843CCBC0 [libj9thr29.so+0xcbc0])
10:51:55  DLT stderr start_thread+0x170 (0x00007FFF84CA2C30 [libc.so.6+0xa2c30])
10:51:55  DLT stderr clone+0xa0 (0x00007FFF84D4C6C8 [libc.so.6+0x14c6c8])
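
As a sanity check on the report above, the faulting address can be tied back to the top backtrace frame by subtracting the module base from NIP. A minimal sketch in Python; the helper name `module_offset` is made up for illustration, and all numeric values are copied from the log:

```python
# Values copied from the crash report above.
NIP = 0x00007FFF7F79A3E4          # faulting instruction address
MODULE_BASE = 0x00007FFF7EC00000  # Module_base_address of libj9jit29.so
DAR = 0x0000000000000020          # data address reported for the fault


def module_offset(address: int, base: int) -> int:
    """Return the offset of an absolute address inside a loaded module."""
    if address < base:
        raise ValueError("address is below the module base")
    return address - base


offset = module_offset(NIP, MODULE_BASE)
print(f"libj9jit29.so+{offset:#x}")  # prints "libj9jit29.so+0xb9a3e4"
print(f"DAR={DAR:#x}")
```

The computed offset matches the `libj9jit29.so+0xb9a3e4` shown for the `replaceTrampoline` frame. DAR=0x20 is a very small address, which would be consistent with an access at a small offset from a null pointer, though the log alone does not confirm that.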
@pshipton pshipton added comp:jit test failure segfault Issues that describe segfaults / JVM crashes labels Sep 30, 2024

Issue Number: 20263
Status: Open
Recommended Components: comp:gc, comp:vm, comp:test
Recommended Assignees: linhu2016, chengjin01, pshipton

@pshipton
Member Author

@hzongaro pls take a look

@dmitripivkine
Contributor

from #20256

11:06:31  MLT stderr Module_base_address=0000000108948000 Symbol=_ZN3OMR9CodeCache17replaceTrampolineEP20TR_OpaqueMethodBlockPvS3_S3_b

I guess it is the same issue

@pshipton
Member Author

pshipton commented Sep 30, 2024

The grinder from #20260

https://openj9-jenkins.osuosl.org/job/Grinder_iteration_4/376 - ubu24-ppc64le-5

11:12:51  DLT stderr *** Invalid JIT return address 00007CBA72B02554 in 00007CBA70F6A390
11:12:51  DLT stderr 
11:12:51  DLT stderr 15:12:48.389 0x7cbb0885cd00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK17_ppc64le_linux_Release/openj9/runtime/vm/swalk.c:1633: ((0 ))
11:34:04  DLT stderr *** Invalid JIT return address 0000729400EAF310 in 000072949885DF00
11:34:04  DLT stderr 
11:34:04  DLT stderr 15:34:02.989 0x72949885dc00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK17_ppc64le_linux_Release/openj9/runtime/vm/swalk.c:1633: ((0 ))
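
Since the same assertion shows up in several grinders, it can help to pull the addresses out of the console logs mechanically. A rough sketch in Python, assuming the message format seen above; the function name `parse_invalid_return` is hypothetical, and what the second address represents is not stated in these logs, so it is simply called `context` here:

```python
import re

# Matches the "*** Invalid JIT return address <hex> in <hex>" console line.
PATTERN = re.compile(
    r"\*\*\* Invalid JIT return address ([0-9A-Fa-f]+) in ([0-9A-Fa-f]+)"
)


def parse_invalid_return(line: str):
    """Return (return_address, context) as ints, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


# Sample lines copied from the grinder output above.
SAMPLE = [
    "11:12:51  DLT stderr *** Invalid JIT return address 00007CBA72B02554 in 00007CBA70F6A390",
    "11:12:51  DLT stderr ",
    "11:34:04  DLT stderr *** Invalid JIT return address 0000729400EAF310 in 000072949885DF00",
]

for line in SAMPLE:
    parsed = parse_invalid_return(line)
    if parsed is not None:
        return_address, context = parsed
        print(f"return address {return_address:#x}, context {context:#x}")
```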

@pshipton
Member Author

The grinder from #20261

https://openj9-jenkins.osuosl.org/job/Grinder_iteration_0/539 - ubu22-ppc64le-3

11:13:41  DLT stderr *** Invalid JIT return address 00007DCE468AC408 in 00007DCE0CE5AE60
11:13:41  DLT stderr 
11:13:41  DLT stderr 15:13:37.824 0x7dcee040d100    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK17_ppc64le_linux_Release/openj9/runtime/vm/swalk.c:1633: ((0 ))
12:18:39  DLT stderr *** Invalid JIT return address 00007F78D43810D0 in 00007F78E0923100
12:18:39  DLT stderr 
12:18:39  DLT stderr 16:18:37.463 0x7f78e0922e00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK17_ppc64le_linux_Release/openj9/runtime/vm/swalk.c:1633: ((0 ))

@pshipton
Member Author

The grinder from #20261

https://openj9-jenkins.osuosl.org/job/Grinder_iteration_1/540 - cent9-ppc64le-2

11:20:04  DLT stderr Type=Segmentation error vmState=0x00000000
11:20:04  DLT stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
11:20:04  DLT stderr Handler1=00007FFF97842290 Handler2=00007FFF9C178880
11:20:04  DLT stderr R0=00007FFF97094A40 R1=00007FFE7939B560 R2=00007FFF97AE6C00 R3=00007FFE00001B98
11:20:04  DLT stderr R4=00007FFE00001B98 R5=0000000000000010 R6=000000000000000E R7=00007FFE7939B680
11:20:04  DLT stderr R8=00007FFF973D8280 R9=00007FFF98414A70 R10=0000000000000000 R11=00007FFDEC005438
11:20:04  DLT stderr R12=0000000080084444 R13=00007FFE793A68E0 R14=00007FFDEC008C10 R15=00007FFF98943200
11:20:04  DLT stderr R16=0000000000000000 R17=FFFFFFFFFFFFFFFF R18=0000000000000000 R19=00007FFF95A49808
11:20:04  DLT stderr R20=00007FFF95A4F928 R21=0000000000000001 R22=00007FFF943B1C18 R23=000000000000206C
11:20:04  DLT stderr R24=00007FFF87B13258 R25=00007FFDEC01A000 R26=0000000000000000 R27=0000000000000000
11:20:04  DLT stderr R28=0000000000000001 R29=00007FFF87B13258 R30=00007FFE00001B98 R31=0000000000000012
11:20:04  DLT stderr NIP=00007FFE7A415E74 MSR=800000000280D033 ORIG_GPR3=00007FFF979C734C CTR=00007FFE7A415E0C
11:20:04  DLT stderr LINK=00007FFF97094A40 XER=0000000020040000 CCR=0000000020084444 SOFTE=0000000000000001
11:20:04  DLT stderr TRAP=0000000000000300 DAR=00007FFE00001B98 dsisr=0000000040000000 RESULT=0000000000000000
11:20:04  DLT stderr FPR0=00007ffdec008dd0 (f: 3959459328.000000, d: 6.952915e-310)
11:20:04  DLT stderr FPR1=4028000000000000 (f: 0.000000, d: 1.200000e+01)
11:20:04  DLT stderr FPR2=41d0000000000000 (f: 0.000000, d: 1.073742e+09)
11:20:04  DLT stderr FPR3=4030000000000000 (f: 0.000000, d: 1.600000e+01)
11:20:04  DLT stderr FPR4=0000000000000000 (f: 0.000000, d: 0.000000e+00)
11:20:04  DLT stderr FPR5=3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
11:20:04  DLT stderr FPR6=3fbc5e53aa362eb4 (f: 2855677696.000000, d: 1.108143e-01)
11:20:04  DLT stderr FPR7=bff0000000000000 (f: 0.000000, d: -1.000000e+00)
11:20:04  DLT stderr FPR8=654c657461726170 (f: 1634886016.000000, d: 9.205540e+179)
11:20:04  DLT stderr FPR9=36657a6953746573 (f: 1400137088.000000, d: 1.175677e-46)
11:20:04  DLT stderr FPR10=0000000000000000 (f: 0.000000, d: 0.000000e+00)
11:20:04  DLT stderr FPR11=00000000000064bb (f: 25787.000000, d: 1.274047e-319)
11:20:04  DLT stderr FPR12=00007ffdec0098d9 (f: 3959462144.000000, d: 6.952915e-310)
11:20:04  DLT stderr FPR13=4000000000000000 (f: 0.000000, d: 2.000000e+00)
11:20:04  DLT stderr FPR14=4016d11b38b1a849 (f: 951167040.000000, d: 5.704205e+00)
11:20:04  DLT stderr FPR15=3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
11:20:04  DLT stderr FPR16=4085988abac120e8 (f: 3133219072.000000, d: 6.910677e+02)
11:20:04  DLT stderr FPR17=40026bb1bbb55516 (f: 3149223168.000000, d: 2.302585e+00)
11:20:04  DLT stderr FPR18=408f280000000000 (f: 0.000000, d: 9.970000e+02)
11:20:04  DLT stderr FPR19=0000000600000006 (f: 6.000000, d: 1.273197e-313)
11:20:04  DLT stderr FPR20=4018000000000000 (f: 0.000000, d: 6.000000e+00)
11:20:04  DLT stderr FPR21=4014000000000000 (f: 0.000000, d: 5.000000e+00)
11:20:04  DLT stderr FPR22=3ff0000000000000 (f: 0.000000, d: 1.000000e+00)
11:20:04  DLT stderr FPR23=4015ac95bab3693e (f: 3132320000.000000, d: 5.418540e+00)
11:20:04  DLT stderr FPR24=4028f40b5ed9812d (f: 1591312640.000000, d: 1.247665e+01)
11:20:04  DLT stderr FPR25=40026bb1bbb55516 (f: 3149223168.000000, d: 2.302585e+00)
11:20:04  DLT stderr FPR26=4032000000000000 (f: 0.000000, d: 1.800000e+01)
11:20:04  DLT stderr FPR27=3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
11:20:04  DLT stderr FPR28=bfd218cce1e443c8 (f: 3789833216.000000, d: -2.827637e-01)
11:20:04  DLT stderr FPR29=4000000000000000 (f: 0.000000, d: 2.000000e+00)
11:20:04  DLT stderr FPR30=3f88f58b1fbbabd6 (f: 532392928.000000, d: 1.218709e-02)
11:20:04  DLT stderr FPR31=3fc54eae586af9ec (f: 1483405824.000000, d: 1.664637e-01)
11:20:04  DLT stderr 
11:20:04  DLT stderr Compiled_method=org/junit/runners/model/FrameworkMethod.isShadowedBy(Lorg/junit/runners/model/FrameworkMethod;)Z
11:20:04  DLT stderr Target=2_90_20240928_112 (Linux 5.14.0-410.el9.ppc64le)
11:20:04  DLT stderr CPU=ppc64le (8 logical CPUs) (0x3f9720000 RAM)
11:20:04  DLT stderr ----------- Stack Backtrace -----------
11:20:04  DLT stderr  (0x00007FFE7A415E74 [<unknown>+0x0])
11:20:04  DLT stderr runJavaThread+0x240 (0x00007FFF97817A40 [libj9vm29.so+0x17a40])
11:20:04  DLT stderr javaProtectedThreadProc+0x148 (0x00007FFF978B8318 [libj9vm29.so+0xb8318])
11:20:04  DLT stderr omrsig_protect+0x3e4 (0x00007FFF9C179D34 [libj9prt29.so+0x39d34])
11:20:04  DLT stderr javaThreadProc+0x60 (0x00007FFF978B3920 [libj9vm29.so+0xb3920])
11:20:04  DLT stderr thread_wrapper+0x190 (0x00007FFF9C10CBC0 [libj9thr29.so+0xcbc0])
11:20:04  DLT stderr start_thread+0x170 (0x00007FFF9C4A2C70 [libc.so.6+0xa2c70])
11:20:04  DLT stderr clone+0xa0 (0x00007FFF9C54C748 [libc.so.6+0x14c748])
11:20:04  DLT stderr ---------------------------------------

https://openj9-jenkins.osuosl.org/job/Grinder_iteration_4/377

11:44:50  DLT stderr Type=Illegal instruction vmState=0x00000000
11:44:50  DLT stderr J9Generic_Signal_Number=00000048 Signal_Number=00000004 Error_Value=00000000 Signal_Code=00000001
11:44:50  DLT stderr Handler1=000062B47B042290 Handler2=000062B47B368880
11:44:50  DLT stderr R0=000062B474371F00 R1=000062B3DC6EB5D0 R2=000062B46F595E20 R3=000062B46F595E20
11:44:50  DLT stderr R4=000062B46F59B5D8 R5=000062B3F9AD3B10 R6=000062B3F9600000 R7=0000000000EC0000
11:44:50  DLT stderr R8=0000000000000008 R9=000000000007D000 R10=0000000000000000 R11=000062B368009F00
11:44:50  DLT stderr R12=000062B47B03BE50 R13=000062B3DC6F68E0 R14=000062B368008890 R15=000062B474932200
11:44:50  DLT stderr R16=000062B46F4A82A8 R17=000062B46F31AC58 R18=000062B46F31AC58 R19=000062B46F31B3E8
11:44:50  DLT stderr R20=000062B46F31B3E8 R21=000062B46F59B528 R22=000062B46F59B558 R23=000062B46F59B5D8
11:44:50  DLT stderr R24=000062B46F59B5D8 R25=000062B3F9A13F10 R26=0000000000000001 R27=000062B46F59B558
11:44:50  DLT stderr R28=000062B3F9AD3B10 R29=000062B3F9AD3B10 R30=0000000000000001 R31=000062B368009F00
11:44:50  DLT stderr NIP=000062B3DEC79858 MSR=800000000288C033 ORIG_GPR3=000062B3DEA5F304 CTR=000062B47B03BE50
11:44:50  DLT stderr LINK=000062B3DEA5F308 XER=0000000000000037 CCR=0000000024084222 SOFTE=0000000000000001
11:44:50  DLT stderr TRAP=0000000000000700 DAR=000062B399DF0000 dsisr=0000000042000000 RESULT=0000000000000000
11:44:50  DLT stderr FPR0=000062b3f9660a50 (f: 4184214016.000000, d: 5.361853e-310)
11:44:50  DLT stderr FPR1=4042d4f920000000 (f: 536870912.000000, d: 3.766385e+01)
11:44:50  DLT stderr FPR2=408e000000000000 (f: 0.000000, d: 9.600000e+02)
11:44:50  DLT stderr FPR3=4030000000000000 (f: 0.000000, d: 1.600000e+01)
11:44:50  DLT stderr FPR4=0000000000000000 (f: 0.000000, d: 0.000000e+00)
11:44:50  DLT stderr FPR5=3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
11:44:50  DLT stderr FPR6=bfb1973c5a611ccc (f: 1516313856.000000, d: -6.871392e-02)
11:44:50  DLT stderr FPR7=bff0000000000000 (f: 0.000000, d: -1.000000e+00)
11:44:50  DLT stderr FPR8=bfdffffef20a4123 (f: 4060758272.000000, d: -4.999997e-01)
11:44:50  DLT stderr FPR9=3fd5575b0be00b6a (f: 199232368.000000, d: 3.334568e-01)
11:44:50  DLT stderr FPR10=0000000000000000 (f: 0.000000, d: 0.000000e+00)
11:44:50  DLT stderr FPR11=41cdcd6500000000 (f: 0.000000, d: 1.000000e+09)
11:44:50  DLT stderr FPR12=000062b368009511 (f: 1744868608.000000, d: 5.361733e-310)
11:44:50  DLT stderr FPR13=c1b6977ae0000000 (f: 3758096384.000000, d: -3.790261e+08)
11:44:50  DLT stderr FPR14=0000000000000000 (f: 0.000000, d: 0.000000e+00)
11:44:50  DLT stderr FPR15=0000000000000000 (f: 0.000000, d: 0.000000e+00)
11:44:50  DLT stderr FPR16=0000000100000001 (f: 1.000000, d: 2.121996e-314)
11:44:50  DLT stderr FPR17=36a0000000000000 (f: 0.000000, d: 1.401298e-45)
11:44:50  DLT stderr FPR18=7f7fffff7f7fffff (f: 2139095040.000000, d: 1.404447e+306)
11:44:50  DLT stderr FPR19=47efffffe0000000 (f: 3758096384.000000, d: 3.402823e+38)
11:44:50  DLT stderr FPR20=cb095440cb095440 (f: 3406386176.000000, d: -3.032559e+53)
11:44:50  DLT stderr FPR21=4b0954404b095440 (f: 1258902528.000000, d: 3.032558e+53)
11:44:50  DLT stderr FPR22=0000000000000000 (f: 0.000000, d: 0.000000e+00)
11:44:50  DLT stderr FPR23=0000000000000000 (f: 0.000000, d: 0.000000e+00)
11:44:50  DLT stderr FPR24=0000000000000000 (f: 0.000000, d: 0.000000e+00)
11:44:50  DLT stderr FPR25=0000000000000000 (f: 0.000000, d: 0.000000e+00)
11:44:50  DLT stderr FPR26=0000000100000001 (f: 1.000000, d: 2.121996e-314)
11:44:50  DLT stderr FPR27=36a0000000000000 (f: 0.000000, d: 1.401298e-45)
11:44:50  DLT stderr FPR28=7f7fffff7f7fffff (f: 2139095040.000000, d: 1.404447e+306)
11:44:50  DLT stderr FPR29=47efffffe0000000 (f: 3758096384.000000, d: 3.402823e+38)
11:44:50  DLT stderr FPR30=cb095440cb095440 (f: 3406386176.000000, d: -3.032559e+53)
11:44:50  DLT stderr FPR31=c1612a8800000000 (f: 0.000000, d: -9.000000e+06)
11:44:50  DLT stderr 
11:44:50  DLT stderr Compiled_method=unknown (In JIT code segment 000062B4740E8AD8 but no method found)
11:44:50  DLT stderr Target=2_90_20240928_112 (Linux 6.8.0-45-generic)
11:44:50  DLT stderr CPU=ppc64le (4 logical CPUs) (0x1ea0e0000 RAM)
11:44:50  DLT stderr ----------- Stack Backtrace -----------
11:44:50  DLT stderr  (0x000062B3DEC79858 [<unknown>+0x0])
11:44:50  DLT stderr runJavaThread+0x240 (0x000062B47B017A40 [libj9vm29.so+0x17a40])
11:44:50  DLT stderr javaProtectedThreadProc+0x148 (0x000062B47B0B8318 [libj9vm29.so+0xb8318])
11:44:50  DLT stderr omrsig_protect+0x3e4 (0x000062B47B369D34 [libj9prt29.so+0x39d34])
11:44:50  DLT stderr javaThreadProc+0x60 (0x000062B47B0B3920 [libj9vm29.so+0xb3920])
11:44:50  DLT stderr thread_wrapper+0x190 (0x000062B47B77CBC0 [libj9thr29.so+0xcbc0])
11:44:50  DLT stderr  (0x000062B47BAB2A5C [libc.so.6+0xb2a5c])
11:44:50  DLT stderr ---------------------------------------

@pshipton
Member Author

pshipton commented Oct 1, 2024

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.system_ppc64le_linux_Nightly_testList_1/618 - ubu24-ppc64le-1
DaaLoadTest_daa2_5m_0

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk11_j9_sanity.system_ppc64le_linux_Nightly_testList_1/618/system_test_output.tar.gz

23:26:36  DLT 03:26:35.476 - Completed 66.7%. Number of tests started=22822 (+2808)
23:26:39  DLT stderr 
23:26:39  DLT stderr 
23:26:39  DLT stderr *** Invalid JIT return address 000066DA34D8BC88 in 000066DA2C5FA218
23:26:39  DLT stderr 
23:26:39  DLT stderr 03:26:38.426 0x66dacc9cbb00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK11_ppc64le_linux_Nightly/openj9/runtime/vm/swalk.c:1633: ((0 ))

@vij-singh

Are all these on POWER?

@vij-singh

@zl-wang FYI... can you (or someone on P) have a look?

@zl-wang
Contributor

zl-wang commented Oct 1, 2024

interesting ... always DAA tests on ppc64le uncompressedRefs.

@zl-wang
Contributor

zl-wang commented Oct 1, 2024

@IBMJimmyk have another look at this seemingly recurring issue.

@dmitripivkine
Contributor

Just FYI, the crashing symbol in #20256 is _ZN3OMR9CodeCache17replaceTrampolineEP20TR_OpaqueMethodBlockPvS3_S3_b, similar to _ZN3OMR9CodeCache17replaceTrampolineEP20TR_OpaqueMethodBlockPvS3_S3 in this case.
But the #20256 failure is on Arm Mac, not PPC.

@pshipton
Member Author

pshipton commented Oct 3, 2024

Not sure this belongs to this issue, but here is another DaaLoadTest crash on ppc64le, in the head stream. All the original failures in this issue were from the 0.48 stream. They all occur on ppc64le.

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_extended.system_ppc64le_linux_Nightly_testList_0/623 - ubu22-ppc64le-5
DaaLoadTest_all_5m_0 -Xjit -Xgcpolicy:gencon -Xnocompressedrefs

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk11_j9_extended.system_ppc64le_linux_Nightly_testList_0/623/system_test_output.tar.gz

22:10:59  DLT 02:10:55.125 - Completed 40.0%. Number of tests started=7223 (+1523)
22:10:59  DLT stderr Unhandled exception
22:10:59  DLT stderr Type=Segmentation error vmState=0x00000000
22:10:59  DLT stderr J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
22:10:59  DLT stderr Handler1=00007DBCDD310520 Handler2=00007DBCDD238820
22:10:59  DLT stderr R0=00007DBCD8266068 R1=00007DBC2CADB560 R2=0008200000000000 R3=00007DBBCC06CA00
22:10:59  DLT stderr R4=00007DBCCECA0820 R5=00007DBCCECA1048 R6=00007DBC583D0000 R7=00000000016F0000
22:10:59  DLT stderr R8=00007DBCCECA1048 R9=00007DBBD82E37E8 R10=00007DBBCC007F60 R11=00007DBBCC06CA00
22:10:59  DLT stderr R12=00007DBBCC06CD08 R13=00007DBC2CAE68E0 R14=00007DBBCC007DE0 R15=00007DBCD89C9500
22:10:59  DLT stderr R16=00007DBCCECA0BD8 R17=00007DBCCECA0820 R18=00007DBCCECA0FA8 R19=00007DBCCECA1020
22:10:59  DLT stderr R20=00007DBCCECA1010 R21=00007DBCCECA1020 R22=00007DBCCEC4D478 R23=0000000000000001
22:10:59  DLT stderr R24=00007DBCCECA1048 R25=00007DBCCECA0B98 R26=00007DBCCECA0820 R27=00007DBCCEC4BD78
22:10:59  DLT stderr R28=00007DBC58635618 R29=00007DBC590428E0 R30=0000000000000001 R31=00007DBBCC06CA00
22:10:59  DLT stderr NIP=00007DBC3D8E1F28 MSR=800000000280D033 ORIG_GPR3=00007DBC3D8E21F0 CTR=00007DBC3D9C6B80
22:10:59  DLT stderr LINK=00007DBC3D9C6D78 XER=0000000000000004 CCR=0000000040084824 SOFTE=0000000000000001
22:10:59  DLT stderr TRAP=0000000000000300 DAR=0008200000000034 dsisr=0000000040000000 RESULT=0000000000000000
22:10:59  DLT stderr FPR0=00007dbcd89782e8 (f: 3633808128.000000, d: 6.830460e-310)
22:10:59  DLT stderr FPR1=0000000000000001 (f: 1.000000, d: 4.940656e-324)
22:10:59  DLT stderr FPR2=0000000000000001 (f: 1.000000, d: 4.940656e-324)
22:10:59  DLT stderr FPR3=00000891aff124bb (f: 2951816448.000000, d: 4.654995e-311)
22:10:59  DLT stderr FPR4=3fe8000000000000 (f: 0.000000, d: 7.500000e-01)
22:10:59  DLT stderr FPR5=3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
22:10:59  DLT stderr FPR6=3faaa5aa5df25984 (f: 1576163712.000000, d: 5.204518e-02)
22:10:59  DLT stderr FPR7=bff0000000000000 (f: 0.000000, d: -1.000000e+00)
22:10:59  DLT stderr FPR8=6942424f49417473 (f: 1229026432.000000, d: 1.091905e+199)
22:10:59  DLT stderr FPR9=65706f2e74656e28 (f: 1952804352.000000, d: 4.262150e+180)
22:10:59  DLT stderr FPR10=0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:10:59  DLT stderr FPR11=0000000000000000 (f: 0.000000, d: 0.000000e+00)
22:10:59  DLT stderr FPR12=00007dbbcc008939 (f: 3422587136.000000, d: 6.830237e-310)
22:10:59  DLT stderr FPR13=00800e7913bfd983 (f: 331340160.000000, d: 2.858158e-306)
22:10:59  DLT stderr FPR14=3f11562ac78a4541 (f: 3347727616.000000, d: 6.613384e-05)
22:10:59  DLT stderr FPR15=be2ff3cdcc756fce (f: 3430248448.000000, d: -3.719744e-09)
22:10:59  DLT stderr FPR16=3f11566aaf25de2c (f: 2938494464.000000, d: 6.613756e-05)
22:10:59  DLT stderr FPR17=bebbbcdb6774cecb (f: 1735708416.000000, d: -1.653297e-06)
22:10:59  DLT stderr FPR18=3dd9979767497dad (f: 1732869504.000000, d: 9.310371e-11)
22:10:59  DLT stderr FPR19=bebbbd41c5d26bf1 (f: 3318901760.000000, d: -1.653390e-06)
22:10:59  DLT stderr FPR20=3e66376972bea4d0 (f: 1925096704.000000, d: 4.138137e-08)
22:10:59  DLT stderr FPR21=be63c064abe8bbf2 (f: 2884156416.000000, d: -3.679010e-08)
22:10:59  DLT stderr FPR22=be63c064abf00000 (f: 2884632576.000000, d: -3.679010e-08)
22:10:59  DLT stderr FPR23=bfa84923ecc28000 (f: 3972169728.000000, d: -4.743302e-02)
22:10:59  DLT stderr FPR24=000f4240000f4240 (f: 1000000.000000, d: 2.121996e-308)
22:10:59  DLT stderr FPR25=3ff0000000000000 (f: 0.000000, d: 1.000000e+00)
22:10:59  DLT stderr FPR26=3fa7b80000000000 (f: 0.000000, d: 4.632568e-02)
22:10:59  DLT stderr FPR27=bf5224a5191957e0 (f: 421091296.000000, d: -1.107370e-03)
22:10:59  DLT stderr FPR28=bfa8492528c8cabf (f: 684247744.000000, d: -4.743305e-02)
22:10:59  DLT stderr FPR29=bf5224a5191957d2 (f: 421091296.000000, d: -1.107370e-03)
22:10:59  DLT stderr FPR30=3c4baf4e740749c8 (f: 1946634752.000000, d: 3.001591e-18)
22:10:59  DLT stderr FPR31=3c4d103800000000 (f: 0.000000, d: 3.151055e-18)
22:10:59  DLT stderr 
22:10:59  DLT stderr Compiled_method=java/util/HashMap$EntryIterator.next()Ljava/lang/Object;
22:10:59  DLT stderr Target=2_90_20241003_815 (Linux 5.15.0-122-generic)
22:10:59  DLT stderr CPU=ppc64le (4 logical CPUs) (0x1fb170000 RAM)
22:10:59  DLT stderr ----------- Stack Backtrace -----------
22:10:59  DLT stderr  (0x00007DBC3D8E1F28 [<unknown>+0x0])
22:10:59  DLT stderr runJavaThread+0x240 (0x00007DBCDD2E6830 [libj9vm29.so+0x16830])
22:10:59  DLT stderr javaProtectedThreadProc+0x148 (0x00007DBCDD37F3D8 [libj9vm29.so+0xaf3d8])
22:10:59  DLT stderr omrsig_protect+0x3e4 (0x00007DBCDD239CD4 [libj9prt29.so+0x39cd4])
22:10:59  DLT stderr javaThreadProc+0x60 (0x00007DBCDD37AA40 [libj9vm29.so+0xaaa40])
22:10:59  DLT stderr thread_wrapper+0x190 (0x00007DBCDD1CCBC0 [libj9thr29.so+0xcbc0])
22:10:59  DLT stderr  (0x00007DBCDDC45804 [libc.so.6+0xb5804])
22:10:59  DLT stderr ---------------------------------------

@pshipton pshipton changed the title DaaLoadTest vmState=0x00000000 _ZN3OMR9CodeCache17replaceTrampolineEP20TR_OpaqueMethodBlockPvS3_S3_b+0xe4 DaaLoadTest ppc64le vmState=0x00000000 _ZN3OMR9CodeCache17replaceTrampolineEP20TR_OpaqueMethodBlockPvS3_S3_b+0xe4 Oct 3, 2024
@pshipton
Member Author

pshipton commented Oct 3, 2024

https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_sanity.system_ppc64le_linux_Nightly_testList_1/677 - ubu22-ppc64le-5
DaaLoadTest_daa2_5m_0 -Xjit -Xgcpolicy:gencon -Xnocompressedrefs

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk17_j9_sanity.system_ppc64le_linux_Nightly_testList_1/677/system_test_output.tar.gz

20:21:26  DLT 00:21:25.982 - Completed 40.0%. Number of tests started=12683 (+2848)
20:21:48  DLT stderr 
20:21:48  DLT stderr 
20:21:48  DLT stderr *** Invalid JIT return address 000071344DCF9E34 in 00007133EFE0A218
20:21:48  DLT stderr 
20:21:48  DLT stderr 00:21:44.243 0x7134e8803f00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK17_ppc64le_linux_Nightly/openj9/runtime/vm/swalk.c:1633: ((0 ))

@pshipton
Member Author

pshipton commented Oct 7, 2024

https://openj9-jenkins.osuosl.org/job/Test_openjdk8_j9_special.system_ppc64le_linux_Personal_testList_0/109 - cent9-ppc64le-4
DaaLoadTest_daa2_special_5m_8 -Xgcpolicy:gencon -Xshareclasses -Xjit -Xnocompressedrefs

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk8_j9_special.system_ppc64le_linux_Personal_testList_0/109/system_test_output.tar.gz

10:33:35  DLT 14:33:33.189 - Completed 60.0%. Number of tests started=57465 (+7021)
10:33:42  DLT stderr 
10:33:42  DLT stderr 
10:33:42  DLT stderr *** Invalid JIT return address 00007FFE71F85680 in 00007FFF84851F00
10:33:42  DLT stderr 
10:33:42  DLT stderr 14:33:40.026 0x7fff84851c00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK8_ppc64le_linux_Personal/openj9/runtime/vm/swalk.c:1633: ((0 ))

https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_special.system_ppc64le_linux_Personal_testList_0/113 - ubu22-ppc64le-4
DaaLoadTest_all_special_5m_12 -Xjit -Xgcpolicy:balanced -Xnocompressedrefs

https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Test_openjdk11_j9_special.system_ppc64le_linux_Personal_testList_0/113/system_test_output.tar.gz

11:22:23  DLT 15:22:21.780 - Completed 46.7%. Number of tests started=5519 (+893)
11:22:24  DLT stderr 
11:22:24  DLT stderr 
11:22:24  DLT stderr *** Invalid JIT return address 0000731C3A3A8294 in 0000731CD8A5E000
11:22:24  DLT stderr 
11:22:24  DLT stderr 15:22:24.204 0x731cd8a5dd00    j9vm.249    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK11_ppc64le_linux_Personal/openj9/runtime/vm/swalk.c:1633: ((0 ))

@rmnattas
Contributor

rmnattas commented Oct 7, 2024

Two of the cores here tell the same story:
The failing stack walk is walking a thread that’s executing an “invalidated” body.

There’s no metadata in the AVLTree for the method at the failing return address, but the body it’s on consists of valid instructions and lies within the code cache.

0x7ffe71f85670      0000e563 ori       r5, r31, 0
0x7ffe71f85674      00001004 
0x7ffe71f85678      740280e5 pld       r12, 0x274(0), 1
0x7ffe71f8567c      45960548 bl        0x7ffe71fdecc0
0x7ffe71f85680      00fdff4b b         0x7ffe71f85380      // <-- return address
0x7ffe71f85684      0000603c lis       r3, 0 CONST 0x7ffe71f852f4 Ptr 
0x7ffe71f85688      fe7f6360 ori       r3, r3, 0x7ffe
0x7ffe71f8568c      c6076378 sldi      r3, r3, 0x20

Taking the next return address from the stack pointer, we find that the caller is org/junit/runners/BlockJUnit4ClassRunner$1.runReflectiveCall (specifically the inlined body of DelegatingConstructorAccessorImpl.newInstance), which calls the abstract method ConstructorAccessorImpl.newInstance.

0x7ffe71ee090c {org/.../BlockJUnit4ClassRunner$1.runReflectiveCall} +100          |||  0000e2eb ld        r31, 0(r2)
0x7ffe71ee0910 {org/.../BlockJUnit4ClassRunner$1.runReflectiveCall} +101          |||  e405ff7b rldicr    r31, r31, 0, 0x37
0x7ffe71ee0914 {org/.../BlockJUnit4ClassRunner$1.runReflectiveCall} +102          |||  00001004 
0x7ffe71ee0918 {org/.../BlockJUnit4ClassRunner$1.runReflectiveCall} +103          |||  7c0560e5 pld       r11, 0x57c(0), 1 
0x7ffe71ee091c {org/.../BlockJUnit4ClassRunner$1.runReflectiveCall} +104          |||  40583f7c cmpld     r31, r11
0x7ffe71ee0920 {org/.../BlockJUnit4ClassRunner$1.runReflectiveCall} +105          |||* 0c008240 bne       0x7ffe71ee092c C>> +108 
0x7ffe71ee0924 {org/.../BlockJUnit4ClassRunner$1.runReflectiveCall} +106   7:5    |||| dd490a48 bl        0x7ffe71f85300 // invokevirtual 4 {sun/.../ConstructorAccessorImpl.newInstance([Ljava/lang/Object;)Ljava/lang/Object;}
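
The branch targets in the two listings above can be cross-checked by decoding the raw bytes by hand. The sketch below assumes the byte column is printed in memory order (little-endian on ppc64le) and handles only the I-form branch (primary opcode 18); `decode_branch` is a name invented for this example, not a KCA or OpenJ9 function:

```python
def decode_branch(address: int, le_bytes: str):
    """Decode a Power I-form branch given its address and its bytes in memory order.

    Returns (mnemonic, target). Raises ValueError for any other instruction.
    """
    word = int.from_bytes(bytes.fromhex(le_bytes), "little")
    if word >> 26 != 18:
        raise ValueError("not an I-form branch (primary opcode 18)")
    # LI field with the two implied low zero bits, sign-extended from 26 bits.
    displacement = word & 0x03FFFFFC
    if displacement & 0x02000000:
        displacement -= 0x04000000
    absolute = bool(word & 0x2)  # AA bit
    link = bool(word & 0x1)      # LK bit
    target = displacement if absolute else address + displacement
    mnemonic = "b" + ("l" if link else "") + ("a" if absolute else "")
    return mnemonic, target


# (address, bytes) pairs copied from the listings above.
for address, raw in [
    (0x7FFE71F8567C, "45960548"),
    (0x7FFE71F85680, "00fdff4b"),
    (0x7FFE71EE0924, "dd490a48"),
]:
    mnemonic, target = decode_branch(address, raw)
    print(f"{address:#x}: {mnemonic} {target:#x}")
```

Under these assumptions the decoded targets agree with the disassembly: the caller's `bl` at 0x7ffe71ee0924 goes to 0x7ffe71f85300, and the instruction at the failing return address branches back to 0x7ffe71f85380.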
 

The call is de-virtualized in the caller body, with the class guard checking against the J9Class sun/reflect/GeneratedConstructorAccessor182. The method GeneratedConstructorAccessor182.newInstance, which should have been the callee, does not exist in the AVLTree.

(kca) j9m 0x00007ffdf802bbd8
Method   {ClassPath/Name.MethodName}: {sun/reflect/GeneratedConstructorAccessor182.newInstance}
                           Signature: ([Ljava/lang/Object;)Ljava/lang/Object;
                              Access: Public 
                    J9Class/J9Method: 0x00007ffdf802b900 / 0x00007ffdf802bbd8
               Compiled Method Start: Not Compiled! (count=450)
                      ByteCode Start: 0x00007ffe39aa27e0 (49 bytes)
                   ROM Constant Pool: 0x00007ffe39aa2710 (15 entries)
                       Constant Pool: 0x00007ffdf802bac0 (15 entries)
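
Read as ordinary code, the guarded sequence in the caller amounts to a class-pointer compare in front of a direct call. The C++ sketch below is only a model of that check, not the JIT's implementation: the function names are hypothetical, and it assumes that `ld r31, 0(r2)` loads the receiver's class word and that `rldicr r31, r31, 0, 0x37` clears its low 8 bits before the compare.

```cpp
#include <cstdint>

// Hypothetical model of the class guard in runReflectiveCall; names are illustrative only.
constexpr std::uint64_t kLowFlagBits = 0xFF; // bits cleared by rldicr r31, r31, 0, 0x37

// Mirrors: ld r31, 0(r2) ; rldicr r31, r31, 0, 0x37
constexpr std::uint64_t classFromHeader(std::uint64_t headerWord)
   {
   return headerWord & ~kLowFlagBits;
   }

// Mirrors: pld r11, <class constant> ; cmpld r31, r11 ; bne <slow path>
// When this returns true, the direct bl to the devirtualized callee is taken.
constexpr bool guardPasses(std::uint64_t headerWord, std::uint64_t guardConstant)
   {
   return classFromHeader(headerWord) == guardConstant;
   }

// The guard constant is the J9Class of GeneratedConstructorAccessor182 shown above.
static_assert(guardPasses(0x00007ffdf802b900ULL, 0x00007ffdf802b900ULL),
              "a receiver of the guarded class takes the direct call");
// A constant with its low bit set can never equal a masked class pointer.
static_assert(!guardPasses(0x00007ffdf802b900ULL, 0x00007ffdf802b901ULL),
              "a tagged constant always fails the guard");
```

The second assertion shows why tagging the low bit of the constant is enough to disable a guard: the masked class pointer always has that bit clear.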

Following the branch target from the caller, the call lands around 220 instructions before the failing return address. The entry to the body is intact and valid for execution, but it doesn’t have a JITW eye-catcher and seems like it should have been invalidated.

0x7ffe71f852ec      00000000 invalid instruction
0x7ffe71f852f0      15000c00 invalid instruction         // <-- jittedBodyInfo = SamplingPrologue | HasBeenRecompiled
0x7ffe71f852f4      10006ee8 ld        r3, 0x10(r14)
0x7ffe71f852f8      08008ee8 ld        r4, 8(r14)
0x7ffe71f852fc      0000aee8 ld        r5, 0(r14)
0x7ffe71f85300      50006fe9 ld        r11, 0x50(r15)    // <-- JIT entry point
0x7ffe71f85304      c0ffce39 addi      r14, r14, -0x40
0x7ffe71f85308      a602087c mflr      r0
0x7ffe71f8530c      40582e7c cmpld     r14, r11
0x7ffe71f85310      38000ef8 std       r0, 0x38(r14)
0x7ffe71f85314      a805c140 ble-      0x7ffe71f858bc
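
The branch target can be cross-checked by decoding the caller's bl word by hand. The sketch below assumes the listing prints instruction bytes in memory order (so dd490a48 is the little-endian encoding of 0x480a49dd) and uses the standard I-form branch layout; the helper names are hypothetical.

```cpp
#include <cstdint>

// Reassemble a 32-bit instruction word from its four bytes in little-endian memory order.
constexpr std::uint32_t loadLE32(std::uint8_t b0, std::uint8_t b1, std::uint8_t b2, std::uint8_t b3)
   {
   return std::uint32_t(b0) | (std::uint32_t(b1) << 8) | (std::uint32_t(b2) << 16) | (std::uint32_t(b3) << 24);
   }

// True for an I-form branch (primary opcode 18) with AA=0 and LK=1, i.e. a relative bl.
constexpr bool isRelativeBl(std::uint32_t instr)
   {
   return (instr >> 26) == 18u && (instr & 0x3u) == 0x1u;
   }

// Target of a relative I-form branch: instruction address + sign-extended (LI || 0b00).
constexpr std::uint64_t relativeBranchTarget(std::uint64_t instrAddress, std::uint32_t instr)
   {
   // 0x03FFFFFC selects LI || 0b00; XOR-then-subtract of the field's sign bit sign-extends it.
   return instrAddress + (std::uint64_t(instr & 0x03FFFFFCu) ^ 0x02000000u) - 0x02000000u;
   }

// Bytes "dd490a48" at 0x7ffe71ee0924 in the caller listing.
constexpr std::uint32_t blWord = loadLE32(0xdd, 0x49, 0x0a, 0x48);
static_assert(blWord == 0x480a49ddu, "little-endian reassembly of the bl word");
static_assert(isRelativeBl(blWord), "opcode 18, AA=0, LK=1");
static_assert(relativeBranchTarget(0x7ffe71ee0924ULL, blWord) == 0x7ffe71f85300ULL,
              "the bl lands on the JIT entry point shown above");
```

The decoded target matches the address marked as the JIT entry point, so the caller really does still branch into this body.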

(kca) search "JITW" (0x7ffe71f85300-0x5000)/0x50000
...
HIT @ 0x00007ffe71f84084 {sun/reflect/GeneratedMethodAccessor31.invoke} -28 
HIT @ 0x00007ffe71f84e84 {com/ibm/dataaccess/ByteArrayMarshaller.writeDouble_} -27 
HIT @ 0x00007ffe71f85a04 {net/openj9/test/binaryData/TestFloat2ByteArray.<init>} -30 
HIT @ 0x00007ffe71f85e04 {net/openj9/test/Utils.toPrecision} -29 
...

Getting the backtrace starting from the caller as the first frame with metadata (using KCA's where [pc] [sp]):

(kca) where 0x00007ffe71ee0928 0x7ffde4009180
Using 0x00007ffe71ee0928 as the initial program counter and 0x00007ffde4009180 as initial SP...
0x00007ffe71ee0928 {org/junit/runners/BlockJUnit4ClassRunner$1.runReflectiveCall} JIT [0x7ffde4009180] // <-- the caller
0x00007ffe71c5b6e4 {org/junit/internal/runners/model/ReflectiveCallable.run} JIT [0x7ffde4009220]
0x00007ffe71ce4098 {org/junit/runners/BlockJUnit4ClassRunner.methodBlock} JIT [0x7ffde4009270]
0x00007ffe71e7dcf0 {org/junit/runners/ParentRunner.runChildren} JIT [0x7ffde4009350]
0x00007ffe71f6c528 {org/junit/runners/Suite.runChild} JIT [0x7ffde4009500]
0x00007ffe71e7e130 {org/junit/runners/ParentRunner.runChildren} JIT [0x7ffde40095a0]
0x00007ffe71f22d94 {org/junit/runner/JUnitCore.run} JIT [0x7ffde4009750]
0x00007ffe72048998 {net/adoptopenjdk/loadTest/adaptors/JUnitAdaptor.executeTest} JIT [0x7ffde40099f0]
0x00007ffe7264aec8 {net/adoptopenjdk/loadTest/LoadTestRunner$2.run} JIT [0x7ffde4009b10]

So executing the caller caused this thread to reach and execute this "invalidated" body, which doesn’t have metadata in the AVLTree, and when a stackwalk is triggered it crashes.

@zl-wang
Contributor

zl-wang commented Oct 7, 2024

especially, please notice that the caller (de-virtualized/static call) still points to the callee code which doesn't have metaData anymore, as below:

0x7ffe71ee0924 {org/.../BlockJUnit4ClassRunner$1.runReflectiveCall} +106 7:5 |||| dd490a48 bl 0x7ffe71f85300 // invokevirtual 4 {sun/.../ConstructorAccessorImpl.newInstance([Ljava/lang/Object;)Ljava/lang/Object;}

the supposed callee has been recompiled (marked as SamplingPrologue | HasBeenRecompiled), but j9method.extra showed it is still interpreted and the old-body was not patched (should have been) to jump to sampling-patch pre-prologue. it sounds like some kind of compilation control problem. @dsouzai please review and help diagnose ...

@dsouzai
Contributor

dsouzai commented Oct 8, 2024

I'm taking a look right now, but I wanted to clarify what I believe is incorrect in #20263 (comment)

This comment

0x7ffe71f852f0      15000c00 invalid instruction         // <-- jittedBodyInfo = SamplingPrologue | HasBeenRecompiled

is not correct. Since this is LE, the lowest byte is 0x15

(kca) (0x7ffe71f852f0)/b
%1 = 0x00007ffe71f852f0: 15

as per the enum in the Private Linkage:

         {
         ReturnInfoMask                     = 0x0000000F, // bottom 4 bits
         // The VM depends on these four bits - word to the wise: don't mess

         SamplingPrologue                   = 0x00000010,

So that means that the 0x5 is part of the ReturnInfo, and in fact the only bit set is 0x10 which is SamplingPrologue. This also makes sense, because if HasBeenRecompiled was set, I would also expect to see IsBeingRecompiled because although resetIsBeingRecompiled exists, no one ever calls it, so that bit never goes away.
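
The decoding described here can be sketched as follows. Only the two enum values quoted in this comment are used; the byte reassembly helper is a hypothetical name, and the upper half-word of the flags is deliberately left uninterpreted.

```cpp
#include <cstdint>

// Values copied from the enum excerpt quoted above; no other flag values are assumed.
constexpr std::uint32_t ReturnInfoMask   = 0x0000000F; // bottom 4 bits
constexpr std::uint32_t SamplingPrologue = 0x00000010;

// Reassemble a 32-bit word from its four bytes in little-endian memory order.
constexpr std::uint32_t loadLE32(std::uint8_t b0, std::uint8_t b1, std::uint8_t b2, std::uint8_t b3)
   {
   return std::uint32_t(b0) | (std::uint32_t(b1) << 8) | (std::uint32_t(b2) << 16) | (std::uint32_t(b3) << 24);
   }

constexpr std::uint32_t returnInfo(std::uint32_t flags)
   {
   return flags & ReturnInfoMask;
   }

constexpr bool hasSamplingPrologue(std::uint32_t flags)
   {
   return (flags & SamplingPrologue) != 0;
   }

// Bytes "15000c00" at 0x7ffe71f852f0.
constexpr std::uint32_t flags = loadLE32(0x15, 0x00, 0x0c, 0x00);
static_assert(flags == 0x000c0015u, "little-endian reassembly");
static_assert(returnInfo(flags) == 0x5u, "the 0x5 nibble is ReturnInfo");
static_assert(hasSamplingPrologue(flags), "0x10 is SamplingPrologue");
// Once ReturnInfo is masked off, 0x10 is the only bit left in the low byte.
static_assert(((flags & 0xFFu) & ~ReturnInfoMask) == SamplingPrologue, "no other low-byte flag is set");
```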

@dsouzai
Contributor

dsouzai commented Oct 8, 2024

Looking back at one of the problematic JIT'd bodies:

0x7ffe71f852c8      e8a30cf4 invalid instruction
0x7ffe71f852cc      fd7f0000 invalid instruction
0x7ffe71f852d0      b003a889 lbz       r13, 0x3b0(r8)
0x7ffe71f852d4      ff7f0000 invalid instruction
0x7ffe71f852d8      00000000 invalid instruction
0x7ffe71f852dc      a602087c mflr      r0
0x7ffe71f852e0      e1a70548 bl        0x7ffe71fdfac0
0x7ffe71f852e4      30066e6b xori      r14, r27, 0x630
0x7ffe71f852e8      fe7f0000 invalid instruction
0x7ffe71f852ec      00000000 invalid instruction
0x7ffe71f852f0      15000c00 invalid instruction
0x7ffe71f852f4      10006ee8 ld        r3, 0x10(r14)
0x7ffe71f852f8      08008ee8 ld        r4, 8(r14)
0x7ffe71f852fc      0000aee8 ld        r5, 0(r14)
0x7ffe71f85300      50006fe9 ld        r11, 0x50(r15)

the J9Method is 0x00007ffdf40ca3e8 as per

0x7ffe71f852c8      e8a30cf4 invalid instruction
0x7ffe71f852cc      fd7f0000 invalid instruction

this is also confirmed by looking at the body info from

0x7ffe71f852e4      30066e6b xori      r14, r27, 0x630
0x7ffe71f852e8      fe7f0000 invalid instruction
(kca) x/4gx 0x00007ffe6b6e0630
0x00007ffe6b6e0630: 00000000000003e8 00007ffe6b6e05e0 0000000000000000 00007ffe0c05f110

and then the MethodInfo:

(kca) x/4gx 0x00007ffe6b6e05e0
0x00007ffe6b6e05e0: 00007ffdf40ca3e8 0000000300000000 0000000000000000 0000b09700000000

which shows the J9Method of 0x00007ffdf40ca3e8. Looking at the fields of the J9Method:

(kca) x/4gx 0x00007ffdf40ca3e8
0x00007ffdf40ca3e8: 0000000000000000 0000000000000000 0000000000000000 0000000000000000

It does look like it's been unloaded altogether.
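
The pointer values above come from reading two adjacent 4-byte words of the pre-prologue as one little-endian 64-bit value. A minimal sketch of that reassembly, assuming the listing prints each word's bytes in memory order (helper names are hypothetical):

```cpp
#include <cstdint>

// Convert a word printed in memory byte order (e.g. "e8a30cf4") to its little-endian value.
constexpr std::uint32_t fromMemoryOrder(std::uint32_t printed)
   {
   return ((printed & 0x000000FFu) << 24) | ((printed & 0x0000FF00u) << 8) |
          ((printed >> 8) & 0x0000FF00u) | (printed >> 24);
   }

// Combine two consecutive printed words (lower address first) into a 64-bit pointer value.
constexpr std::uint64_t pointerFromWords(std::uint32_t firstPrinted, std::uint32_t secondPrinted)
   {
   return (std::uint64_t(fromMemoryOrder(secondPrinted)) << 32) | fromMemoryOrder(firstPrinted);
   }

// 0x7ffe71f852c8 / 0x7ffe71f852cc: the J9Method recorded in the pre-prologue.
static_assert(pointerFromWords(0xe8a30cf4u, 0xfd7f0000u) == 0x00007ffdf40ca3e8ULL, "J9Method");
// 0x7ffe71f852e4 / 0x7ffe71f852e8: the body info pointer.
static_assert(pointerFromWords(0x30066e6bu, 0xfe7f0000u) == 0x00007ffe6b6e0630ULL, "body info");
```

Both results match the addresses dumped with x/4gx above, which is why the disassembler's "xori" and "invalid instruction" renderings of these words can be ignored.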

Given that the guard is testing for 0x00007ffdf802b900 which is

(kca) j9c 0x00007ffdf802b900
  Class Path/Name: {sun/reflect/GeneratedConstructorAccessor182} J9Class 0x00007ffdf802b900
      ClassObject: 0x00007ffe85f83818
           Access: Public  (1)
      ClassLoader: 0x00007ffde401bf88  Object: 0x00007fff833d0138 {sun/reflect/DelegatingClassLoader}
     SubClassLink: 0x00007ffe000ad200  {sun/reflect/GeneratedConstructorAccessor181}
    Instance Size: 8 (with header: 16) bytes
        Hierarchy: (depth 3)
                   {sun/reflect/ConstructorAccessorImpl} J9Class 0x00007fff84288500
                   {sun/reflect/MagicAccessorImpl} J9Class 0x00007fff84288100
                   {java/lang/Object} J9Class 0x00007fff841d4c00
       Interfaces:
                   {sun/reflect/ConstructorAccessor} J9Class 0x00007fff84288300
        J9Methods: 0x00007ffdf802bbb8 (2 methods)

but this class' newInstance method is not compiled:

(kca) j9m 0x00007ffdf802bbb8+0x20
Method   {ClassPath/Name.MethodName}: {sun/reflect/GeneratedConstructorAccessor182.newInstance}
                           Signature: ([Ljava/lang/Object;)Ljava/lang/Object;
                              Access: Public
                    J9Class/J9Method: 0x00007ffdf802b900 / 0x00007ffdf802bbd8
               Compiled Method Start: Not Compiled! (count=450)
                      ByteCode Start: 0x00007ffe39aa27e0 (49 bytes)
                   ROM Constant Pool: 0x00007ffe39aa2710 (15 entries)
                       Constant Pool: 0x00007ffdf802bac0 (15 entries)

as Julian suggested offline, it must be that the old class got unloaded and this current class got loaded in the same location.

The only thing I can think is that we missed a Class Unload assumption for the guard; otherwise, even if a class was loaded in the same location, because class unloading is a STW event, the guard should have been patched to -1. I don't think redefinition could be the culprit here because the Method Info flags don't indicate that the method was redefined.

@dsouzai
Contributor

dsouzai commented Oct 8, 2024

That said, from the snap file:

14:33:39.588491504  0x0000000000000000 j9mm.60             Event       Class unloading start
14:33:39.588509754  0x0000000000000000 j9mm.94             Event       Class unloading end: classloadersunloaded=0 classesunloaded=0

doesn't look like any unloading happened at all.

@IBMJimmyk
Contributor

I tried running DaaLoadTest_daa2_special_5m_8 locally with -verbose:class on a JDK11 build. My run was successful, but I see a very large number of class unloads. A sample looks like this:

class unload: jdk/internal/reflect/GeneratedConstructorAccessor139
class unload: jdk/internal/reflect/GeneratedMethodAccessor1146
class unload: jdk/internal/reflect/GeneratedMethodAccessor1145
class unload: jdk/internal/reflect/GeneratedMethodAccessor1144
class unload: jdk/internal/reflect/GeneratedMethodAccessor1141
class unload: jdk/internal/reflect/GeneratedConstructorAccessor180
class unload: jdk/internal/reflect/GeneratedMethodAccessor1245
class unload: jdk/internal/reflect/GeneratedMethodAccessor1246
class unload: jdk/internal/reflect/GeneratedMethodAccessor1247
class unload: jdk/internal/reflect/GeneratedMethodAccessor1248
class unload: jdk/internal/reflect/GeneratedMethodAccessor1249
class unload: jdk/internal/reflect/GeneratedMethodAccessor1250
class unload: jdk/internal/reflect/GeneratedConstructorAccessor181
class unload: jdk/internal/reflect/GeneratedConstructorAccessor182
class unload: jdk/internal/reflect/GeneratedMethodAccessor1251
class unload: jdk/internal/reflect/GeneratedMethodAccessor1252
class unload: jdk/internal/reflect/GeneratedMethodAccessor1253
class unload: jdk/internal/reflect/GeneratedMethodAccessor1254
class unload: jdk/internal/reflect/GeneratedMethodAccessor1255
class unload: jdk/internal/reflect/GeneratedMethodAccessor1256
class unload: jdk/internal/reflect/GeneratedConstructorAccessor183

It might be worth double checking and running the test on a machine where it fails with the -verbose:class option.

@zl-wang
Contributor

zl-wang commented Oct 9, 2024

class unloading patching registration is done as below:

      if (acursor->isUnloadablePicSite())
         {
         // Register an unload assumption on the lower 32bit of the class constant.
         // The patching code thinks it's low bit tagging an instruction not a class pointer!!
         cg()->
         jitAddPicToPatchOnClassUnload((void *)acursor->getConstantValue(), (void *)(codeCursor+((cg()->comp()->target().is64Bit())?4:0)) );
         }

that didn't take LE into account, so the high-order 4 bytes of the class constant were registered instead of the low-order 4 bytes. that certainly was not intended. On POWER, class unloading is compensated like below:

#elif defined(TR_HOST_POWER)
   // On PPC, the patching is on a 4-byte entity regardless of 32/64bit JIT
   extern void ppcCodeSync(unsigned char *codeStart, unsigned int codeSize);
   *((int32_t *)_picLocation) |= 1;
   ppcCodeSync(_picLocation, 4);
#elif defined(TR_HOST_ARM)

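Considering the J9Class is 0x00007ffdf802b900 to begin with, the patching only set the last bit of the 0xd (in the high-order word 0x00007ffd) again. That bit was already set, so the guard constant did not change.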
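
The effect of the +4 offset can be modelled in a few lines. This is a hypothetical arithmetic model, not OMR code: it only computes what an 8-byte class constant looks like after the 4-byte `|= 1` patch lands at a given byte offset, for either byte order.

```cpp
#include <cstdint>

// Hypothetical model: value of an 8-byte constant after
// `*((int32_t *)(constantAddress + byteOffset)) |= 1`, with byteOffset being 0 or 4.
constexpr std::uint64_t patchedConstant(std::uint64_t value, unsigned byteOffset, bool littleEndian)
   {
   // Little-endian keeps the low-order word at offset 0, big-endian keeps it at offset 4.
   return ((byteOffset == 0) == littleEndian) ? (value | 1ULL) : (value | (1ULL << 32));
   }

constexpr std::uint64_t guardClass = 0x00007ffdf802b900ULL; // J9Class tested by the guard

// Big-endian, offset 4: the low bit of the class pointer is tagged, so the guard can no longer match.
static_assert(patchedConstant(guardClass, 4, false) == 0x00007ffdf802b901ULL, "low word tagged");

// Little-endian, offset 4: the OR lands in 0x00007ffd, whose low bit is already set.
static_assert(patchedConstant(guardClass, 4, true) == guardClass, "patch is a no-op");

// Little-endian, offset 0: the tag would land in the low-order word.
static_assert(patchedConstant(guardClass, 0, true) == 0x00007ffdf802b901ULL, "low word tagged");
```

Under this model the little-endian patch at offset 4 leaves the constant untouched, which is consistent with the guard still passing after the class was unloaded and a new class was loaded at the same address.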

@zl-wang
Contributor

zl-wang commented Oct 9, 2024

In my mind, this was fixed by Kevin many years ago. I have assumed picSite is healthy

Issue Number: 20263
Status: Closed
Actual Components: comp:jit, test failure, arch:power, os:linux, segfault
Actual Assignees: No one :(
PR Assignees: No one :(
