JDK21 serviceability_jvmti_j9_0_FAILED serviceability/jvmti/vthread/BreakpointInYieldTest/BreakpointInYieldTest.java Segmentation error vmState=0x0002000f #18088

Closed
JasonFengJ9 opened this issue Sep 7, 2023 · 44 comments · Fixed by #18413

@JasonFengJ9
Member

JasonFengJ9 commented Sep 7, 2023

Failure link

From an internal build (osxrt1):

03:27:31  openjdk version "21-internal" 2023-09-19
03:27:31  OpenJDK Runtime Environment (build 21-internal-adhoc.jenkins.BuildJDK21x86-64macPersonal)
03:27:31  Eclipse OpenJ9 VM (build master-7599bde8a13, JRE 21 Mac OS X amd64-64-Bit Compressed References 20230906_50 (JIT enabled, AOT enabled)
03:27:31  OpenJ9   - 7599bde8a13
03:27:31  OMR      - 873ac5d377a
03:27:31  JCL      - 154f45ddce4 based on jdk-21+35)

Rerun in Grinder - Change TARGET to run only the failed test targets.

Optional info

Failure output (captured from console output)

03:29:53  variation: Mode150
03:29:53  JVM_OPTIONS:  -XX:+UseCompressedOops 

03:32:59  TEST: serviceability/jvmti/vthread/BreakpointInYieldTest/BreakpointInYieldTest.java

03:32:59  STDERR:
03:32:59  Unhandled exception
03:32:59  Type=Segmentation error vmState=0x0002000f
03:32:59  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000000
03:32:59  Handler1=0000000007C37E60 Handler2=0000000007F42910
03:32:59  RDI=000000003949B6F4 RSI=00007000108A8300 RAX=0000000000000018 RBX=0000000000000018
03:32:59  RCX=FFFF802E6284C5C4 RDX=FFFF802E9BCE7CB8 R8=00007000108A82FC R9=00007000108A82F8
03:32:59  R10=0000000000000018 R11=000000003949B70C R12=00007000108A8290 R13=0000000000000007
03:32:59  R14=00007000108A8360 R15=0000000000000000
03:32:59  RIP=0000000007E17EA2 GS=0000 FS=0000 RSP=00007000108A8218
03:32:59  RFlags=0000000000010282 CS=002B RBP=00007000108A8240 ERR=D9E2700000000000
03:32:59  TRAPNO=000000000000000D CPU=7000000000000000 FAULTVADDR=00007FD1D9E27000
03:32:59  XMM0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
03:32:59  XMM1 0000000000000468 (f: 1128.000000, d: 5.573060e-321)
03:32:59  XMM2 69767265732f6b72 (f: 1932487552.000000, d: 1.073873e+200)
03:32:59  XMM3 2f69746d766a2f79 (f: 1986670464.000000, d: 2.683495e-80)
03:32:59  XMM4 696f706b61657242 (f: 1634038400.000000, d: 7.520345e+199)
03:32:59  XMM5 2f64616572687476 (f: 1919448192.000000, d: 2.148548e-80)
03:32:59  XMM6 6c6569596e49746e (f: 1850307712.000000, d: 1.441632e+214)
03:32:59  XMM7 746e696f706b6165 (f: 1886085504.000000, d: 6.967698e+252)
03:32:59  XMM8 616d2f642e747365 (f: 779383680.000000, d: 2.051584e+161)
03:32:59  XMM9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
03:32:59  XMM10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
03:32:59  XMM11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
03:32:59  XMM12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
03:32:59  XMM13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
03:32:59  XMM14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
03:32:59  XMM15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
03:32:59  Module=/Users/jenkins/workspace/Test_openjdk21_j9_extended.openjdk_x86-64_mac_Personal/openjdkbinary/j2sdk-image/lib/default/libj9vm29.dylib
03:32:59  Module_base_address=0000000007C00000 Symbol=mapLocalSet
03:32:59  Symbol_address=0000000007E17E10
03:32:59  Target=2_90_20230906_50 (Mac OS X 10.15.7)
03:32:59  CPU=amd64 (8 logical CPUs) (0x300000000 RAM)
03:32:59  ----------- Stack Backtrace -----------
03:32:59  mapLocalSet+0x93 (0x0000000007E17EA3 [libj9vm29.dylib+0x217ea3])
03:32:59  j9localmap_LocalBitsForPC+0x5fb (0x0000000007E17C7B [libj9vm29.dylib+0x217c7b])
03:32:59  walkBytecodeFrameSlots+0x178 (0x0000000007C7AD98 [libj9vm29.dylib+0x7ad98])
03:32:59  walkStackFrames+0x1136 (0x0000000007C7A6F6 [libj9vm29.dylib+0x7a6f6])
03:32:59  walkContinuationStackFrames+0x19d (0x0000000007C9049D [libj9vm29.dylib+0x9049d])
03:32:59  _ZN28GC_VMThreadStackSlotIterator21scanContinuationSlotsEP10J9VMThreadP8J9ObjectPvPFvP8J9JavaVMPS3_S4_P16J9StackWalkStatePKvEbb+0xbf (0x0000000009BBB26F [libj9gc29.dylib+0xe726f])
03:32:59  _ZN20MM_ScavengerDelegate27scanContinuationNativeSlotsEP22MM_EnvironmentStandardP8J9Object21MM_ScavengeScanReasonb+0xc9 (0x0000000009B96C29 [libj9gc29.dylib+0xc2c29])
03:32:59  _ZN20MM_ScavengerDelegate16getObjectScannerEP22MM_EnvironmentStandardP8J9ObjectPvm21MM_ScavengeScanReasonPb+0x2ed (0x0000000009B96F4D [libj9gc29.dylib+0xc2f4d])
03:32:59  _ZN12MM_Scavenger26incrementalScanCacheBySlotEP22MM_EnvironmentStandardP24MM_CopyScanCacheStandard+0x5d6 (0x0000000009B64976 [libj9gc29.dylib+0x90976])
03:32:59  _ZN12MM_Scavenger12completeScanEP22MM_EnvironmentStandard+0x1a6 (0x0000000009B65396 [libj9gc29.dylib+0x91396])
03:32:59  _ZN12MM_Scavenger24workThreadGarbageCollectEP22MM_EnvironmentStandard+0x292 (0x0000000009B65762 [libj9gc29.dylib+0x91762])
03:32:59  _ZN21MM_ParallelDispatcher16workerEntryPointEP18MM_EnvironmentBase+0x77 (0x0000000009B09607 [libj9gc29.dylib+0x35607])
03:32:59  _Z23dispatcher_thread_proc2P14OMRPortLibraryPv+0xf6 (0x0000000009B094F6 [libj9gc29.dylib+0x354f6])
03:32:59  omrsig_protect+0x392 (0x0000000007F41402 [libj9prt29.dylib+0x21402])
03:32:59  dispatcher_thread_proc+0x42 (0x0000000009B09582 [libj9gc29.dylib+0x35582])
03:32:59  thread_wrapper+0x13a (0x0000000006BB06BA [libj9thr29.dylib+0xa6ba])
03:32:59  _pthread_start+0x94 (0x00007FFF6C384109 [libsystem_pthread.dylib+0x6109])
03:32:59  ---------------------------------------
03:32:59  JVMDUMP039I Processing dump event "gpf", detail "" at 2023/09/07 03:31:44 - please wait.

03:33:09  serviceability_jvmti_j9_0_FAILED
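
For readers cross-checking the dump above, the offsets shown in the backtrace can be recomputed from `RIP`, `Symbol_address` and `Module_base_address`. A minimal C sketch; the helper names are hypothetical and not part of OpenJ9:

```c
#include <stdint.h>

/* Offset of a program counter inside the symbol that contains it. */
static uint64_t offset_in_symbol(uint64_t pc, uint64_t symbol_address)
{
    return pc - symbol_address;
}

/* Offset of a program counter inside its load module. */
static uint64_t offset_in_module(uint64_t pc, uint64_t module_base)
{
    return pc - module_base;
}
```

With the x86-64 values above, `offset_in_symbol(0x7E17EA2, 0x7E17E10)` gives `0x92` and `offset_in_module(0x7E17EA2, 0x7C00000)` gives `0x217EA2`. The backtrace line for `mapLocalSet` reports `+0x93` and `+0x217ea3`, one byte past `RIP`.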

50x serviceability_jvmti_j9_0 internal grinder - reproduced 7/50

This seems similar to

FYI @babsingh

@JasonFengJ9 JasonFengJ9 added this to the Java 21 (0.42) milestone Sep 7, 2023
@fengxue-IS fengxue-IS added the project:loom Used to track Project Loom related work label Sep 16, 2023
@JasonFengJ9
Member Author

JDK21 aarch64_mac/ milestone 0(macaarch64rt8)

[2023-09-22T18:52:07.426Z] variation: Mode650
[2023-09-22T18:52:07.426Z] JVM_OPTIONS:  -XX:-UseCompressedOops 

[2023-09-22T18:53:37.732Z] TEST: serviceability/jvmti/vthread/BreakpointInYieldTest/BreakpointInYieldTest.java

[2023-09-22T18:53:37.733Z] STDERR:
[2023-09-22T18:53:37.733Z] Unhandled exception
[2023-09-22T18:53:37.733Z] Type=Segmentation error vmState=0x0002000f
[2023-09-22T18:53:37.733Z] J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
[2023-09-22T18:53:37.733Z] Handler1=0000000104B41094 Handler2=00000001049F51D8 InaccessibleAddress=FFFE8006AAA6A180
[2023-09-22T18:53:37.733Z] x0=FFFFA0014F9EF438 x1=000000016C2AD0A0 x2=FFFFA0014F9EF438 x3=0000000000000000
[2023-09-22T18:53:37.733Z] x4=000000016C2AD09C x5=000000016C2AD098 x6=000000016C2AD094 x7=0000000104BF7748
[2023-09-22T18:53:37.733Z] x8=000000016C2AD100 x9=00000001501177C8 x10=00000001501177E0 x11=0000000104C679CC
[2023-09-22T18:53:37.733Z] x12=0000000000000001 x13=0000000104C682A3 x14=0000000104C67BCC x15=0000000000000007
[2023-09-22T18:53:37.733Z] x16=000000016C2AD100 x17=FFFFA0029FB06C00 x18=0000000000000000 x19=00000000FFFFFFFF
[2023-09-22T18:53:37.733Z] x20=000000016C2AD95C x21=000000010491DEE0 x22=0000000000000000 x23=00000001501177B4
[2023-09-22T18:53:37.733Z] x24=FFFFA0014F9EF438 x25=0000000104BF76A4 x26=000000016C2AD0A0 x27=0000000000000000
[2023-09-22T18:53:37.733Z] x28=0000000000000003 x29(FP)=000000016C2AD900 x30(LR)=0000000104BF72C0 x31(SP)=000000016C2ACFD0
[2023-09-22T18:53:37.733Z] PC=0000000104BF74AC SP=000000016C2ACFD0
[2023-09-22T18:53:37.733Z] v0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v1 00000002801c8778 (f: 2149353216.000000, d: 5.305913e-314)
[2023-09-22T18:53:37.733Z] v2 000001be0000013e (f: 318.000000, d: 9.464101e-312)
[2023-09-22T18:53:37.733Z] v3 000001be000001be (f: 446.000000, d: 9.464101e-312)
[2023-09-22T18:53:37.733Z] v4 00000000000001be (f: 446.000000, d: 2.203533e-321)
[2023-09-22T18:53:37.733Z] v5 0000013e0000013e (f: 318.000000, d: 6.747947e-312)
[2023-09-22T18:53:37.733Z] v6 000000000000013e (f: 318.000000, d: 1.571129e-321)
[2023-09-22T18:53:37.733Z] v7 0000013e00000000 (f: 0.000000, d: 6.747947e-312)
[2023-09-22T18:53:37.733Z] v8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v16 bfd0000000000000 (f: 0.000000, d: -2.500000e-01)
[2023-09-22T18:53:37.733Z] v17 3fd57028ca0c5555 (f: 3389805824.000000, d: 3.349707e-01)
[2023-09-22T18:53:37.733Z] v18 bf7aea0b0c8a2ba6 (f: 210381728.000000, d: -6.570857e-03)
[2023-09-22T18:53:37.733Z] v19 3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
[2023-09-22T18:53:37.733Z] v20 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v21 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v22 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v23 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v24 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v25 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v26 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v27 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v28 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v29 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v30 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] v31 0000000000000000 (f: 0.000000, d: 0.000000e+00)
[2023-09-22T18:53:37.733Z] Module=/Users/jenkins/workspace/Test_openjdk21_j9_extended.openjdk_aarch64_mac/openjdkbinary/j2sdk-image/Contents/Home/lib/default/libj9vm29.dylib
[2023-09-22T18:53:37.733Z] Module_base_address=0000000104B1C000 Symbol=mapLocalSet
[2023-09-22T18:53:37.733Z] Symbol_address=0000000104BF7420
[2023-09-22T18:53:37.733Z] Target=2_90_20230919_43 (Mac OS X 13.0)
[2023-09-22T18:53:37.733Z] CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
[2023-09-22T18:53:37.733Z] ----------- Stack Backtrace -----------
[2023-09-22T18:53:37.733Z] ---------------------------------------
[2023-09-22T18:53:37.733Z] JVMDUMP039I Processing dump event "gpf", detail "" at 2023/09/22 14:53:11 - please wait.

[2023-09-22T18:53:56.143Z] serviceability_jvmti_j9_1_FAILED

@tajila
Contributor

tajila commented Oct 5, 2023

@gacholio can you please take a look at this test_output

@gacholio
Contributor

gacholio commented Oct 6, 2023

The tar file appears to be corrupt

j9build@736bb006f300:DOCKER-IMAGE $ tar xzf openjdk_test_output.tar.gz 
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
j9build@736bb006f300:DOCKER-IMAGE $ gunzip openjdk_test_output.tar.gz 
j9build@736bb006f300:DOCKER-IMAGE $ ls
openjdk_test_output.tar
j9build@736bb006f300:DOCKER-IMAGE $ tar xvf openjdk_test_output.tar 
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors
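
One rough way to tell a damaged download from a valid archive before handing it to `tar` is to look at the magic bytes: gzip streams start with `0x1f 0x8b`, and POSIX/GNU tar headers carry `ustar` at offset 257 of the first 512-byte block. A minimal C sketch with hypothetical function names; it is only a heuristic and much weaker than the header validation `tar` itself performs:

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the file starts with the gzip magic bytes 0x1f 0x8b,
 * 0 if it does not, and -1 if it cannot be opened or is too short. */
static int looks_like_gzip(const char *path)
{
    unsigned char magic[2];
    size_t n;
    FILE *f = fopen(path, "rb");

    if (f == NULL) {
        return -1;
    }
    n = fread(magic, 1, sizeof magic, f);
    fclose(f);
    if (n != sizeof magic) {
        return -1;
    }
    return magic[0] == 0x1f && magic[1] == 0x8b;
}

/* Returns 1 if the first 512-byte block has "ustar" at offset 257,
 * 0 if it does not, and -1 if it cannot be opened or is too short.
 * Old v7-style archives have no magic and would be reported as 0. */
static int looks_like_tar(const char *path)
{
    unsigned char block[512];
    size_t n;
    FILE *f = fopen(path, "rb");

    if (f == NULL) {
        return -1;
    }
    n = fread(block, 1, sizeof block, f);
    fclose(f);
    if (n != sizeof block) {
        return -1;
    }
    return memcmp(block + 257, "ustar", 5) == 0;
}
```

Calling `looks_like_gzip` on the downloaded `.tar.gz` and `looks_like_tar` on the gunzipped `.tar` would show at which layer the file stops looking valid.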

@gacholio
Contributor

gacholio commented Oct 6, 2023

Looks like http download corrupts it - fetching with curl now.

@gacholio
Contributor

gacholio commented Oct 6, 2023

A link to a closely matching JDK11 xa64 build would be helpful (for DDR). The SDK which produced the cores may also help, though I've never been able to run openj9 on my mac (millions of permission errors).

@gacholio
Contributor

gacholio commented Oct 6, 2023

I think this was actually an x86 mac build.

@tajila
Contributor

tajila commented Oct 6, 2023

> I think this was actually an x86 mac build.

Are you talking about the original failure, or the one Jason posted 2 weeks ago?

@gacholio
Contributor

gacholio commented Oct 6, 2023

I was looking at the start of the issue description. If the cores I have are from a mac, the SDK won't help me anyway. I'll see if I can figure anything out from DDR. I also notice that no native stack traces appear in the javacore, which is unhelpful.

@tajila
Contributor

tajila commented Oct 6, 2023

@gacholio
Contributor

gacholio commented Oct 6, 2023

It's a crash in GC, so the native stacks probably aren't going to be very informative anyway.

@JasonFengJ9
Member Author

> I think the original failure was found in a personal build; @JasonFengJ9 can confirm.

@tajila yes, the failure was from a personal build, before JDK21 nightly/weekly builds ran regularly.
All code was current with the main branches.

@gacholio
Contributor

The mac x86 link above doesn't work (it's not even a link to a file!). Even signed in, it just presents me with thousands of unrelated artifacts.

@pshipton
Member

You need to open it twice, it doesn't work the first time.

@gacholio
Contributor

That worked, but what I really want is the linux x86-64 build. This looks like the right place to look:

https://na-public.artifactory.swg-devops.com/ui/native/sys-rt-generic-local/hyc-runtimes-jenkins.swg-devops.com/Build_JDK11_x86-64_linux_Nightly/

but it hasn't been updated in years. Where should I be looking for the latest builds?

@pshipton
Member

@gacholio
Contributor

Thanks, the mac x86 build is also working for me. Now trying to get the original core downloaded (it keeps failing halfway through).

@gacholio
Contributor

I have the original failure. Not surprisingly, the stack being walked appears to be in an unmounted continuation.

@gacholio
Contributor

The DDR stack extensions don't appear to work on these stacks:

> !stackslots 0x00007000108A8CA0
Oct 11, 2023 2:33:25 P.M. com.ibm.j9ddr.vm29.events.DefaultEventListener corruptData
SEVERE: CDE thrown extracting initial stack walk state. walkThread = 0x00007000108A8CA0
com.ibm.j9ddr.NullPointerDereference: Memory Fault reading 0x00000000 : 
	at com.ibm.j9ddr.vm29.pointer.AbstractPointer.getLongAtOffset(AbstractPointer.java:456)
	at com.ibm.j9ddr.vm29.pointer.generated.J9ThreadPointer.tid(Unknown Source)
	at com.ibm.j9ddr.vm29.pointer.helper.J9ThreadHelper.getOSThread(J9ThreadHelper.java:60)
	at com.ibm.j9ddr.vm29.j9.stackwalker.StackWalker$StackWalker_29_V0.walkStackFrames(StackWalker.java:171)
	at com.ibm.j9ddr.vm29.j9.stackwalker.StackWalker.walkStackFrames(StackWalker.java:99)

Why DDR needs to know the OS thread ID is a mystery to me, but it's clear we don't fill that in for the stack-allocated threads used to drive the stack walker for unmounted continuations.

@gacholio
Contributor

As the continuation is unmounted, the only way to find the failing stack is to examine every continuation using the new extension.

@babsingh Is there a command to list all of the continuations? I don't see anything obvious in the DDR help.

@babsingh
Contributor

babsingh commented Oct 11, 2023

> @babsingh Is there a command to list all of the continuations? I don't see anything obvious in the DDR help.

It will show in j9help.

> !j9help | grep vthread
vthreads                                       Lists virtual threads

> !j9help | grep contin
continuationstack         <continuation>       Walks the Java stack for <continuation>
continuationstackslots    <continuation>       Walks the Java stack (including objects) for <continuation>

Example:
     !vthreads
 Example output:
     !continuationstack 0x00007fe78c0f9600 !j9vmcontinuation 0x00007fe78c0f9600 !j9object 0x0000000706401588 (Continuation) !j9object 0x0000000706400FB0 (VThread) - name1
     !continuationstack 0x00007fe78c23aa80 !j9vmcontinuation 0x00007fe78c23aa80 !j9object 0x0000000706424F90 (Continuation) !j9object 0x0000000706424EF0 (VThread) - name2
     !continuationstack 0x00007fe78c244ac0 !j9vmcontinuation 0x00007fe78c244ac0 !j9object 0x00000007064250D8 (Continuation) !j9object 0x0000000706425038 (VThread) - name3

@gacholio
Contributor

!vthreads provides no output at all in this core

> !vthreads
> 

@babsingh
Contributor

babsingh commented Oct 12, 2023

These commands are only available in jdmpview for JDK21+. @fengxue-IS, FYI, in case there is an actual bug.

@babsingh
Contributor

babsingh commented Nov 1, 2023

> Given that we have exclusive, how do we go about finding all continuations?

Through j9gc_flush_nonAllocationCaches_for_walk and j9mm_iterate_all_continuation_objects.

vmFuncs->acquireExclusiveVMAccess(currentThread);
vm->memoryManagerFunctions->j9gc_flush_nonAllocationCaches_for_walk(vm);
vm->memoryManagerFunctions->j9mm_iterate_all_continuation_objects(currentThread, PORTLIB, 0, jvmtiSuspendResumeCallBack, (void*)&data);
vmFuncs->releaseExclusiveVMAccess(currentThread);

@gacholio
Contributor

gacholio commented Nov 1, 2023

Thanks, I'll prototype a fix for this.

@gacholio
Contributor

gacholio commented Nov 1, 2023

jitCodeBreakpointAdded and jitCodeBreakpointRemoved also need to be updated.

gacholio added a commit to gacholio/openj9 that referenced this issue Nov 13, 2023
The breakpoint handling code was only fixing stacks actively running on
a thread. The stacks for continuations also need to be fixed.

Fixes: eclipse-openj9#18088

Signed-off-by: Graham Chapman <[email protected]>
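
To make the commit message concrete, here is a toy model in C of the difference between fixing only the stacks running on threads and also fixing the stacks held by continuations. Every type and function name below is hypothetical, and the `needs_fixup` flag merely stands in for whatever per-stack repair the real breakpoint code performs, which is not described in this thread.

```c
#include <stddef.h>

/* Hypothetical toy model; none of these types exist in OpenJ9. */
typedef struct ToyStack {
    int needs_fixup; /* stand-in for "stack must be repaired after a breakpoint change" */
} ToyStack;

typedef struct ToyVM {
    ToyStack *thread_stacks;       /* stacks actively running on a thread */
    size_t thread_count;
    ToyStack *continuation_stacks; /* stacks held by continuations */
    size_t continuation_count;
} ToyVM;

typedef void (*ToyStackVisitor)(ToyStack *stack, void *user_data);

/* Visitor: repair one stack and count how many were repaired. */
static void fix_stack(ToyStack *stack, void *user_data)
{
    size_t *fixed = (size_t *)user_data;

    if (stack->needs_fixup) {
        stack->needs_fixup = 0;
        ++*fixed;
    }
}

static void visit_stacks(ToyStack *stacks, size_t count,
                         ToyStackVisitor visit, void *user_data)
{
    size_t i;

    for (i = 0; i < count; ++i) {
        visit(&stacks[i], user_data);
    }
}

/* The behaviour the commit message describes as the bug. */
static size_t fix_thread_stacks_only(ToyVM *vm)
{
    size_t fixed = 0;

    visit_stacks(vm->thread_stacks, vm->thread_count, fix_stack, &fixed);
    return fixed;
}

/* The behaviour the commit message describes as the fix. */
static size_t fix_all_stacks(ToyVM *vm)
{
    size_t fixed = 0;

    visit_stacks(vm->thread_stacks, vm->thread_count, fix_stack, &fixed);
    visit_stacks(vm->continuation_stacks, vm->continuation_count, fix_stack, &fixed);
    return fixed;
}
```

In this model, a continuation stack left with `needs_fixup` set after `fix_thread_stacks_only` plays the role of the stale stack that a later walker, such as the GC scan in the backtrace, would trip over.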