DaaLoadTest ppc64le vmState=0x00000000 _ZN3OMR9CodeCache17replaceTrampolineEP20TR_OpaqueMethodBlockPvS3_S3_b+0xe4 #20263
Issue Number: 20263 |
@hzongaro pls take a look |
from #20256
I guess it is the same issue |
The grinder from #20260 https://openj9-jenkins.osuosl.org/job/Grinder_iteration_4/376 - ubu24-ppc64le-5
|
The grinder from #20261 https://openj9-jenkins.osuosl.org/job/Grinder_iteration_0/539 - ubu22-ppc64le-3
|
The grinder from #20261 https://openj9-jenkins.osuosl.org/job/Grinder_iteration_1/540 - cent9-ppc64le-2
https://openj9-jenkins.osuosl.org/job/Grinder_iteration_4/377
|
https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_sanity.system_ppc64le_linux_Nightly_testList_1/618 - ubu24-ppc64le-1
|
Are all these on POWER? |
@zl-wang FYI... can you (or someone on P) have a look? |
interesting ... always DAA tests on ppc64le uncompressedRefs. |
@IBMJimmyk have another look at this seemingly recurring issue. |
just FYI crashing symbol in #20256 is |
Not sure this belongs to this issue, but here is another DaaLoadTest crash on ppcle, this time in the head stream. All the original failures in this issue were from the 0.48 stream. They all occur on ppcle. https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_extended.system_ppc64le_linux_Nightly_testList_0/623 - ubu22-ppc64le-5
|
https://openj9-jenkins.osuosl.org/job/Test_openjdk17_j9_sanity.system_ppc64le_linux_Nightly_testList_1/677 - ubu22-ppc64le-5
|
https://openj9-jenkins.osuosl.org/job/Test_openjdk8_j9_special.system_ppc64le_linux_Personal_testList_0/109 - cent9-ppc64le-4
https://openj9-jenkins.osuosl.org/job/Test_openjdk11_j9_special.system_ppc64le_linux_Personal_testList_0/113 - ubu22-ppc64le-4
|
Looking at two of the cores here shows the same story: there's no metadata in the AVLTree for the method at the failing return address, but the body it's in consists of valid instructions and lies within the code cache.
When getting the following return address from the stack-pointer we find the caller being
The call is de-virtualized in the caller body with the class guard checking against the J9Class
Following the address from the caller it calls to around 220 instructions before the failing return address. The entry to the body is intact and valid for execution, but doesn’t have a
Getting the back-trace starting from the caller as first frame with metadata (using KCA's where [pc] [sp])
So executing the caller caused this thread to reach and execute this "invalidated" body, which doesn’t have metadata in the AVLTree, and when a stackwalk is triggered it crashes. |
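To make the failure mode above concrete, here is a minimal sketch of a PC-range metadata lookup. It is not OpenJ9 code: `MethodMetadata`, `MetadataRegistry`, and the addresses are hypothetical, and an ordered `std::map` stands in for the AVLTree. It only illustrates that a return address inside a still-executable body resolves to nothing once that body's metadata entry has been removed, which is the situation a stack walker would then hit.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for JIT method metadata; not the OpenJ9 type.
struct MethodMetadata {
    std::uintptr_t startPC;  // first address of the compiled body
    std::uintptr_t endPC;    // one past the last address of the body
    std::string name;        // label used only for this illustration
};

// Hypothetical registry keyed by startPC. The ordered map plays the role
// of the AVL tree discussed in the comment above.
class MetadataRegistry {
public:
    void insert(const MethodMetadata &m) { byStart_[m.startPC] = m; }
    void remove(std::uintptr_t startPC) { byStart_.erase(startPC); }

    // Return the metadata whose [startPC, endPC) range contains pc, if any.
    std::optional<MethodMetadata> lookup(std::uintptr_t pc) const {
        auto it = byStart_.upper_bound(pc);  // first entry starting after pc
        if (it == byStart_.begin()) {
            return std::nullopt;             // nothing starts at or before pc
        }
        --it;                                // last entry starting at or before pc
        if (pc < it->second.endPC) {
            return it->second;
        }
        return std::nullopt;                 // pc falls in a gap between bodies
    }

private:
    std::map<std::uintptr_t, MethodMetadata> byStart_;
};
```

With a "caller" body registered at `[0x1000, 0x1400)` and a "callee" body at `[0x2000, 0x2800)`, `lookup(0x2370)` finds the callee. After `remove(0x2000)` the same lookup returns `std::nullopt`, even though the instructions at that address could still be reached through a call site that was never redirected.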
especially, please notice that the caller (de-virtualized/static call) still points to the callee code which doesn't have metaData anymore, as below:
the supposed callee has been recompiled (marked as |
I'm taking a look right now, but I wanted to clarify what I believe is incorrect in #20263 (comment). This comment
is not correct. Since this is LE, the lowest byte is
as per the enum in the Private Linkage:
So that means that the |
Looking back at one of the problematic JIT'd bodies:
the J9Method is
this is also confirmed by looking at the body info from
and then the MethodInfo:
which shows the J9Method of
It does look like it's been unloaded altogether. Given that the guard is testing for
but this class'
as Julian suggested offline, it must be that the old class got unloaded and this current class got loaded in the same location. The only explanation I can think of is that we missed a Class Unload assumption for the guard; otherwise, even if a class was loaded in the same location, because class unloading is a STW event, the guard should have been patched to |
That said, from the snap file:
doesn't look like any unloading happened at all. |
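The address-reuse hypothesis above can be sketched in a few lines. This is an illustration only, not the JIT's guard implementation: `ClassGuard`, `guardPasses`, `patchOnUnload`, and the sentinel value are all hypothetical, and the value the real guard is patched to is not shown in this thread. The point is that a guard comparing raw class addresses will accept a different class loaded at the same address unless the unload hook rewrote the guard first.

```cpp
#include <cstdint>

// Hypothetical model of a devirtualization class guard: the compiled code
// compares the receiver's class address against a constant baked into it.
struct ClassGuard {
    std::uintptr_t expectedClass;  // address the guard compares against
};

// The guard passes when the receiver's class address matches the constant.
inline bool guardPasses(const ClassGuard &g, std::uintptr_t receiverClass) {
    return g.expectedClass == receiverClass;
}

// Hypothetical unload hook. The sentinel below is a placeholder chosen for
// this sketch (an address assumed never to belong to a live class).
inline void patchOnUnload(ClassGuard &g, std::uintptr_t unloadedClass) {
    if (g.expectedClass == unloadedClass) {
        g.expectedClass = static_cast<std::uintptr_t>(-1);
    }
}
```

If a guard built for a class at `0x5000` is patched when that class unloads, a later class loaded at `0x5000` fails the guard and takes the slow path. If the patch is missed, the stale guard still passes for the new class, which matches the behaviour the missed-assumption theory would predict.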
I tried running
It might be worth double checking and running the test on a machine where it fails with the |
class unloading patching registration is done as below:
That didn't take LE into account, so the high-order 4 bytes were registered, which certainly was not intended. On POWER, class unloading is compensated for as below:
Considering the J9Class is |
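As a small illustration of the endianness point above: the same fixed byte offset into an 8-byte value selects the low-order word on a big-endian host and the high-order word on a little-endian one. The sketch below is standalone C++ and assumes nothing about the actual registration code; `wordAtOffset`, `hostIsLittleEndian`, and `lowWordOffset` are hypothetical helper names, and the sample value is made up.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Read the 32-bit word stored at byte offset `offset` (0 or 4) within the
// in-memory representation of a 64-bit value, using the host byte order.
inline std::uint32_t wordAtOffset(std::uint64_t value, std::size_t offset) {
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);
    std::uint32_t word;
    std::memcpy(&word, bytes + offset, sizeof word);
    return word;
}

// Detect host byte order by inspecting the first byte of a known value.
inline bool hostIsLittleEndian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Byte offset of the low-order 32 bits inside a 64-bit value on this host:
// 0 on little-endian, 4 on big-endian.
inline std::size_t lowWordOffset() {
    return hostIsLittleEndian() ? 0 : 4;
}
```

For a value such as `0x0000000100ABCDEF`, `wordAtOffset(v, lowWordOffset())` yields `0x00ABCDEF` on either byte order, whereas a hard-coded offset of 4 yields `0x00ABCDEF` on big-endian but `0x00000001` on little-endian. A site registered with such a fixed offset would therefore point at the high-order half of the address on ppc64le.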
In my mind, this was fixed by Kevin many years ago. I have assumed picSite is healthy |
Issue Number: 20263 |
Grinder for #20258 exposed some crashes.
DaaLoadTest_daa1_special_5m_8 -Xgcpolicy:gencon -Xshareclasses -Xjit -Xnocompressedrefs
https://openj9-jenkins.osuosl.org/job/Grinder_iteration_2/493 (2)
https://openj9-jenkins.osuosl.org/job/Grinder_iteration_1/538 (1)
https://openj9-artifactory.osuosl.org/artifactory/ci-openj9/Test/Grinder_iteration_1/538/system_test_output.tar.gz