-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Guard members of MonoType union & fix related bugs #111645
base: main
Are you sure you want to change the base?
Conversation
tracing\eventpipe\gcdump\gcdump\gcdump.cmd test failure is a real bug in mono. |
@@ -1296,6 +1296,92 @@ m_type_is_byref (const MonoType *type) | |||
return type->byref__; | |||
} | |||
|
|||
MONO_NEVER_INLINE void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are going to have this in production than I believe we should adopt a similar pattern as we have when using MONO_CLASS_DEF_PRIVATE, either extend that or make a similar thing for MonoType, that way we would run all these additional checks when running ENABLE_CHECKED_BUILD_PRIVATE_TYPES (we should have CI legs doing that), but use inline wrappers with zero overhead on release builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to do actual full checks in Release unless benchmarks show that it meaningfully regresses performance, at least during the preview period, in order to flush out any remaining issues of this type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would still add the same support for types as we have for class so we could do a MONO_TYPE_DEF_PRIVATE under ENABLED_CHECKED_BUILD_PRIVATE_TYPES to catch errors bypassing the getters on CI. We could also add support for the full check vs zero overhead wrappers as part of that and then just decide to initially opt into full checks during preview (and monitor perf hit during that time) and then have the option to move back to zero wrappers closer to release, but keep the full checks on CI builds running with the ENABLED_CHECKED_BUILD_PRIVATE_TYPES.
case MONO_TYPE_SZARRAY: | ||
{ | ||
// FIXME: Previously was handled by below ARRAY block but that is incorrect; this implementation is speculative -kg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume you should be able to handle both cases similar, except that you can't assume mono_type is MonoArrayType it could also be MonoClass, so you would just need to handle the case where you get the type, rank and mono_type_parameter differently.
Add comments above some apparent bugs
I went through and audited all the union accesses I could find and changed the ones that were blatantly and correctly guarded to I found some more bugs (mostly around SZARRAY handling) and added FIXME comments if I couldn't trivially fix them. I will probably try to fix all the FIXMEs I've found so far. |
The AOT compiler failure isn't generating usable stacks for some reason so I'll have to try and reproduce it locally in Debug configuration and hopefully get stacks there. I'm not sure if the aot compiler lacking stack traces when it crashes is intentional, I'm surprised by it. EDIT: Lesson learned: If you fix a bug in |
Build linux-x64 Release AllSubsets_Mono_LLVMFULLAOT_RuntimeTests llvmfullaot and its sibling look hung, but maybe they're just normally slow, or they're very slow because of the changes in this PR? I'm not familiar with these lanes and I can't tell whether the stdout on both of them stops at 1999 by coincidence or because azdo is truncating the actual output. |
There is a class of issue where we fail to properly check the type of a MonoType before accessing the members of its data union, which can lead to type confusion or null pointer dereference bugs.
This PR adds dedicated access helpers for every member of the union that do a type check, renames the union to make direct accesses obvious, and removes almost all direct uses of the union members.
The process of adding the access helpers and auditing related code revealed multiple bugs; I've either fixed those bugs in this PR or added FIXME comments describing the bug.
For accesses that were obviously correct (I could see a type check right above the accessor call) I used the
_unchecked
variant to optimize out the type check.