HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION #1101
Comments
We've seen this error in AMReX codes due to a HIP compiler bug (e.g., AMReX-Astro/Microphysics#1386 (comment)). Adding
Which compiler are you using?
I think I've used only
However, we only saw this problem for very large kernels (e.g., with reaction networks), so it may not be related.
I've also tried the ROCm address sanitizer (https://rocm.docs.amd.com/en/latest/conceptual/using-gpu-sanitizer.html#compiling-for-address-sanitizer) to debug these memory errors. This sometimes worked, but it also produces some false positives with global variables.
I now tried the WarpX recommendations, i.e.,
Still the same issue.
Ah, well, never mind :/
Does running with
On Frontier I see many instances of the following error:
:0:rocdevice.cpp :2660: 556940992572 us: 32834: [tid:0x7f9e41945700] Device::callbackQueue aborting with error : HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION: The agent attempted to access memory beyond the largest legal address. code: 0x29
when running the following input file
and current develop (b28c738).
Changing
to
shows no issues.
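Since the error shows up many times per run, it can help to tally the aborts by status and code when comparing runs. Below is a minimal sketch, not part of any ROCm or AMReX tooling: the regular expression is inferred only from the single log line quoted above, the helper name `summarize_aborts` is hypothetical, and other ROCm versions may format the message differently.

```python
import re
from collections import Counter

# Pattern inferred from the one log line quoted in this issue (an assumption,
# not a documented ROCm log format).
ABORT_RE = re.compile(
    r"\[tid:(?P<tid>0x[0-9a-fA-F]+)\].*?aborting with error\s*:\s*"
    r"(?P<status>HSA_STATUS_[A-Z_]+):.*?code:\s*(?P<code>0x[0-9a-fA-F]+)"
)


def summarize_aborts(lines):
    """Count abort messages, keyed by (HSA status name, numeric code)."""
    counts = Counter()
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            key = (match.group("status"), int(match.group("code"), 16))
            counts[key] += 1
    return counts
```

For example, `summarize_aborts(open("run.log"))` (with `run.log` as a placeholder for the job's captured stderr) returns a `Counter` that can be compared between the failing and the working configuration.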