Serialization for Kokkos containers residing in host-inaccessible memory space with limited cost #196
Comments
Serializing to a buffer inherently doubles the footprint of the active data, because the serialization buffer holds a second copy. Doing better than that requires some form of direct transfer (e.g., RDMA, GPUDirect MPI, etc.). Any approach based on mirror view construction implies a footprint of up to triple the active data: the original, the host mirror, and the serialization buffer. An incremental / streaming bounce buffer can offer Since we effectively assume that the underlying type is byte-copyable through use of |
Jonathan mentioned an increment-offset-and-get-pointer method on the serializer that should be good for the in-memory buffer use case. We'll pass it the size of the |
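To make the increment-offset-and-get-pointer idea concrete, here is a minimal sketch of what such a method could look like on an in-memory serializer. The class name `BufferSerializer` and the method name `reserve` are hypothetical and chosen only for illustration; they are not taken from the library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical in-memory serializer (names are illustrative only).
class BufferSerializer {
public:
  explicit BufferSerializer(std::size_t capacity)
      : buffer_(capacity), offset_(0) {}

  // Reserve n_bytes at the current position, advance the offset past them,
  // and return a pointer to the start of the reserved region.
  // Returns nullptr if the request does not fit in the remaining space.
  char* reserve(std::size_t n_bytes) {
    if (n_bytes > buffer_.size() - offset_) return nullptr;
    char* spot = buffer_.data() + offset_;
    offset_ += n_bytes;
    return spot;
  }

  std::size_t offset() const { return offset_; }
  const char* data() const { return buffer_.data(); }

private:
  std::vector<char> buffer_;
  std::size_t offset_;
};
```

With an interface along these lines, the caller could copy device data straight into the returned region (for instance by wrapping the pointer in an unmanaged host view), so no separate host mirror would be needed for the in-memory case.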
Application codes are moving away from allocating data in memory that's accessible to both host and device code to avoid performance overhead and pitfalls. We still need to be able to serialize instances of device-space containers for checkpoint/restart and messaging/communication.
This will be a concern for any view that doesn't satisfy this predicate:
The most expedient implementation would be to use
`auto host_view = Kokkos::create_mirror_view(view_to_serialize);`
with appropriate copies (or `create_mirror_view_and_copy`). The problem with this is that it may allocate a lot of memory and move a lot of data synchronously, all at once, if `view_to_serialize` is large. We may need to do that as a stop-gap measure anyway, to guarantee functionality.

A more thorough implementation would create and copy through limited-size bounce buffers in host memory to limit the added memory footprint. To ensure good performance, it would use a streaming approach with an `exec_space` argument to `Kokkos::deep_copy`, so that parts of the view's contents can be serialized while other parts are being copied.
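The bounce-buffer idea can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: the function name `serialize_chunked` is hypothetical, `std::memcpy` stands in for a device-to-host `Kokkos::deep_copy` of one chunk, and appending to a `std::vector<char>` stands in for handing the chunk to the serializer. The sketch is single-buffered and synchronous; the streaming variant described above would presumably alternate between two bounce buffers and issue the copies asynchronously on an execution space instance.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <vector>

// Copy n_bytes from src into out through a bounce buffer holding at most
// bounce_bytes, so the extra host footprint is bounded by bounce_bytes.
// Returns the number of chunks transferred (0 if bounce_bytes is 0).
inline std::size_t serialize_chunked(const char* src, std::size_t n_bytes,
                                     std::size_t bounce_bytes,
                                     std::vector<char>& out) {
  if (bounce_bytes == 0) return 0;  // a zero-size bounce buffer cannot make progress
  std::vector<char> bounce(std::min(bounce_bytes, n_bytes));
  std::size_t chunks = 0;
  for (std::size_t done = 0; done < n_bytes; done += bounce.size()) {
    const std::size_t len = std::min(bounce.size(), n_bytes - done);
    // Stand-in for a device-to-host deep_copy of one chunk.
    std::memcpy(bounce.data(), src + done, len);
    // Stand-in for passing the chunk to the serializer.
    out.insert(out.end(), bounce.data(), bounce.data() + len);
    ++chunks;
  }
  return chunks;
}
```

The point of the design is that the added host memory is capped by the bounce buffer size rather than growing with the size of the view being serialized.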