Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
feat: run VortexFileArrayStream on dedicated IoDispatcher (#1232)
This PR separates IO for the `LayoutBatchStream` from the calling thread's runtime. This is beneficial for a few reasons 1. Now decoding into Arrow will not block the executor where IO occurs, which previously could lead to timeouts 2. It allows us to use non-Tokio runtimes such as compio which uses io_uring To enable (2), we revamp the `VortexReadAt ` trait to return `!Send` futures that can be launched on thread-per-core runtimes. We also enforce the `VortexReadAt` implementers are clonable, again so they can be shared across several concurrent futures. We implement a clonable `TokioFile` type, and un-implement for things like `Vec<u8>` and `&[u8]` that don't have cheap cloning. Although read operations are now `!Send`, it is preferable for`LayoutBatchStream` to be `Send` so that it can be polled from Tokio or any other runtime. To achieve this, we add an `IoDispatcher`, which sits in front of either a tokio or compio executor to dispatch read requests. A couple of things to consider: * **Cancellation safety**: On Drop, the dispatcher will join its worker threads, gracefully awaiting any already-submitted tasks before it completes. It's probably more preferable to find a way to force an immediate shutdown, so we may want to ferry a `shutdown_now` channel and `select!` over both of them inside of the inner loop of the dispatcher * **Async runtime safety**: The way you can implement cross-runtime IO is that you provide both a reader (`VortexReadAt`) and a `IoDispatcher` to the `LayoutBatchStreamBuilder`. There is nothing at compile time that enforces that the reader and the dispatcher are compatible. Perhaps we should parameterize `VortexReadAt` on a runtime and then we can do type-level checks to ensure that all IO operations can only be paired with the right runtime. That would likely inflate the PR a good bit
- Loading branch information