"unexec/pdump" - VM memory serialization and loading #23
Comments
Thanks for writing this up. I never fully understood how pdumper works. It doesn't sound like something you could implement with serde; it sounds more like taking a snapshot of the heap? Taking a snapshot of the heap seems easy enough, but how would you load that back into the runtime? You can't just mark the image as mutable, because then it would not be reusable. Do you have to copy all the objects from the image and update all the pointers?
Is the dump primarily to speed up Emacs startup, or is it to make it easier to start a new thread? Currently all threads share functions, but I could see an alternative where functions are thread local and each thread loads an image instead.
pdumper is more of a snapshot into the heap. From pdumper.c:
My initial thought would be something a little slower, but more portable: a two-pass solution that would look something like this: Serialize:
Deserialize:
Emacs itself has a reference to this kind of pattern in pdumper.c in a TODO:
The "two-pass" solution that I proposed above gives us a portable dump that is not coupled to the specific VM that we dumped from, and we can use serde to implement it. We can even dump to a human-readable format for debugging.
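To make the pointer-free idea concrete, here is a minimal sketch of one way a two-pass dump could look. It is not the scheme from this thread verbatim, and it is not how pdumper works. The `Obj`, `Record`, `assign_ids`, `dump`, and `load` names are hypothetical, the heap is modelled with plain `Rc` values, and only the standard library is used. In a serde-based version, `Record` is the kind of type that would derive `Serialize`/`Deserialize`.

```rust
use std::collections::HashMap;
use std::rc::Rc;

// Hypothetical in-memory object model: objects point at each other via Rc.
#[derive(Debug, PartialEq)]
enum Obj {
    Int(i64),
    Str(String),
    Cons(Rc<Obj>, Rc<Obj>),
}

// Hypothetical on-disk form: pointers are replaced by indices into the dump.
#[derive(Debug, Clone, PartialEq)]
enum Record {
    Int(i64),
    Str(String),
    Cons(usize, usize),
}

// Pass 1: give every reachable object an index, children before parents.
// Objects are identified by address so shared objects get a single index.
fn assign_ids(
    obj: &Rc<Obj>,
    ids: &mut HashMap<*const Obj, usize>,
    order: &mut Vec<Rc<Obj>>,
) {
    let key = Rc::as_ptr(obj);
    if ids.contains_key(&key) {
        return;
    }
    if let Obj::Cons(car, cdr) = &**obj {
        assign_ids(car, ids, order);
        assign_ids(cdr, ids, order);
    }
    ids.insert(key, order.len());
    order.push(Rc::clone(obj));
}

// Pass 2: emit one record per object, rewriting pointers to indices.
// Returns the records plus the index of the root object.
fn dump(root: &Rc<Obj>) -> (Vec<Record>, usize) {
    let mut ids = HashMap::new();
    let mut order = Vec::new();
    assign_ids(root, &mut ids, &mut order);
    let records: Vec<Record> = order
        .iter()
        .map(|o| match &**o {
            Obj::Int(n) => Record::Int(*n),
            Obj::Str(s) => Record::Str(s.clone()),
            Obj::Cons(a, d) => Record::Cons(ids[&Rc::as_ptr(a)], ids[&Rc::as_ptr(d)]),
        })
        .collect();
    let root_idx = ids[&Rc::as_ptr(root)];
    (records, root_idx)
}

// Loading: rebuild fresh objects and turn indices back into pointers.
// Because children were emitted before parents, one forward pass suffices.
fn load(records: &[Record], root: usize) -> Rc<Obj> {
    let mut objs: Vec<Rc<Obj>> = Vec::with_capacity(records.len());
    for r in records {
        let o = match r {
            Record::Int(n) => Obj::Int(*n),
            Record::Str(s) => Obj::Str(s.clone()),
            Record::Cons(a, d) => Obj::Cons(Rc::clone(&objs[*a]), Rc::clone(&objs[*d])),
        };
        objs.push(Rc::new(o));
    }
    Rc::clone(&objs[root])
}
```

Calling `dump(&root)` and then `load(&records, root_idx)` yields a structurally equal graph in freshly allocated objects, with sharing preserved. A heap that allows cycles could not rely on the children-first ordering and would need a separate patch-up pass on load.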
In Emacs, the dump is to improve startup times. The way I used threading in my previous post was incorrect. I was alluding to a scheme more like V8's Isolates, which allow separate instances of the VM to run in the same process. In that context, we would use the dump to seed a
Executive Summary: I propose that it would be worthwhile to have Rune dump its serialized state into a binary file that could be reloaded later to cut down on load times. The intended usage is that I evaluate a large amount of elisp, dump the state to a file, and then start my VM from that dumped state. Creating this binary file may require a special mode when creating the VM (depending on implementation), but loading the file would not require any special mode. Loading the file would be done at VM initialization and would not be expected to be done "mid run".
For a long time, part of the Emacs build process was its famous "unexec" flow, where you would load a minimal version of Emacs, evaluate a large amount of elisp, and, if I recall correctly, then dump part of your process heap into a binary that would be loaded into Emacs's BSS memory area. Eventually Emacs replaced unexec with the portable dumper (https://github.com/emacs-mirror/emacs/blob/master/src/pdumper.h), which isn't as fast but is much more maintainable.
V8 (Google's JavaScript engine) also has somewhat similar functionality for its Isolates - this is how Deno is able to load the TypeScript interpreter so quickly. They actually load the interpreter in V8 with their hooks during build time, and dump the binary state that is loaded at run time.
Advantages are a notable speedup for targeted applications that load a large amount of elisp. The downside is complexity, but I think with Rust's great serialization libraries and support, this could be done with moderate effort.
A step further (and more similar to V8) is that instead of seeding the entire VM with this file, we can seed a thread with this file containing binary state, so that I can have separate threads loaded up very quickly with pre-seeded memory content and minimal overhead.
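As a rough illustration of per-thread seeding, the sketch below shares one read-only image between threads and has each thread build its own private VM state from it. Everything here is hypothetical: `Image`, `Vm`, `Vm::from_image`, and `spawn_seeded` are invented for the example, and the "image" is just a list of name/value pairs standing in for real dumped state.

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for a loaded dump: immutable once created.
type Image = Vec<(String, i64)>;

// Hypothetical per-thread VM state, owned by exactly one thread.
#[derive(Debug, Clone, PartialEq)]
struct Vm {
    globals: Vec<(String, i64)>,
}

impl Vm {
    // Copy the image contents into fresh, thread-local state.
    // The image itself is never mutated, so it stays reusable.
    fn from_image(image: &Image) -> Vm {
        Vm { globals: image.clone() }
    }
}

// Spawn `n` threads, each seeded from the same shared image.
// Every thread then mutates only its own copy.
fn spawn_seeded(image: Arc<Image>, n: usize) -> Vec<Vm> {
    let handles: Vec<thread::JoinHandle<Vm>> = (0..n)
        .map(|i| {
            let image = Arc::clone(&image);
            thread::spawn(move || {
                let mut vm = Vm::from_image(&image);
                vm.globals.push(("thread-id".to_string(), i as i64));
                vm
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

The design choice being illustrated is copy-on-load: the image is shared behind an `Arc` and never written to, while each thread pays for one copy at startup. Whether a real implementation would copy eagerly like this or map the image lazily is left open.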