"unexec/pdump" - VM memory serialization and loading #23

Open
DavidDeSimone opened this issue Jan 18, 2023 · 3 comments
Labels
design needed Items where more design help is needed

Comments

@DavidDeSimone

DavidDeSimone commented Jan 18, 2023

Executive Summary: I propose that it would be worthwhile to have Rune dump its serialized state into a binary file that can be reloaded later to cut down on load times. The usage would be that I evaluate a large amount of elisp, dump to a file, and then start my VM from that dumped state. Creating this binary file may require a special mode when creating the VM (depending on implementation), but loading the file would not. Loading would be done at VM initialization and would not be expected to happen "mid run".

For a long time, part of the Emacs build process was its famous "unexec" flow, where you would load a minimal version of Emacs, evaluate a large amount of elisp, and then, if I recall correctly, dump part of your process heap into a binary that would be loaded into the Emacs BSS memory area. Eventually Emacs replaced unexec with the portable dumper (https://github.com/emacs-mirror/emacs/blob/master/src/pdumper.h), which isn't as fast but is much more maintainable.

V8 (Google's JavaScript engine) has somewhat similar functionality for its Isolates; this is how Deno is able to load the TypeScript compiler so quickly. They load it in V8 with their hooks at build time and dump the binary state, which is then loaded at run time.

The advantage is a notable speedup for targeted applications that load a large amount of elisp. The downside is complexity, but I think with Rust's great serialization libraries and support, this could be done with moderate effort.

A step further (and more similar to V8) would be that instead of seeding the entire VM with this file, we seed a thread with the binary state it contains, so that I can bring up separate threads very quickly with pre-seeded memory contents and minimal overhead.

@CeleritasCelery
Owner

Thanks for writing this up. I never fully understood how pdumper works. It doesn't sound like something you could implement with serde; is it more like taking a snapshot of the heap? Taking a snapshot of the heap seems easy enough, but how would you load that back into the runtime? You can't just mark the image as mutable, because then it would not be reusable. Do you have to copy all the objects from the image and update all the pointers?

> A step further (and more similar to v8) is that instead of seeding the entire VM with this file, we can seed a thread with this file containing binary state, so that I can have separate threads loaded up very quickly with pre-seeded memory content with minimal overhead.

Is the dump primarily to speed up Emacs startup, or is it to make it easier to start a new thread? Currently all threads share functions, but I could see an alternative where functions are thread-local and each thread loads an image instead.

@DavidDeSimone
Author

DavidDeSimone commented Jan 18, 2023

pdumper is more of a snapshot into the heap. From pdumper.c:

/* Format of an Emacs dump file.  All offsets are relative to
   the beginning of the file.  An Emacs dump file is coupled
   to exactly the Emacs binary that produced it, so details of
   alignment and endianness are unimportant.
   An Emacs dump file contains the contents of the Lisp heap.
   On startup, Emacs can start faster by mapping a dump file into
   memory and using the objects contained inside it instead of
   performing initialization from scratch.
   The dump file can be loaded at arbitrary locations in memory, so it
   includes a table of relocations that let Emacs adjust the pointers
   embedded in the dump file to account for the location where it was
   actually loaded.
   Dump files can contain pointers to other objects in the dump file
   or to parts of the Emacs binary.  */
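
To make the relocation idea concrete, here is a minimal sketch in Rust. It is not how pdumper lays out its file: the `Dump` struct, its `words` and `relocations` fields, and `apply_relocations` are all hypothetical names, and the sketch assumes every pointer is stored in the dump as an offset from the start of the file.

```rust
/// Hypothetical dump layout (an assumption for illustration, not pdumper's
/// real format): a flat array of machine words plus a relocation table.
/// Each relocation entry is the index of a word that holds an offset from
/// the start of the dump rather than plain data.
struct Dump {
    words: Vec<usize>,
    relocations: Vec<usize>,
}

/// Turn every relocated word from "offset within the dump" into an
/// absolute address, given the base address the dump was loaded at.
/// Words that are not listed in the relocation table are left untouched.
fn apply_relocations(dump: &mut Dump, base: usize) {
    for &index in &dump.relocations {
        dump.words[index] += base;
    }
}
```

The load step is then a single pass over the relocation table, which is why this approach can be fast. The cost is that the loaded image is patched for one base address, so it is tied to the process that mapped it.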

My initial thought would be something a little slower, but more portable: a two-pass solution that would look something like this:

Serialize:

  1. All gc objects get resolved to a universal reference (possibly by guid)
  2. We serialize the object graph replacing pointers by their assigned guids

Deserialize:

  1. All gc objects are deserialized, resolving guids in a recursive manner.

Emacs itself has a reference to this kind of pattern in pdumper.c in a TODO:

/*
  TODO:
  - Two-pass dumping: first assemble object list, then write all.
    This way, we can perform arbitrary reordering or maybe use fancy
    graph algorithms to get better locality.
  - Don't emit relocations that happen to set Emacs memory locations
    to values they will already have.
  - Nullify frame_and_buffer_state.
  - Preferred base address for relocation-free non-PIC startup.
  - Compressed dump support.

The "two-pass" solution that I proposed above gives us a portable dump that is not coupled to the specific VM binary that produced it, and we can use serde to implement it. We could even dump this kind of scheme to a human-readable format for debugging.
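
The two-pass scheme described above could be sketched roughly as follows. This is only an illustration under several assumptions: `Object`, `Record`, `assign_ids`, `serialize`, and `deserialize` are hypothetical names rather than Rune types, plain indices stand in for guids, the graph is acyclic (it is built from immutable `Rc` values), and the sketch uses only the standard library instead of serde.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical in-memory object; real VM objects would be richer.
#[derive(Debug, PartialEq)]
enum Object {
    Int(i64),
    Str(String),
    Cons(Rc<Object>, Rc<Object>),
}

/// Serialized form: pointers are replaced by ids (indices into the dump).
#[derive(Debug, Clone, PartialEq)]
enum Record {
    Int(i64),
    Str(String),
    Cons(usize, usize),
}

/// Pass 1: give every distinct object (keyed by address) an id.
/// The walk is post-order, so children always get smaller ids than parents.
fn assign_ids(
    obj: &Rc<Object>,
    ids: &mut HashMap<*const Object, usize>,
    order: &mut Vec<Rc<Object>>,
) {
    let key = Rc::as_ptr(obj);
    if ids.contains_key(&key) {
        return; // shared object, already numbered
    }
    if let Object::Cons(car, cdr) = &**obj {
        assign_ids(car, ids, order);
        assign_ids(cdr, ids, order);
    }
    ids.insert(key, order.len());
    order.push(Rc::clone(obj));
}

/// Pass 2: write each object out, replacing pointers by their ids.
fn serialize(root: &Rc<Object>) -> Vec<Record> {
    let mut ids = HashMap::new();
    let mut order = Vec::new();
    assign_ids(root, &mut ids, &mut order);
    order
        .iter()
        .map(|obj| match &**obj {
            Object::Int(n) => Record::Int(*n),
            Object::Str(s) => Record::Str(s.clone()),
            Object::Cons(car, cdr) => {
                Record::Cons(ids[&Rc::as_ptr(car)], ids[&Rc::as_ptr(cdr)])
            }
        })
        .collect()
}

/// Rebuild the graph. Because children precede parents in the dump, every
/// id refers to an object that has already been rebuilt. The root is the
/// last record. Returns None for an empty or malformed dump.
fn deserialize(records: &[Record]) -> Option<Rc<Object>> {
    let mut objects: Vec<Rc<Object>> = Vec::with_capacity(records.len());
    for record in records {
        let obj = match record {
            Record::Int(n) => Object::Int(*n),
            Record::Str(s) => Object::Str(s.clone()),
            Record::Cons(car, cdr) => Object::Cons(
                Rc::clone(objects.get(*car)?),
                Rc::clone(objects.get(*cdr)?),
            ),
        };
        objects.push(Rc::new(obj));
    }
    objects.pop()
}
```

Because the ids are assigned in post-order, the recursive resolution on load collapses into one forward loop, and an object referenced from two places is still a single shared object after loading. Printing the records with `{:?}` gives a crude human-readable view of the dump. A graph with cycles would need a different load strategy, such as allocating placeholders first and patching them afterwards.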

@DavidDeSimone
Author

DavidDeSimone commented Jan 18, 2023

> Is the dump primarily to speed up Emacs startup, or is it to make it easier to start a new thread? currently all threads share functions, but I could see an alternative where functions are thread local and each thread loads an image instead.

In Emacs, the dump is to improve startup times.

The way I used "threading" in my previous post was incorrect. I was alluding to a scheme more like V8's Isolates, which allow separate instances of the VM to run in the same process. In that context, we would use the dump to seed a thread, which would be an isolated instance of the VM. I am working on another post to discuss that approach to threading, but I got a little ahead of myself.
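
As a rough illustration of seeding isolated VM instances from one dump, here is a small Rust sketch. Everything in it is an assumption made for the example: `Image`, `Isolate`, `from_image`, and `run_isolates` are hypothetical names, and the "state" is just a list of integer globals. The image stays immutable and shared, and each thread copies it into private state, so the image remains reusable.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

/// Hypothetical immutable image: global bindings captured at dump time.
struct Image {
    globals: Vec<(String, i64)>,
}

/// Hypothetical isolate: owns a private, mutable copy of the image state.
struct Isolate {
    globals: HashMap<String, i64>,
}

impl Isolate {
    /// Seed a fresh isolate by copying the image; the image is not modified.
    fn from_image(image: &Image) -> Self {
        Isolate {
            globals: image.globals.iter().cloned().collect(),
        }
    }
}

/// Start `count` threads, each with its own isolate seeded from the same
/// shared image, and return each isolate's final "counter" value.
fn run_isolates(image: Arc<Image>, count: usize) -> Vec<i64> {
    let handles: Vec<_> = (0..count)
        .map(|i| {
            let image = Arc::clone(&image);
            thread::spawn(move || {
                let mut isolate = Isolate::from_image(&image);
                // This mutation is local to the isolate; neither the shared
                // image nor the other isolates can observe it.
                *isolate.globals.entry("counter".to_string()).or_insert(0) += i as i64;
                isolate.globals["counter"]
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("isolate thread panicked"))
        .collect()
}
```

Copying on load is the simple option; a real implementation might map the image and copy only what an isolate writes to, but that is beyond this sketch.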

@CeleritasCelery CeleritasCelery added the design needed Items where more design help is needed label Jan 21, 2023