-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement separate create process syscall in kernel #4918
base: main
Are you sure you want to change the base?
Conversation
@conradgrobler @andrisaar this is an in-progress PR for concurrency on the restricted kernel. There's now two syscalls. One to create a process from an elf file (without switching to it), and a second one to switch to a process. I've been stuck at this counter-intuitive point for a bit, and could help some use debugging. Basically this sequence of syscalls works fine when creating the orchestrator, but then fails when the orchestrator creates a new app. The syscall to do so execs fine, but a crash occurs when returning to application space. Logs
|
Page tables appear to get corrupted smh. Here's what we get when logging the physical addr that is mapped to the failing pointer:
|
Why are there some many PML4 addresses? I would expect one for the kernel and one for each process, but there seems to be at least 4 different ones. Also, I would expect the kernel pml4 address to be in kernel space rather than user space. |
c8b89e8
to
716a2d3
Compare
pub struct Process { | ||
pml4: x86_64::structures::paging::PageTable, | ||
pml4_frame: PhysFrame, | ||
entry: VirtAddr, | ||
} | ||
|
||
impl Process { | ||
/// Creates a process from the application, without executing it. | ||
/// Creates a process from the application, without executing it. Returns | ||
/// the PID of the new process. | ||
/// | ||
/// # Safety | ||
/// | ||
/// The application must be built from a valid ELF file representing an Oak | ||
/// Restricted Application. | ||
pub unsafe fn from_application(application: &Application) -> Result<Self, anyhow::Error> { | ||
pub unsafe fn from_application(application: &Application) -> Result<usize, anyhow::Error> { | ||
let pml4 = crate::BASE_L4_PAGE_TABLE.get().context("base l4 table should be set")?.clone(); | ||
let pml4_frame: PhysFrame = identify_pml4_frame(&pml4)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Storing the pml4_frame (instead of the rust struct it holds) in Process is the change to that brought us from failing as soon as the new application is being executed, to failing only once it attempts to allocate memory.
To recap, the prior logs had a confusing number of pml4 physical addresses:
# kernel loads the pml4 that is created as a process is created from an ELF file
[2024-03-19T20:55:26Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: replacing current pml4 with: 0x0000000001075000"
# kernel switches back to the initial pml4, at the end of this function
[2024-03-19T20:55:26Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: replacing current pml4 with: 0x0000000001200000"
[2024-03-19T20:55:26Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: Created process (pid: 0)"
[2024-03-19T20:55:26Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: Executing process (pid: 0)"
# Here the kernel again loaded the pml4 created in the first step, now to execute the application. Somehow this now is a different address. One would expect it to be 0x0000000001075000
[2024-03-19T20:55:26Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: replacing current pml4 with: 0x0000000001964000"
In this state this sequence would work for loading the orchestrator, but fail when it loads the application.
The change highlighted above fixed that. Now we get the expected sequence of logs:
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Writing new pml4 to Cr3: PhysFrame[4KiB](0x1075000)"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Writing new pml4 to Cr3: PhysFrame[4KiB](0x1200000)"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: Created process (pid: 0)"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: Executing process (pid: 0)"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Writing new pml4 to Cr3: PhysFrame[4KiB](0x1075000)"
It's not clear to me exactly why this led to different behavior. And either way, we still crash as soon as the enclave application attempts to allocate, as seen in the full logs:
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] executing: Command { std: "/nix/store/hb3zh2dk3rxpsdy51ik52iwprfirsr7y-qemu-host-cpu-only-8.2.2/bin/qemu-system-x86_64" "-enable-kvm" "-cpu" "IvyBridge-IBRS,enforce" "-m" "8G" "-nodefaults" "-nographic" "-no-reboot" "-machine" "microvm,acpi=on" "-chardev" "socket,id=consock,fd=9" "-serial" "chardev:consock" "-chardev" "socket,id=commsock,fd=11" "-device" "virtio-serial-device,max_ports=1" "-device" "virtconsole,chardev=commsock" "-bios" "stage0_bin/target/x86_64-unknown-none/release/stage0_bin" "-kernel" "oak_restricted_kernel_wrapper/target/x86_64-unknown-none/release/oak_restricted_kernel_wrapper_bin" "-initrd" "enclave_apps/target/x86_64-unknown-none/release/oak_orchestrator", kill_on_drop: false }
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 INFO: starting..."
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 INFO: Enabled SEV features: SevStatus(0x0)"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: early E820 entry: [0x0000000000000000-0x0000000000080000), len 524288, type RAM"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: early E820 entry: [0x0000000000080000-0x00000000000a0000), len 131072, type ACPI"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: early E820 entry: [0x0000000000100000-0x00000000c0000000), len 3220176896, type RAM"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: early E820 entry: [0x00000000feffc000-0x00000000ff000000), len 16384, type RESERVED"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: early E820 entry: [0x0000000100000000-0x0000000240000000), len 5368709120, type RAM"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: Kernel cmdline: "
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: Kernel image size 3297440"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: Kernel image start address 0x0000000002000000"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: Kernel entry point 0x0000000002000200"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 INFO: Using x2APIC for AP initialization."
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: Local APIC: ProcessorLocalApic { header: ControllerHeader { structure_type: 0, len: 8 }, processor_uid: 0, apic_id: 0, flags: LocalApicFlags(ENABLED) }"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: uninteresting structure: 1"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: uninteresting structure: 1"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: uninteresting structure: 4"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 INFO: Expected number of APs: 0, started number of APs: 0"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: Initial RAM disk size 1434256"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: Initial RAM disk address 0x000000003fea1000"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: Kernel image digest: sha2-256:a5d5f252ae19644ca2aa0c7bc5acdf99d2d9d134ee21f150380d0d6bd1bfe4c7"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: Kernel setup data digest: sha2-256:ebd71745942f79eec4687dd1c2773363cefa00c42d1578040e1150bfeb8ff3e1"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: Kernel command-line: "
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: Initial RAM disk digest: sha2-256:e8491f147b64557b1d57a4edfbb2a1036f64e0e30c88977a4280978299ee90a0"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: ACPI table generation digest: sha2-256:8460c6eaf5f2facb20bbb3ca639d9b5970f1e224f344b1dd3007e7aac34b80c1"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 DEBUG: E820 table digest: sha2-256:7272421a29871fe5e8420f86ac4bdb7da92e5527d6bf5f1ba61b75a4280b3960"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "stage0 INFO: jumping to kernel at 0x0000000002000200"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Logging initialised."
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Kernel boot args: --oak-dice=0x16000"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Boot protocol: Linux Boot Protocol"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: E820 entry: [0x0000000000000000..0x0000000000016000) (90112), type RAM"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: E820 entry: [0x0000000000016000..0x0000000000017000) (4096), type RESERVED"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: E820 entry: [0x0000000000017000..0x0000000000080000) (430080), type RAM"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: E820 entry: [0x0000000000080000..0x00000000000a0000) (131072), type ACPI"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: E820 entry: [0x0000000000100000..0x00000000c0000000) (3220176896), type RAM"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: E820 entry: [0x00000000feffc000..0x00000000ff000000) (16384), type RESERVED"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: E820 entry: [0x0000000100000000..0x0000000240000000) (5368709120), type RAM"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: marking [0x0000000000200000..0x0000000000400000) as reserved"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: marking [0x0000000000200000..0x0000000000400000) as reserved"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: marking [0x0000000000400000..0x0000000000600000) as reserved"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: marking [0x0000000000600000..0x0000000000800000) as reserved"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: marking [0x0000000000800000..0x0000000000a00000) as reserved"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: marking [0x0000000000a00000..0x0000000001200000) as reserved"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: marking [0x000000003fe00000..0x0000000040000000) as reserved (ramdisk)"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Writing new pml4 to Cr3: PhysFrame[4KiB](0x1200000)"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Marking [0xffff880001400000..0xffff880001800000) for guest-host communication."
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: Extending kernel heap to [0xffffc90000000000..0xffffc90000200000)."
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: ACPI device: \\_SB_.COM1 PNP0501 16550A-compatible COM Serial Port"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: IO port: IOPortDescriptor { decodes_full_address: true, memory_range: (1016, 1016), base_alignment: 0, range_length: 8 }"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: IRQ: 16"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: ACPI device: \\_SB_.FWCF QEMU0002 QEMU fw_cfg Device"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: IO port: IOPortDescriptor { decodes_full_address: true, memory_range: (1296, 1296), base_alignment: 1, range_length: 12 }"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: ACPI device: \\_SB_.GED_ ACPI0013 ACPI Generic Event Device"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: IRQ: 9"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: ACPI device: \\_SB_.PWRB PNP0C0C Power Button Device"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: ACPI device: \\_SB_.VR23 LNRO0005 Memory-mapped virtio device bus"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Memory range: [0x00000000feb02e00..0x00000000feb03000)"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: IRQ: 47"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Using virtio console over MMIO; ACPI device name \\_SB_.VR23, vendor ID: 554d4551"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: Device features: Features(EMERG_WRITE | NOTIFY_ON_EMPTY | ANY_LAYOUT | RING_INDIRECT_DESC | RING_EVENT_IDX)"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Copying application from ramdisk..."
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: marking [0x000000003fe00000..0x0000000040000000) as available"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Binary loaded, size: 1434256"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Writing new pml4 to Cr3: PhysFrame[4KiB](0x1075000)"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Writing new pml4 to Cr3: PhysFrame[4KiB](0x1200000)"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: Created process (pid: 0)"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: Executing process (pid: 0)"
[2024-03-25T19:08:24Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Writing new pml4 to Cr3: PhysFrame[4KiB](0x1075000)"
[2024-03-25T19:08:25Z INFO oak_launcher_utils::launcher] console: "orchestrator INFO: Binary loaded, size: 1020272"
[2024-03-25T19:08:25Z INFO oak_launcher_utils::launcher] console: "orchestrator INFO: Application digest (sha2-256): 90af89d0dcd5cd971b942b4743422fbe6ea3ee84a8b8cbe676ee6fb9a6df6c40"
[2024-03-25T19:08:25Z INFO oak_launcher_utils::launcher] connecting to guest instance
[2024-03-25T19:08:25Z INFO oak_launcher_utils::launcher] waiting for guest instance to terminate
[2024-03-25T19:08:25Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: Extending kernel heap to [0xffffc90000000000..0xffffc90000400000)."
[2024-03-25T19:08:25Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Writing new pml4 to Cr3: PhysFrame[4KiB](0x1ffd000)"
[2024-03-25T19:08:25Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Writing new pml4 to Cr3: PhysFrame[4KiB](0x1075000)"
[2024-03-25T19:08:25Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: Created process (pid: 1)"
[2024-03-25T19:08:25Z INFO oak_launcher_utils::launcher] console: "orchestrator INFO: created application with pid: 1"
[2024-03-25T19:08:25Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: Switching to a different process (pid: 1)"
[2024-03-25T19:08:25Z INFO oak_launcher_utils::launcher] console: "kernel DEBUG: Executing process (pid: 1)"
[2024-03-25T19:08:25Z INFO oak_launcher_utils::launcher] console: "kernel INFO: Writing new pml4 to Cr3: PhysFrame[4KiB](0x1ffd000)"
[2024-03-25T19:08:25Z INFO oak_launcher_utils::launcher] console: "INFO: In main!"
[2024-03-25T19:08:25Z INFO oak_launcher_utils::launcher] console: "INFO: about to allocate bytes"
[2024-03-25T19:08:25Z ERROR oak_restricted_kernel_launcher] Unexpected VMM exit, status: Ok(ExitStatus(unix_wait_status(0)))
@conradgrobler @andrisaar can you help debug? I'm pretty much at my wits end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Command to obtain the logs is
just stage0_bin oak_restricted_kernel_wrapper oak_orchestrator oak_echo_raw_enclave_app && RUST_LOG=DEBUG cargo run --package=oak_restricted_kernel_launcher -- --kernel=oak_restricted_kernel_wrapper/target/x86_64-unknown-none/release/oak_restricted_kernel_wrapper_bin --vmm-binary=$(which qemu-system-x86_64) --memory-size=8G --bios-binary=stage0_bin/target/x86_64-unknown-none/release/stage0_bin --initrd=enclave_apps/target/x86_64-unknown-none/release/oak_orchestrator --app-binary=enclave_apps/target/x86_64-unknown-none/release/oak_echo_raw_enclave_app
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also backported just this change onto main, where it results in the same behavior: panic upon allocating in the enclave application (though strangely not in the orchestrator)
No description provided.