Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasm2c: Add experimental no-sandbox option #2084

Closed
wants to merge 1 commit into from

Conversation

dimitry-
Copy link
Contributor

@dimitry- dimitry- commented Dec 1, 2022

No description provided.

@sbc100
Copy link
Member

sbc100 commented Dec 1, 2022

Before we land anything like this can we land some documentation about no sandbox mode and why its useful to you? (we can add those docs are part of this initial CL if you like).

wasm2c/README.md Outdated Show resolved Hide resolved
wasm2c/README.md Outdated Show resolved Hide resolved
wasm2c/README.md Outdated Show resolved Hide resolved
src/tools/wasm2c.cc Outdated Show resolved Hide resolved
src/tools/wasm2c.cc Outdated Show resolved Hide resolved
@keithw
Copy link
Member

keithw commented Dec 12, 2022

Would it be possible to share a little more about where/who this is intended for, more broadly? Given the opposition at #1432 from some areas of the Wasm community, perhaps we could gate it behind a flag like --enable-unsafe or --enable-nonconforming that would be required?

BTW, what would it mean to "unify host and WebAssembly address space" in a world with multi-memory? How does a load from memory index #x translate into a native load? Is it expected that running with --no-sandbox will still run normal valid WebAssembly programs, or will the programs have to be written completely differently if memory indexes are relative to the host VM space instead of the base of the memory? (E.g. 0 will no longer be a valid memory address ,etc.)

@sbc100
Copy link
Member

sbc100 commented Dec 12, 2022

Would it be possible to share a little more about where/who this is intended for, more broadly? Given the opposition at #1432 from some areas of the Wasm community, perhaps we could gate it behind a flag like --enable-unsafe or --enable-nonconforming that would be required?

I'll let @dimitry elaborate on the intent here, not sure how much they can share.

But yes, I think it would likely make sense to put this behind --enable-unsafe and/or --enable-nonconforming, to those flags exist yet? (Do either of those flags exist today BTW?). I would have thought that --no-sandbox would imply both unsafe and non-conforming though, does it not imply that to you?

BTW, what would it mean to "unify host and WebAssembly address space" in a world with multi-memory? How does a load from memory index #x translate into a native load?

I think it would only apply to a first/primary/chosen memory. But I also think it would fine (initially) to simply not support multi-memory in this mode.

Is it expected that running with --no-sandbox will still run normal valid WebAssembly programs, or will the programs have to be written completely differently if memory indexes are relative to the host VM space instead of the base of the memory? (E.g. 0 will no longer be a valid memory address ,etc.)

Yes, it would crash just like native program accessing address zero.

@keithw
Copy link
Member

keithw commented Dec 13, 2022

But yes, I think it would likely make sense to put this behind --enable-unsafe and/or --enable-nonconforming, to those flags exist yet? (Do either of those flags exist today BTW?).

Not today, no...

I would have thought that --no-sandbox would imply both unsafe and non-conforming though, does it not imply that to you?

Hmm, honestly, I didn't quite realize the implications. Currently I think basically all of the flags in WABT are about adding "extra" abilities to the module -- e.g. they implement a proposal to extend the Wasm spec in a backwards-compatible way, or they relax some of the mandatory validation-time traps (the --no-check option).

A --no-sandbox option initially sounded like it was about relaxing some of the mandatory execution-time traps (which is why I brought up #1432) while preserving well-behaving programs. But maybe that wasn't the right analogy -- this sounds more like prototyping a different Wasm-like language that isn't intended to work with existing modules or producers. E.g. if the module includes an active data segment, or a malloc implementation (like wasi-libc's) that relies on the link-time __heap_base symbol, I guess that might not work in this mode because the segment's offset or __heap_base are unlikely to be within the mapped virtual address space of a typical Linux process at runtime.

@dimitry-
Copy link
Contributor Author

dimitry- commented Dec 13, 2022

Would it be possible to share a little more about where/who this is intended for, more broadly? Given the opposition at #1432 from some areas of the Wasm community, perhaps we could gate it behind a flag like --enable-unsafe or --enable-nonconforming that would be required?

I'll let @dimitry- elaborate on the intent here, not sure how much they can share.

The context here is using Wasm as a portable intermediate representation for native libraries for Android apps. So the idea is to have wasm2c in no-sandbox code generate a code (almost) as if it was generated by compiling it directly to native platform.
This is the link to FR in NDK for more context: android/ndk#1771

Hmm, honestly, I didn't quite realize the implications. Currently I think basically all of the flags in WABT are about adding "extra" abilities to the module -- e.g. they implement a proposal to extend the Wasm spec in a backwards-compatible way, or they relax some of the mandatory validation-time traps (the --no-check option).

A --no-sandbox option initially sounded like it was about relaxing some of the mandatory execution-time traps (which is why I brought up #1432) while preserving well-behaving programs. But maybe that wasn't the right analogy -- this sounds more like prototyping a different Wasm-like language that isn't intended to work with existing modules or producers. E.g. if the module includes an active data segment, or a malloc implementation (like wasi-libc's) that relies on the link-time __heap_base symbol, I guess that might not work in this mode because the segment's offset or __heap_base are unlikely to be within the mapped virtual address space of a typical Linux process at runtime.

yes - the end goal is to have modules using --no-sandbox to link directly with native libraries (libc.so instead of wasi-libc).

@sbc100
Copy link
Member

sbc100 commented Dec 13, 2022

One of the implications of --no-sandbox is the it would only work with position-independent or relocatable code.

Position independent code can be generated using wasm-ld -shared (this is used by emscripten's -sSIDE_MODULE setting). Another options which we might consider is to have wasm2c somehow process relocations found in the wasm binary when linked with wasm-ld --emit-relocs, it could make then turn these relocations into native relocations in the resulting native shared library.

@matthias-blume
Copy link

One of the implications of --no-sandbox is the it would only work with position-independent or relocatable code.

I'm actually not 100% sure why that would be the case. In my mind it works like this: We will use wasm2c with --no-sandbox to generate C code that looks as close as we can make it to what original C would have looked like. It is then up to how the final translation of that generated C code is configured to determine things like PiC and relocatability.

But I admit that I'm still too new to this space to be confident about what I just wrote.

In any case, is this ready to be merged?

@sbc100
Copy link
Member

sbc100 commented Dec 19, 2022

One of the implications of --no-sandbox is the it would only work with position-independent or relocatable code.

I'm actually not 100% sure why that would be the case. In my mind it works like this: We will use wasm2c with --no-sandbox to generate C code that looks as close as we can make it to what original C would have looked like. It is then up to how the final translation of that generated C code is configured to determine things like PiC and relocatability.

I agree its up to the final translation of that generated C to determine memory layout, but it can't do that without either being given PIC or relocatable wasm module.

If you don't build the wasm module as PIC or relocatable it would contain unannounced absolute addresses .. which obviously won't work in the general case. For example, static data would start at address 1024 which would just crash when compiled on most native platforms.

But I admit that I'm still too new to this space to be confident about what I just wrote.

In any case, is this ready to be merged?

@matthias-blume
Copy link

One of the implications of --no-sandbox is the it would only work with position-independent or relocatable code.

I'm actually not 100% sure why that would be the case. In my mind it works like this: We will use wasm2c with --no-sandbox to generate C code that looks as close as we can make it to what original C would have looked like. It is then up to how the final translation of that generated C code is configured to determine things like PiC and relocatability.

I agree its up to the final translation of that generated C to determine memory layout, but it can't do that without either being given PIC or relocatable wasm module.

If you don't build the wasm module as PIC or relocatable it would contain unannounced absolute addresses .. which obviously won't work in the general case. For example, static data would start at address 1024 which would just crash when compiled on most native platforms.

Ah - right. For some reason I had already tacitly assumed that we have relocation info, so "relocatable" would be true in that case. In fact, the part of the un-sandboxing implementation that I already have sitting around fundamentally depends on relocation info being present - precisely so that it can identify integer constants that are actually addresses.

Thanks for the clarification!

@sbc100
Copy link
Member

sbc100 commented Dec 19, 2022

Perhaps we can call this --experimental-no-sandbox just to be clear that this is an experimental feature?

@dimitry-
Copy link
Contributor Author

Perhaps we can call this --experimental-no-sandbox just to be clear that this is an experimental feature?

Sure. I will rename it.

@dimitry- dimitry- changed the title wasm2c: Add no-sandbox option wasm2c: Add experimental-no-sandbox option Dec 19, 2022
@matthias-blume
Copy link

On second thought ... would it make sense to have a separate --experimental flag, gating the acceptance of --no-sandbox and (for the time being, while general support is not fully baked) --memory64 on that?

(Sorry for the late suggestion. I don't know if this has been discussed already.)

@sbc100
Copy link
Member

sbc100 commented Dec 20, 2022

On second thought ... would it make sense to have a separate --experimental flag, gating the acceptance of --no-sandbox and (for the time being, while general support is not fully baked) --memory64 on that?

(Sorry for the late suggestion. I don't know if this has been discussed already.)

sgtm

It is hidden behind --experimental flag.
@dimitry- dimitry- changed the title wasm2c: Add experimental-no-sandbox option wasm2c: Add experimental no-sandbox option Dec 22, 2022
@dimitry-
Copy link
Contributor Author

On second thought ... would it make sense to have a separate --experimental flag, gating the acceptance of --no-sandbox and (for the time being, while general support is not fully baked) --memory64 on that?
(Sorry for the late suggestion. I don't know if this has been discussed already.)

sgtm

done.

Copy link
Member

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm to this initial addition of the flag. @keithw WDYT?

@keithw
Copy link
Member

keithw commented Dec 22, 2022

No objections to adding the flag. (Perhaps --experimental should even be a WABT-wide flag, and some of the other in-progress features could be behind it as well...)

@calvin2021y
Copy link

calvin2021y commented Mar 15, 2023

I plan to use wasm2c build sqlite to provide multi thread lock-free sqlite instance. (original sqlite must use lock with multi thread)

with this the VFS IO speed will have a huge speedup. (no need copy from host page into WASM instance)

@dimitry-
Copy link
Contributor Author

this PR is outdated and no longer needed.

@dimitry- dimitry- closed this Mar 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants