Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

process: add execve #56496

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

ShogunPanda
Copy link
Contributor

@ShogunPanda ShogunPanda commented Jan 7, 2025

This PR adds a new process.execve method (absolutely willing to change the name if you want), which is a wrapper for the execve UNIX function.

The function will never return and will swap the current process with a new one.
All memory and system resources are automatically collected from execve, except for std{in,out,err}.

The primary use of this function is in shell scripts to allow to setup proper logics and then spawn another command.

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/startup

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. process Issues and PRs related to the process subsystem. labels Jan 7, 2025
@ShogunPanda ShogunPanda added the request-ci Add this label to start a Jenkins CI on a PR. label Jan 7, 2025
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jan 7, 2025
@nodejs-github-bot
Copy link
Collaborator

@ljharb
Copy link
Member

ljharb commented Jan 7, 2025

What does it do on non-Unix systems?

@ShogunPanda
Copy link
Contributor Author

@ljharb By non Unix we only mean Windows, isn't it? If that's the case, Windows also supports execve via _execve (doc here) so we should be good to go.

Once the CI ends I'll see which platform needs special assistance.

Copy link

codecov bot commented Jan 7, 2025

Codecov Report

Attention: Patch coverage is 88.05970% with 16 lines in your changes missing coverage. Please review.

Project coverage is 89.12%. Comparing base (afafee2) to head (e56bf2d).
Report is 28 commits behind head on main.

Files with missing lines Patch % Lines
src/node_process_methods.cc 79.48% 14 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #56496      +/-   ##
==========================================
+ Coverage   88.53%   89.12%   +0.58%     
==========================================
  Files         657      662       +5     
  Lines      190761   191709     +948     
  Branches    36616    36862     +246     
==========================================
+ Hits       168899   170861    +1962     
+ Misses      15048    13712    -1336     
- Partials     6814     7136     +322     
Files with missing lines Coverage Δ
lib/internal/bootstrap/node.js 99.57% <100.00%> (+<0.01%) ⬆️
lib/internal/process/per_thread.js 99.39% <100.00%> (+0.06%) ⬆️
src/node_errors.h 88.00% <100.00%> (+3.00%) ⬆️
src/node_process_methods.cc 88.88% <79.48%> (+3.06%) ⬆️

... and 97 files with indirect coverage changes

@targos
Copy link
Member

targos commented Jan 7, 2025

On Windows:

D:\a\node\node\src\node_process_methods.cc(544,26): error C2065: 'F_GETFD': undeclared identifier [D:\a\node\node\libnode.vcxproj]
  (compiling source file '/src/node_process_methods.cc')
  
D:\a\node\node\src\node_process_methods.cc(544,17): error C3861: 'fcntl': identifier not found [D:\a\node\node\libnode.vcxproj]
  (compiling source file '/src/node_process_methods.cc')
  
D:\a\node\node\src\node_process_methods.cc(546,17): error C2065: 'FD_CLOEXEC': undeclared identifier [D:\a\node\node\libnode.vcxproj]
  (compiling source file '/src/node_process_methods.cc')
  
D:\a\node\node\src\node_process_methods.cc(547,16): error C2065: 'F_SETFD': undeclared identifier [D:\a\node\node\libnode.vcxproj]
  (compiling source file '/src/node_process_methods.cc')
  
D:\a\node\node\src\node_process_methods.cc(547,7): error C3861: 'fcntl': identifier not found [D:\a\node\node\libnode.vcxproj]
  (compiling source file '/src/node_process_methods.cc')

Copy link
Member

@targos targos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test/parallel/test-process-replace.js Outdated Show resolved Hide resolved
test/parallel/test-process-replace.js Outdated Show resolved Hide resolved
test/parallel/test-process-replace-fail.js Outdated Show resolved Hide resolved
test/parallel/test-process-replace-fail.js Outdated Show resolved Hide resolved
test/parallel/test-process-replace-socket.js Outdated Show resolved Hide resolved
test/parallel/test-process-replace-socket.js Outdated Show resolved Hide resolved
src/node_process_methods.cc Outdated Show resolved Hide resolved
src/node_process_methods.cc Outdated Show resolved Hide resolved
node::Utf8Value pathname_string(env->isolate(), args[0].As<String>());

argv = new char*[2];
argv[0] = strdup(*pathname_string);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need check not null also in here, maybe(?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot. Added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use std::memcpy or a similar modern C++ function other than this.

Copy link
Member

@mcollina mcollina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ShogunPanda
Copy link
Contributor Author

@ljharb I was totally wrong. Despite of naming and similar signatures, the function _execve only creates a new process without replacing the old one.

I'll disable this function on Windows.

@ShogunPanda ShogunPanda added the request-ci Add this label to start a Jenkins CI on a PR. label Jan 7, 2025
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jan 7, 2025
@nodejs-github-bot
Copy link
Collaborator

Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is to be added, I'd absolutely recommend referring to it by a standard name such as execve(), or otherwise something that makes it clear that it spawns a new process.

This requires integration with the permissions API or is otherwise an immediate security hole.

Overall I'd recommend not adding this though, unless there's a concrete reason to believe that it fills a significant gap that the existing child_process API doesn't cover.

doc/api/errors.md Outdated Show resolved Hide resolved
resources from the current process are preserved, except for the standard input,
standard output and standard error file descriptor.

All other resources are discarded by system when the processes are swapped.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if this isn't the a desirable behavior in a given situation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please expand this? What do you mean?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's any point in adding execve(), it's the customizability that the method brings with it. Leaving file descriptors open and/or redirecting them intentionally can be part of that -- just like you could in a bash script do exec bash 0<&4 1>&4 to create a shell that reads and writes to e.g. a pre-opened socket or something along those lines.

(With a similar reasoning, you could also allow users to actually specify argv[0] with a value they control -- it doesn't need to equal the filename that's being executed)


THROW_ERR_PROCESS_REPLACE_FAILED(env, error_code);
}
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite hard-to-follow C++ with lots of unnecessary explicit memory management that Node.js has been doing a lot of effort to move away from. I'd recommend looking a bit a how other parts of the Node.js code base handle strings and conversion between C++ and JS values.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm absolutely willing to. Since I'm not familiar with the C++ codebase, do you have any suggestion of places I can look to?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ShogunPanda Well, mostly anywhere else works. As a general rule, you'll want to get rid of new, new[], char*, memcpy() and strdup() as much as possible, and replace them with std::vector, std::string/Utf8Value as much as possible.

(You won't be entirely able to avoid something like std::vector<char*> because execve expects char** arguments, but the std::vector<char*>'s entries could point to the entries of a std::vector<std::string> or std::vector<Utf8Value> rather than having to manage memory manually).

@ljharb
Copy link
Member

ljharb commented Jan 7, 2025

Why would we want to add a function that can't work on all tier 1 platforms?

@jasnell
Copy link
Member

jasnell commented Jan 7, 2025

Why would we want to add a function that can't work on all tier 1 platforms?

Well, we already have a number of such apis... process.getegid() for instance. There are actually quite a few already on process.

@ljharb
Copy link
Member

ljharb commented Jan 7, 2025

Gotcha, i wasn't aware of that.

@jasnell
Copy link
Member

jasnell commented Jan 7, 2025

I don't consider it to be ideal. I would have preferred a pattern like process.posix.getegid() similar to what we have with path.posix but these predate my involvement so they are what they are.

@jasnell
Copy link
Member

jasnell commented Jan 7, 2025

+1 on @addaleax's alternative name suggestion. I'm fine with adding this so long as the cleanup logic/expectations are clearly documented.

src/node_errors.h Outdated Show resolved Hide resolved
src/node_errors.h Outdated Show resolved Hide resolved
char** target = nullptr;
int length = js_array->Length();

CHECK_LT(length, INT_MAX);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
CHECK_LT(length, INT_MAX);
CHECK_LT(length, INT_MAX);
CHECK_BT(length, 0);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find CHECK_BT anywhere. Did you mean CHECK_GT or what?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!

src/node_process_methods.cc Outdated Show resolved Hide resolved
src/node_process_methods.cc Outdated Show resolved Hide resolved
Comment on lines +537 to +544
for (unsigned int i = 0; i < argv_array->Length(); i++) {
full_argv_array
->Set(context, i + 1, argv_array->Get(context, i).ToLocalChecked())
.Check();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this make a copy? I didn't quite understand the goal here. Sorry for the questions!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's the idea. execve wants a null-terminated char** while I have a JsArray.
On the Javascript side I have validated they are all strings with no null-bytes in the middle.
Any better idea on how to handle this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My gut feeling would be that it's much easier to handle the conditional and array copying on the JS side of things and instead only pass a version of the array(s) down to C++ that's close to what the syscall expects 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean now.
I would love to make all the manipulation on the JS side, but it is then possible to pass the data from JS to be interpreted on C++ as char**. Out of my mind I'm thinking about using Uint8Array but I might be wrong.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can still pass an array of strings, it's just the array copying/argument handling that would end up simplified (i.e. no need to check if (args.Length() > 1) {, no need to full_argv_array->Set(context, i + 1, argv_array->Get(context, i).ToLocalChecked()))

src/node_process_methods.cc Outdated Show resolved Hide resolved
src/node_process_methods.cc Outdated Show resolved Hide resolved
src/node_process_methods.cc Outdated Show resolved Hide resolved
@ShogunPanda
Copy link
Contributor Author

Why would we want to add a function that can't work on all tier 1 platforms?

Well, we already have a number of such apis... process.getegid() for instance. There are actually quite a few already on process.

Thanks for involuntary hint. :)
I updated the code with #ifdef __POSIX__ and the docs to use the same wording already used there.

@ShogunPanda ShogunPanda changed the title process: add replace process: add execve Jan 8, 2025
@ShogunPanda
Copy link
Contributor Author

@addaleax

If this is to be added, I'd absolutely recommend referring to it by a standard name such as execve(), or otherwise something that makes it clear that it spawns a new process.

As requested, I've renamed it to execve. I didn't like the original name either :)

This requires integration with the permissions API or is otherwise an immediate security hole.

I just added integration with the permission API. It will require --allow-child-process.

Overall I'd recommend not adding this though, unless there's a concrete reason to believe that it fills a significant gap that the existing child_process API doesn't cover.

In the shell scripting context, there is no way to create a new process which would replace the current one. Think about using Node.js to build a complex command to run. After it, Node.js was not used anymore but since there was no way to swap the process, you would have to spawn a new process, manage it's stdin/stdout/stderr and so forth. That's why I added execve.

@ShogunPanda
Copy link
Contributor Author

@jasnell

+1 on @addaleax's alternative name suggestion. I'm fine with adding this so long as the cleanup logic/expectations are clearly documented.

I added two tests which check that:

  1. Test what happens if we leave a socket open and we try to reopen in the swapped process: the operation succeeds, which means the fd is destroyed. I also tried to install a on('close') but it is not invoked.

  2. Test what happens if there is a process.on('exit') on the original process. It does not get invoked, which means that the system call immediately swap the programs without running any cleanup logic.

I've added this to the documentation to reflect that. Does it look good now?

@addaleax
Copy link
Member

addaleax commented Jan 8, 2025

After it, Node.js was not used anymore but since there was no way to swap the process, you would have to spawn a new process, manage it's stdin/stdout/stderr and so forth. That's why I added execve.

I mean, to be clear, this is still a one-line operation (managing stdio literally just comes down to setting stdio: 'inherit') for the most part. I know where you're coming from but I wouldn't add this just for the sake of adding it.

@ShogunPanda
Copy link
Contributor Author

After it, Node.js was not used anymore but since there was no way to swap the process, you would have to spawn a new process, manage it's stdin/stdout/stderr and so forth. That's why I added execve.

I mean, to be clear, this is still a one-line operation (managing stdio literally just comes down to setting stdio: 'inherit') for the most part. I know where you're coming from but I wouldn't add this just for the sake of adding it.

I kinda agree.
You still would have to handle the exit code to ensure proper broadcasting to the shell.
Also, I'm not sure what would happen with "TUI" (ncurses and similar) programs and so forth.

Is your objection a full block or just an "intent"?

@addaleax
Copy link
Member

addaleax commented Jan 8, 2025

@ShogunPanda Just fyi, have you seen https://www.npmjs.com/package/foreground-child? That also works on Windows 🙂

My "request changes" marker only refers to the C++ memory management here, i.e. this comment specifically (and the integration with the permissions API, of course). I don't intend to block this feature, I've given my opinion but it's pretty clear overall that the Node.js project has a different stance from my own when it comes to 'what should go into core' 🙂

Comment on lines +263 to +267
char message[128];
snprintf(message,
sizeof(message),
"process.execve failed with error code %s",
errors::errno_string(code));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not?

Suggested change
char message[128];
snprintf(message,
sizeof(message),
"process.execve failed with error code %s",
errors::errno_string(code));
auto message = std::string("process.execve failed with error code " + errors::errno_string(code);

Comment on lines +541 to +543
full_argv_array
->Set(context, i + 1, argv_array->Get(context, i).ToLocalChecked())
.Check();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't want this to throw, you can surround this with USE() and remove .Check()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Ideally, this should use neither of the two and just do proper error handling

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. this should probably be something like...

  if (full_argv_array->Set(...).IsNothing()) {
    return;
  }

Likewise the Check() and ToLocalChecked() uses immediately above should be changed to properly handle the error cases without resorting to abort.

This API call should only return nothing if an error is scheduled and returning immediately in that case allows v8 to proceed with throwing that error.

if (!isMainThread) {
throw new ERR_WORKER_UNSUPPORTED_OPERATION('Calling process.execve');
} else if (process.platform === 'win32') {
throw new ERR_FEATURE_UNAVAILABLE_ON_PLATFORM('process.execve');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the docs say this is also unavailable on Android. Should that be checked here also?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove Android from the docs. It's not a supported platform.

int error_code = errno;

// Free up memory, then throw.
delete[] argv;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The abundance of manual memory management in this function really makes me nervous.

skip('process.execve is not available in Workers');
}

if (!isWindows) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs also mention Android as not being supported. Should that be covered here also?

if (!isWindows) {
// Non existent path name
{
throws(mustCall(() => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing a must call into throws doesn't make a lot of sense in my opinion. We know that throws is goingt to call that function immediately (we have tests proving this).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. process Issues and PRs related to the process subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants