Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace threads with internal implementation #16

Open
CMCDragonkai opened this issue Oct 17, 2023 · 8 comments
Open

Replace threads with internal implementation #16

CMCDragonkai opened this issue Oct 17, 2023 · 8 comments

Comments

@CMCDragonkai
Copy link
Member

Specification

The threads package is quite complex and its development is stalled. It's preventing us from doing the ESM migration #12, and it's also affecting bundling and packaging in Polykey-CLI.

It's best to replace it with our own implementation.

There are several situations we need to consider:

  1. In the context of nodejs, we have to use worker_threads - this is where we should start, as that's what the agent runs on.
  2. In the context of browsers, we have to use WebWorkers - ensuring API consistency between the 2 is something that the libraries have been doing, but I think we can do this better.
  3. In the context of mobile, this depends on what the underlying system is, probably react-native so whatever threading mechanism it exposes

In order to do this, we should start by analyzing how does one define a "worker". For some reason it always ends up being a separate file to the main JS script. So when bundling something together that results in having to have 2 files at the very least. The main script and the worker script.

Is it possible to have the worker code "embedded" into the main script so one does not have to worry about the bundling problem? I think it should be doable. If we can get some way of specifying worker code that can be embedded, then when bundling occurs we don't have to worry about it. This solves deployment complexity when it comes to specifying where the worker script is, and also removes the need to have the threads or workers package be an external package, it can just be fully bundled. That's really the goal.

One nice thing would be if the worker script itself can be written in typescript. However I understand that this may overcomplicate things because it would mean a typescript compilation stage prior to bundling, we definitely do not want to embed tsc or ts-node or similar at production, so most likely it has to be written in JS-only. However IDE-wise it is possible to use jsdoc types which allows the type-checker to still check if it typed correctly, I saw this in Docusaurus.

Additional context

Tasks

  1. ...
  2. ...
  3. ...
@CMCDragonkai CMCDragonkai added the development Standard development label Oct 17, 2023
@CMCDragonkai
Copy link
Member Author

If embedding works perfectly, it would mean pkg does not require a scripts option and PK CLI will not need to re-export the polykeyWorker.js.

So I believe we should aim to make it so our workers are perfectly embedded. That's the only way to reduce complexity downstream.

@CMCDragonkai
Copy link
Member Author

To support all the different platforms, we should have a clear strategy pattern. With separate directories for different threading backends.

I think this will be focused on the theadpool used from JS (JS origin caller), and not for the native code (like in quic's native threading problem MatrixAI/js-quic#54). That will be implementation specific.

@CMCDragonkai
Copy link
Member Author

Discovered that the reason why it loads locally to the main script is due to ./polykeyWorker path used here:

  return await WorkerManager.createWorkerManager<PolykeyWorkerModule>({
    workerFactory: () => spawn(new Worker('./polykeyWorker')),
    cores,
    logger,
  });

This ends up being relative to the main script after bundling.

To make it bit more deterministic, one can use require.main.filename to be path always to the main script.

However it won't work in ESM.

The best is always get the path of the main script passed down into the subcomponents, then we can centralise how to get such a path.

This would be an intermediate solution until we can fully embed the worker.

@CMCDragonkai
Copy link
Member Author

So basically workers should support base64 data url strings. That should allow us to fully embed a worker inside without a separate file that creates complexity during bundling.

@CMCDragonkai
Copy link
Member Author

This is actually going to block full ESM migration. @amydevs @tegefaulkes

But I also think alot of usage of js-workers has actually been removed...

Without reimplementing the worker pool system, we cannot complete ESM for Polykey repo.

Another way is to remove the workers, but then we don't have any multithreading at all.

@tegefaulkes
Copy link
Contributor

Does ESM require all dependencies be ESM as well? I thought ESM could import CJS modules. It's only CJS that can't import ESM right? Am I mistaken with this?

@CMCDragonkai
Copy link
Member Author

ESM can load CJS, but CJS cannot load ESM...

@CMCDragonkai
Copy link
Member Author

We are almost all completely ESM now. PKE is ESM, our docusaurus and CF websites are ESM (almost), PK however is blocked by this library, which blocks js-db migration to ESM. We need some resources assigned to this too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

2 participants