Peer management enhancements #488

Open
peterbroadhurst opened this issue Dec 20, 2024 · 2 comments

Comments

@peterbroadhurst

What would you like to be added?

  • Better thread management of state distribution backlog and retries
  • Status information to users for "active peers"
  • Clear code responsibility for managing the node-to-node transport definitions

Why is this needed?

See:

// TODO: This needs to be a worker per-peer - probably a whole state distributor per peer that can be swapped in/out.
// Currently it only runs on startup, and pushes all state distributions from before the startup time into the distributor.

@peterbroadhurst added the "enhancement" (New feature or request) label Dec 20, 2024
@peterbroadhurst self-assigned this Dec 20, 2024
@peterbroadhurst

peterbroadhurst commented Dec 20, 2024

Initial architecture plan:

  • Responsibility moves out of statedistribution (which gets removed) and into transportmgr
  • There is an in-memory status + goroutine for every "active peer"
  • This goroutine is responsible for completing state distributions
  • This goroutine also processes requests to send things to that peer, which arrive on a dedicated Go channel
  • transportmgr takes ownership of the payload spec for messages
  • Only spec-defined messages can be sent/received
  • A short in-memory retry is done by the transport manager to handle very intermittent errors, which minimizes complexity in transport plugins
  • Messages are still one-way fire-and-forget, but these short retries cover very short reconnect-style error scenarios
  • Status (bytes sent, lifetime while active, etc.), including failures, is available over JSON-RPC on transport_peers / transport_peerInfo
  • The transport plugin interface is updated with an explicit activate/deactivate lifecycle on peers, rather than just send

@peterbroadhurst

Implementation is in #491

peterbroadhurst added a commit that referenced this issue Jan 8, 2025