Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: capping mechanism for relay and service connections #3184

Open
wants to merge 18 commits into
base: master
Choose a base branch
from

Conversation

darshankabariya
Copy link
Contributor

This PR prevents nodes from becoming isolated from the network by implementing a capping mechanism for service and relay peers. This ensures there is always room available for relay peers at any given time.

close #3163

Copy link

github-actions bot commented Nov 28, 2024

You can find the image built from this PR at

quay.io/wakuorg/nwaku-pr:3184

Built from 62d5a1d

@darshankabariya darshankabariya changed the title chore: Capping mechanism for relay and service Connections chore: capping mechanism for relay and service connections Nov 28, 2024
@darshankabariya darshankabariya marked this pull request as draft December 3, 2024 10:45
@darshankabariya darshankabariya marked this pull request as ready for review December 9, 2024 15:00
Copy link
Contributor

@SionoiS SionoiS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

waku/node/peer_manager/peer_manager.nim Outdated Show resolved Hide resolved
waku/node/peer_manager/peer_manager.nim Show resolved Hide resolved
Copy link
Contributor

@gabrielmer gabrielmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks so much!

Added just a couple questions/comments

@@ -993,6 +999,24 @@ proc new*(
# Leave by default 20% of connections for service peers
maxRelayPeersValue = maxConnections - (maxConnections div 5)

var maxServicePeersValue = 0
var d_low = 4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shouldn't hardcode this and take it directly from WakuRelay

# relay has higher priority then service peers
else:
maxServicePeersValue =
max(maxConnections - maxRelayPeersValue, maxConnections div 5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the logic of this line? Why the maxConnections div 5?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My logic is to reserve 20% for service peers if the user does not provide maxService, as we discuss in nwaku pm we need to change whole logic

Copy link
Collaborator

@Ivansete-status Ivansete-status left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR doesn't compile snif snif :s
Re-thinking about this, I suggest doing the following:
We need to keep maxConnections config param because this needs to be passed to the nim-libp2p-Switch object.
On the other hand, I would only add an additional parameter, named protoConnWeights , a.k.a. proto-conn-weights, with the following logic:

If proto-conn-weights is not passed, then we apply the internal limits of 70% for relay, and the other 30% is evenly distributed for other protocols.

If proto-conn-weights=relay:80;services:20, we apply the internal limits of 80% for relay, and the other 20% is evenly distributed for other protocols.

We need to make sure the relay limit is bigger than DHigh ( we can create a
public Constant in waku_relay/protocol.nim )

wdyt @NagyZoltanPeter , @gabrielmer , @darshankabariya ?

@@ -993,6 +995,20 @@ proc new*(
# Leave by default 20% of connections for service peers
maxRelayPeersValue = maxConnections - (maxConnections div 5)

var maxServicePeersValue = 0
var d_low = 4
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to use the d_low value extracted from waku_relay. Perhaps defining a const in waku_relay and use that value

@darshankabariya darshankabariya removed the request for review from NagyZoltanPeter December 12, 2024 21:37
@darshankabariya darshankabariya marked this pull request as draft December 12, 2024 21:37
@darshankabariya
Copy link
Contributor Author

The PR doesn't compile snif snif

My bad, I pushed the code without a local build. I’ll update this PR until we finalize it.

@darshankabariya
Copy link
Contributor Author

darshankabariya commented Dec 12, 2024

The PR doesn't compile snif snif :s Re-thinking about this, I suggest doing the following: We need to keep maxConnections config param because this needs to be passed to the nim-libp2p-Switch object. On the other hand, I would only add an additional parameter, named protoConnWeights , a.k.a. proto-conn-weights, with the following logic:

If proto-conn-weights is not passed, then we apply the internal limits of 70% for relay, and the other 30% is evenly distributed for other protocols.

If proto-conn-weights=relay:80;services:20, we apply the internal limits of 80% for relay, and the other 20% is evenly distributed for other protocols.

We need to make sure the relay limit is bigger than DHigh ( we can create a public Constant in waku_relay/protocol.nim )

wdyt @NagyZoltanPeter , @gabrielmer , @darshankabariya ?

At the moment, we have parameters like maxConnections, maxRelayPeers, maxServicePeers, and another one called protoConnWeight. This can make things more confusing, as users need to ensure that maxConnections is never greater than the sum of maxRelayPeers and maxServicePeers. Additionally, aligning proto-conn-weights with maxRelayPeers and maxServicePeers adds another layer of complexity.

Instead, I believe users generally don’t think in terms of absolute numbers for services and relays. From a psychological perspective, they might find it more intuitive to think in percentages—e.g., 80% for services, 20% for relays.

My suggestion is to simplify this structure. We should keep maxConnections bcz ( represent hardware capacity ? right ) , which users are familiar with, and deprecate maxRelayPeers and maxServicePeers. Instead, we can ask users to specify the percentage they want to allocate for relays and services. This feels more straightforward and user-friendly. maxConnection, relayWeight and serviceWeight will be deadly combination. the user to easily convey their priorities.

If users don’t provide this percentage, we could default to a ratio, as Ivan suggested. I believe this would simplify the experience while maintaining flexibility.

@gabrielmer @NagyZoltanPeter @Ivansete-status @SionoiS

@SionoiS
Copy link
Contributor

SionoiS commented Dec 13, 2024

I like your idea! IMO maxConnection and relayToServiceRatio (float) would be simpler.

I feel we still need hardcoded minimum still so that you can't, by mistake brick your node.

@gabrielmer
Copy link
Contributor

Same opinion here, having 2 params, a maximum and a ratio to know how many to allocate for each should make things simple yet customizable :)

@darshankabariya
Copy link
Contributor Author

darshankabariya commented Dec 14, 2024

I like your idea! IMO maxConnection and relayToServiceRatio (float) would be simpler.

I feel we still need hardcoded minimum still so that you can't, by mistake brick your node.

Same opinion here, having 2 params, a maximum and a ratio to know how many to allocate for each should make things simple yet customizable :)

Thanks @gabrielmer @SionoiS for reply ! i also like suggestion.

If this approach looks good to you, I can begin the implementation. Just to confirm, we're only discussing the deprecation of the external configuration parameter, correct? I believe we still need to maxRelayPeer and maxServicePeer within the peerManager.

Regarding your thoughts on the ratio, I have an idea: instead of using a float, we could represent it as a string, like 0.33 becoming 40:60 or 0.43 as 30:70. it's just need calculator 😂for some ratio, Let me know what you guys think, and I’ll implement accordingly.

@darshankabariya darshankabariya marked this pull request as ready for review December 16, 2024 08:15
@gabrielmer
Copy link
Contributor

If this approach looks good to you, I can begin the implementation. Just to confirm, we're only discussing the deprecation of the external configuration parameter, correct? I believe we still need to maxRelayPeer and maxServicePeer within the peerManager.

Regarding your thoughts on the ratio, I have an idea: instead of using a float, we could represent it as a string, like 0.33 becoming 40:60 or 0.43 as 30:70. it's just need calculator 😂for some ratio, Let me know what you guys think, and I’ll implement accordingly.

Yes! That would only be for the external configuration, we would then translate it into maxRelayPeers and maxServicePeers when configuring the Peer Manager

Regarding the format, I'm pretty neutral about it - no strong opinions. Would like to hear what others think :))

@SionoiS
Copy link
Contributor

SionoiS commented Dec 16, 2024

Regarding your thoughts on the ratio, I have an idea: instead of using a float, we could represent it as a string, like 0.33 becoming 40:60 or 0.43 as 30:70. it's just need calculator 😂for some ratio, Let me know what you guys think, and I’ll implement accordingly.

If that's simpler go for it! No strong opinion here either.

Copy link
Contributor

@NagyZoltanPeter NagyZoltanPeter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great effort! I just added some tiny last notice on it if you would like to consider.

@@ -49,3 +49,28 @@ proc parseCorrectMsgSize*(input: string): uint64 =
let ret = parseMsgSize(input).valueOr:
return 0
return ret

proc parseRelayServiceRatio*(ratio: string): Result[(float, float), string] =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
proc parseRelayServiceRatio*(ratio: string): Result[(float, float), string] =
proc parseRelayServiceRatio*(ratio: string): Result[(float, float), string] {.raises: [ValueError].}=

Let's not hide the exception, so compiler can help using the function properly.


maxRelayPeers* {.
desc:
"Deprecated. Use relay-service-ratio instead. It represents the maximum allowed number of relay peers.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I buy in to this idea. All I wanted to emphasize that do nothide the unintended/old usage of maxRelayPeers behind a log warning and a --help or docs.
Maybe it is better to stop node early and notice user about bad config, IDK.

waku/factory/node_factory.nim Outdated Show resolved Hide resolved
@darshankabariya darshankabariya force-pushed the capping_mechanism branch 2 times, most recently from 4366e81 to 6d637a9 Compare January 20, 2025 04:59
@darshankabariya darshankabariya force-pushed the capping_mechanism branch 2 times, most recently from 509fce5 to eef0d11 Compare January 20, 2025 08:18
Copy link
Collaborator

@Ivansete-status Ivansete-status left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for it and the patience! 🙌
I think we still need to keep backward compatibility with the current max-relay-peers param.
On the other hand, my dreamed nwaku only deals with Result types, doesn't use discard and doesn't raise Exception either ;P

try:
let elements = ratio.split(":")
if elements.len != 2:
raise newException(ValueError, "expected format 'X:Y', ratio = " & ratio)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry that I didn't mention before. Would it be possible to avoid raising an exception plz? Applies elsewhere. We already have the Resullt returning type to tackle the err conditions, unless I'm missing something :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it.

@@ -101,8 +101,15 @@ proc initNode(
agentString = some(conf.agentString),
)
builder.withColocationLimit(conf.colocationLimit)

# Backword compatibility for maxRelayPeers
if conf.maxRelayPeers.isSome():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this condition is true, then we need to infer the relayServiceRation based on the given conf.maxConnections and conf.maxRelayPeers. And we need to keep this temporarily within two consecutive versions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review and patience. It’s now backward compatible and parse function without raising exceptions.

Copy link
Collaborator

@Ivansete-status Ivansete-status left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for it! 💯 🥳

) =
builder.maxRelayPeers = maxRelayPeers
builder.shardAware = shardAware
try:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not needed to handle exception now?

var relayRatio: float64
var serviceRatio: float64
try:
(relayRatio, serviceRatio) = parseRelayServiceRatio(relayServiceRatio).get()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome thanks! I think now we don't need to handle the exception

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

chore: Reserve % of max connections for relay
5 participants