Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timeboost: swap sequencers seamlessly #2883

Open
wants to merge 11 commits into
base: express-lane-timeboost
Choose a base branch
from

Conversation

ganeshvanahalli
Copy link
Contributor

@ganeshvanahalli ganeshvanahalli commented Jan 16, 2025

This PR makes it that during a sequencer swap for chosen/active express lane service is not affected. Only the expressLaneService of chosen sequencer is used to process expressLane txs, all the other sequencers are configured to forward the txs they receive to the main one (like how its currently handled with normal txs).

The chosen sequencer also writes accepted expressLane txs to redis server configured via

--execution.sequencer.timeboost.redis-url=""

so that the next chosen sequencer will be able to recover pending txs and up-to-date sequence count, if the previous sequencer crashed/lost priority.

Resolves NIT-3052

[Pending system_test]

@ganeshvanahalli ganeshvanahalli marked this pull request as ready for review January 16, 2025 15:50
@@ -754,6 +757,9 @@ func (s *Sequencer) Activate() {
close(s.pauseChan)
s.pauseChan = nil
}
if s.expressLaneService != nil {
s.expressLaneService.syncFromRedis() // We want sync to complete (which is best effort) before activating the sequencer
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great thiniking. However - I don't want to hold activeMutex while doing it.
Either do the entire syncFromRedis in a separate thread, or release the lock before you call it. Not currently sure which is better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I've changed the impl to instead launch a thread and also updated expressLaneService's updates to redis to be async

roundDuration time.Duration
Client redis.UniversalClient

roundSeqMapMutex sync.Mutex
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have a generic-based containers.SyncMap

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SyncMap is ever-growing so we would need to take care of cleanup, also its optimized for use cases when theres more number of reads compared to writes which is not ideal for our case here

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your'e right, I was confused about something..

es.roundInfo.Add(msg.Round, roundInfo)
unlockByDefer = false
es.roundInfoMutex.Unlock() // Release lock so that other timeboost txs can be processed

es.LaunchThread(func(threadCtx context.Context) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be done before the for loop. If we are adding this message we should try sending it to redis

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also (not exactly here but related) - we need to add another attempt to getForwarder, after the call to validateExpressLaneTx which might take a long time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, will make these changes


type RedisCoordinator struct {
roundDuration time.Duration
Client redis.UniversalClient
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should probably be a private field

}

func NewRedisCoordinator(redisUrl string, roundDuration time.Duration) (*RedisCoordinator, error) {
redisClient, err := redisutil.RedisClientFromURL(redisUrl)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the coordinator currently supports having no redis url (client == nil)
We should:

  • allow redis url to be empty and thus have no redis coordinator
  • use a default value which is not empty, something like "unset", and make sure that it causes an error so if we enable timeboost but no redis by default it will error out

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

const EXPRESS_LANE_ROUND_SEQUENCE_KEY_PREFIX string = "expressLane.roundSequence." // Only written by sequencer holding CHOSEN (seqCoordinator) key
const EXPRESS_LANE_ACCEPTED_TX_KEY_PREFIX string = "expressLane.acceptedTx." // Only written by sequencer holding CHOSEN (seqCoordinator) key

type RedisCoordinator struct {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a must, but:
I think rediscoordinator can have it's own stopwaiter, and it's public API should be called directly, without ontext, and the redis_coordinator itsleft can spawn processes, print errors, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this.

return
}

es.roundInfoMutex.Lock()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you do not want to hold this lock for the entire time.
Instead, you should treat this as if you're being forwarded all these messages.
What you should do is:

  • read all messages without taking any lock and hold them in a local variable
  • grab the round-control lock,
  • pour all messages you collected into msgAndResultBySequenceNumber
  • advance roundinfo.sequence if and only if it's lower than what you read from redis
  • loop like you do when you receive a new transaction, send transactions in order starting from sequence
  • release lock

roundDuration time.Duration
Client redis.UniversalClient

roundSeqMapMutex sync.Mutex
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your'e right, I was confused about something..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants