# test: fix TestTxPool_ExpiredTxs_Timestamp flake (#1573)
## Description

Closes #1207  

## Rationale
After a lot of trial and error and some research, the rationale behind this
change to the flaky test is:

Because the TTL is only 5ms, the test needs a more precise pruning schedule:
the first batch of transactions must have reliably expired, and the expiry
event must be caught quickly enough, while the second batch of transactions
has not yet expired. Otherwise the test behaves flakily.
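
The heart of the fix (see the diff below) is to replace the second fixed sleep with polling via `require.Eventually`. A minimal, self-contained sketch of that pattern follows; since the mempool internals are only accessible inside the `cat` package, an elapsed-time check stands in for "the first batch has expired":

```go
package example

import (
	"testing"
	"time"

	"github.com/stretchr/testify/require"
)

// TestEventuallyPattern illustrates the polling pattern used in the fix:
// sleep for less than the TTL, then poll the expiry condition until it holds
// or a generous deadline passes, instead of asserting after one fixed sleep.
func TestEventuallyPattern(t *testing.T) {
	const ttl = 5 * time.Millisecond
	start := time.Now()

	// Mirror the test's 2.5ms sleep, which is well under the 5ms TTL.
	time.Sleep(2500 * time.Microsecond)

	// Poll every 50µs for up to 10ms, the same timings used in the diff below.
	require.Eventually(t, func() bool {
		return time.Since(start) >= ttl // stand-in for "first batch expired"
	}, 10*time.Millisecond, 50*time.Microsecond)
}
```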


## Testing method

```sh
go test -run TestTxPool_ExpiredTxs_Timestamp github.com/tendermint/tendermint/mempool/cat -mod=readonly -race -count 100
```

I ran this command 200 times and saw no failures, which is fairly reliable compared to the 15-20% failure rate of the old implementation.


---

#### PR checklist

- [x] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use
  [unclog](https://github.com/informalsystems/unclog) to manage our changelog)
- [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments
tuantran1702 authored Jan 2, 2025
1 parent a0daf1e commit e8d1b46
Showing 1 changed file with 30 additions and 30 deletions.
60 changes: 30 additions & 30 deletions mempool/cat/pool_test.go
```diff
@@ -553,40 +553,40 @@ func TestTxPool_ExpiredTxs_Timestamp(t *testing.T) {
 
 	// Wait a while, then add some more transactions that should not be expired
 	// when the first batch TTLs out.
-	//
-	// ms: 0   1   2   3   4   5   6
-	//     ^           ^       ^   ^
-	//     |           |       |   +-- Update (triggers pruning)
-	//     |           |       +------ first batch expires
-	//     |           +-------------- second batch added
-	//     +-------------------------- first batch added
-	//
-	// The exact intervals are not important except that the delta should be
-	// large relative to the cost of CheckTx (ms vs. ns is fine here).
-	time.Sleep(3 * time.Millisecond)
-	added2 := checkTxs(t, txmp, 10, 1)
-
-	// Wait a while longer, so that the first batch will expire.
-	time.Sleep(3 * time.Millisecond)
-
-	// Trigger an update so that pruning will occur.
-	txmp.Lock()
-	defer txmp.Unlock()
-	require.NoError(t, txmp.Update(txmp.height+1, nil, nil, nil, nil))
-
-	// All the transactions in the original set should have been purged.
-	for _, tx := range added1 {
-		// Check that it was added to the evictedTxCache
-		evicted := txmp.WasRecentlyEvicted(tx.tx.Key())
-		require.True(t, evicted)
-
-		if txmp.store.has(tx.tx.Key()) {
-			t.Errorf("Transaction %X should have been purged for TTL", tx.tx.Key())
-		}
-		if txmp.rejectedTxCache.Has(tx.tx.Key()) {
-			t.Errorf("Transaction %X should have been removed from the cache", tx.tx.Key())
-		}
-	}
+	// Because the TTL is 5ms which is very short, we need to have a more precise
+	// pruning interval to ensure that the transactions are expired
+	// so that the expired event is caught quickly enough
+	// that the second batch of transactions are not expired.
+	time.Sleep(2500 * time.Microsecond)
+	added2 := checkTxs(t, txmp, 10, 1)
+
+	// use require.Eventually to wait for the TTL to expire
+	require.Eventually(t, func() bool {
+		// Trigger an update so that pruning will occur.
+		txmp.Lock()
+		defer txmp.Unlock()
+		require.NoError(t, txmp.Update(txmp.height+1, nil, nil, nil, nil))
+
+		// All the transactions in the original set should have been purged.
+		for _, tx := range added1 {
+			// Check that it was added to the evictedTxCache
+			evicted := txmp.WasRecentlyEvicted(tx.tx.Key())
+			if !evicted {
+				return false
+			}
+
+			if txmp.store.has(tx.tx.Key()) {
+				t.Errorf("Transaction %X should have been purged for TTL", tx.tx.Key())
+				return false
+			}
+			if txmp.rejectedTxCache.Has(tx.tx.Key()) {
+				t.Errorf("Transaction %X should have been removed from the cache", tx.tx.Key())
+				return false
+			}
+		}
+		return true
+	}, 10*time.Millisecond, 50*time.Microsecond)
 
 	// All the transactions added later should still be around.
 	for _, tx := range added2 {
```
