Add cross-version test framework (and a simple test) #1371

zuiderkwast · 2024-11-28T10:38:11Z

This includes a way to run two versions of the server from the TCL test framework. It's a preparation to add more cross-version tests. The runtest script accepts a new parameter

./runtest --other-server-path path/to/valkey-server

and a new tag "needs:other-server" for test cases and start_server. Tests with this tag are automatically skipped if --other-server-path is not provided.

This PR adds it in a CI job with Valkey 7.2.7 by downloading a binary release.

Fixes #76

codecov · 2024-11-28T10:53:13Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.95%. Comparing base (7fc958d) to head (a0d0a39).
Report is 1 commits behind head on unstable.

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #1371      +/-   ##
============================================
+ Coverage     70.80%   70.95%   +0.15%     
============================================
  Files           121      121              
  Lines         65132    65132              
============================================
+ Hits          46118    46216      +98     
+ Misses        19014    18916      -98

see 8 files with indirect coverage changes

This includes a way to run two versions of the server from the TCL test framework. The runtest script accepts a new parameter --old-server-path path/to/valkey-server and a new tag "needs:old-server" for test cases and start_server. Includes an attempt to run it in CI as well. Signed-off-by: Viktor Söderqvist <[email protected]>

and linebreak long lines in the ./runtest --help output Signed-off-by: Viktor Söderqvist <[email protected]>

tests/test_helper.tcl

hpatro · 2024-12-03T18:22:35Z

.github/workflows/ci.yml

+          cd tests/tmp
+          wget https://download.valkey.io/releases/valkey-7.2.7-focal-x86_64.tar.gz
+          tar -xvf valkey-7.2.7-focal-x86_64.tar.gz


I would prefer this to be in a script which can be invoked by multiple tests and provide engine version as parameter and skip downloading if it already exists.

I was thinking keep it minimal and that we can add some abstraction for it when we need it.

What kind of script do you have in mind? A python script under utils/, a yaml file under .github/ or a TCL script under tests/? If we put it under utils/, does that mean we officially support the script?

If we add a script with these few lines, is the version parameter a string like "7.2.7" and we assume the rest of the URLs are always identical? What if we're testing something from another URL, or with a different executable name (an old redis-server, kvdb-server, etc.) Then the parameters are bigger than the script itself.

So far, we don't have any scripts that downloads anything from hard-coded URLs, apart from in the CI jobs, so it think this might be the best place for it. Can we add some github action definition that can be used from multiple github actions?

Sticking to Github actions options seems problematic i.e. how do other vendors/users test backward compatibility locally?

My thought is to have a python script with well defined old server name+version (enum) and download path (doesn't need to be identical). And yes, we would need to support it officially.

how do other vendors/users test backward compatibility locally?

Like this:

./runtest --old-server-path ../my-other-valkey-clone/src/valkey-server

You may want to build it from source, either a source release or a git checkout, even to test a development branch against unstable.

Using a binary release from our website was just a simple way to get the CI running. Not sure it's the best way or a method I want to officially support.

My thought is to have a python script with well defined old server name+version (enum) and download path (doesn't need to be identical). And yes, we would need to support it officially.

It seems like a commitment to support a bunch of URLs to binary releases. I don't really see the benefit. Is it to have predictable tests? Each compatiblity test uses a fixed version, so it knows what to expect?

Keep in mind that it's also a different binary for a different architecture and per OS. To run it on ARM, you can't use the same binary. There are even differences between ubuntu versions, like OpenSSL version and such things.

For daily are we going to run it against all supported versions? That can also be a followup.

Yes, sounds good, and we should add some more meaningful test cases too, like the one we wanted for the light-weight cluster message and the cluster ping-extension issues we've had in the past. I just wanted to create the scaffolding and a simple test case first.

Let's do this incrementally maybe? I think it's better to have this code in and running and we can adjust it as needed, for now just a wget seems OK.

tests/support/server.tcl

madolson

Simple and efficient.

.github/workflows/ci.yml

tests/test_helper.tcl

madolson · 2024-12-10T00:36:13Z

.github/workflows/ci.yml

+          cd tests/tmp
+          wget https://download.valkey.io/releases/valkey-7.2.7-focal-x86_64.tar.gz
+          tar -xvf valkey-7.2.7-focal-x86_64.tar.gz


For daily are we going to run it against all supported versions? That can also be a followup.

Signed-off-by: Viktor Söderqvist <[email protected]>

zuiderkwast · 2024-12-18T19:21:23Z

For daily are we going to run it against all supported versions? That can also be a followup.

Yes, a follow up. More importantly, we should add some cross-version tests for the cluster bus with ping extensions, lightweigt pubsub and such things, especially when we're developing new such features to make sure we don't break the rolling upgrade scenarios.

I'd like to see rolling upgrade tests of a whole cluster too.

zuiderkwast · 2024-12-19T18:02:04Z

The wait_for_sync time out in the CI job, so apparently it's flaky. Why isn't 5 seconds enough? This function is used in many other test cases.

$ cd tests ; git grep -A5 'proc wait_for_sync'
support/util.tcl:proc wait_for_sync r {
support/util.tcl-    wait_for_condition 50 100 {
support/util.tcl-        [status $r master_link_status] eq "up"
support/util.tcl-    } else {
support/util.tcl-        fail "replica didn't sync in time"
support/util.tcl-    }
support/util.tcl-}

tests/README.md

madolson · 2024-12-20T06:52:46Z

The wait_for_sync time out in the CI job, so apparently it's flaky. Why isn't 5 seconds enough? This function is used in many other test cases.

I've never seen this fail anywhere else before. I wonder if there is something about how it's being used in this instance that might take extra time.

Signed-off-by: Viktor Söderqvist <[email protected]>

zuiderkwast · 2025-01-22T18:50:27Z

Thx @madolson. I'm just worried about introducing a flaky test here. I'll merge it if everything passes after updating the PR.

hpatro · 2025-01-22T20:40:44Z

@zuiderkwast let us know if you need any help with wrapping up this PR. @sarthakaggarwal97 is happy to help.

zuiderkwast · 2025-01-23T07:39:05Z

@zuiderkwast let us know if you need any help with wrapping up this PR. @sarthakaggarwal97 is happy to help.

I answered this in my previous comment. Please help ensure we don't have a flaky test. And please clean up the flaky tests in unstable. Let's have at least two weeks without Daily failing.

Signed-off-by: Viktor Söderqvist <[email protected]>

zuiderkwast · 2025-01-23T08:06:41Z

After merging unstable, the test-ubuntu-latest runs on ubuntu-24 "noble". Our "focal" binaries require libssl 1.1.1 so they won't start. I updated the binary name to "noble" assuming we have a binary like that, but to my surprise we have only

focal (2020)
bionic (2018)

And they're built with libssl 1.1.1 so they refuse to start on any recent OS....

@hpatro Any suggestions what to do now?

Signed-off-by: Viktor Söderqvist <[email protected]>

zuiderkwast · 2025-01-23T10:26:14Z

It seems the binary for 7.2.8 for "noble" is just missing. Using 7.2.7 for noble works. However, it might break automatically next time ubuntu-latest is bumbed, in two years or so.

I'm going to merge this now.

Follow-up: If we use binaries built for ubuntu-24, we should probably lock the CI job to run on ubuntu-24 rather than ubuntu-latest. Another option is to run a container instead of a binary download. Help wanted with this follow-up.

hpatro · 2025-01-23T22:12:51Z

It seems the binary for 7.2.8 for "noble" is just missing. Using 7.2.7 for noble works. However, it might break automatically next time ubuntu-latest is bumbed, in two years or so.

It's odd that we have noble for 7.2.7, unsure if we built it manually and uploaded it to the S3 bucket. I don't see we have it listed on the website as well https://valkey.io/download/releases/v7-2-7/ .

zuiderkwast force-pushed the upgrade-test branch from 8403010 to 77c4c0d Compare November 28, 2024 11:57

Add documentation for missing tags and arguments

9e90759

and linebreak long lines in the ./runtest --help output Signed-off-by: Viktor Söderqvist <[email protected]>

zuiderkwast requested review from madolson and hpatro November 28, 2024 16:02

xbasel reviewed Dec 3, 2024

View reviewed changes

tests/test_helper.tcl Outdated Show resolved Hide resolved

hpatro reviewed Dec 3, 2024

View reviewed changes

madolson reviewed Dec 10, 2024

View reviewed changes

zuiderkwast added 2 commits December 18, 2024 19:53

Change "old server" to "other server"

dc0e715

Signed-off-by: Viktor Söderqvist <[email protected]>

Merge remote-tracking branch 'valkey/unstable' into upgrade-test

6d3b7d6

zuiderkwast changed the title ~~Add cross-version replication test~~ Add cross-version test framework (and a simple replication test) Dec 18, 2024

zuiderkwast changed the title ~~Add cross-version test framework (and a simple replication test)~~ Add cross-version test framework (and a simple test) Dec 18, 2024

madolson reviewed Dec 20, 2024

View reviewed changes

tests/README.md Outdated Show resolved Hide resolved

madolson reviewed Dec 20, 2024

View reviewed changes

tests/README.md Outdated Show resolved Hide resolved

Fix tests/README.md formatting/alignment

05da3c9

Signed-off-by: Viktor Söderqvist <[email protected]>

zuiderkwast mentioned this pull request Jan 17, 2025

Enable module API SendClusterMessage to use light message header type #1572

Open

madolson approved these changes Jan 22, 2025

View reviewed changes

Merge remote-tracking branch 'valkey/unstable' into upgrade-test

1f5b83e

Change other binary to valkey-7.2.8-noble-x86_64

64212b0

Signed-off-by: Viktor Söderqvist <[email protected]>

Use 7.2.7 instead of 7.2.8

a0d0a39

Signed-off-by: Viktor Söderqvist <[email protected]>

zuiderkwast merged commit 99ed308 into valkey-io:unstable Jan 23, 2025
49 checks passed

zuiderkwast deleted the upgrade-test branch January 23, 2025 10:27

zuiderkwast mentioned this pull request Jan 23, 2025

Trigger manual failover on SIGTERM / shutdown to cluster primary #1091

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add cross-version test framework (and a simple test) #1371

Add cross-version test framework (and a simple test) #1371

zuiderkwast commented Nov 28, 2024 •

edited

Loading

codecov bot commented Nov 28, 2024 •

edited

Loading

hpatro Dec 3, 2024

zuiderkwast Dec 3, 2024

hpatro Dec 3, 2024

zuiderkwast Dec 3, 2024

madolson Dec 10, 2024

zuiderkwast Dec 10, 2024

madolson Jan 22, 2025

madolson left a comment

madolson Dec 10, 2024

zuiderkwast commented Dec 18, 2024

zuiderkwast commented Dec 19, 2024

madolson commented Dec 20, 2024

zuiderkwast commented Jan 22, 2025

hpatro commented Jan 22, 2025

zuiderkwast commented Jan 23, 2025

zuiderkwast commented Jan 23, 2025 •

edited

Loading

zuiderkwast commented Jan 23, 2025

hpatro commented Jan 23, 2025

Add cross-version test framework (and a simple test) #1371

Add cross-version test framework (and a simple test) #1371

Conversation

zuiderkwast commented Nov 28, 2024 • edited Loading

codecov bot commented Nov 28, 2024 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

madolson left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

zuiderkwast commented Dec 18, 2024

zuiderkwast commented Dec 19, 2024

madolson commented Dec 20, 2024

zuiderkwast commented Jan 22, 2025

hpatro commented Jan 22, 2025

zuiderkwast commented Jan 23, 2025

zuiderkwast commented Jan 23, 2025 • edited Loading

zuiderkwast commented Jan 23, 2025

hpatro commented Jan 23, 2025

zuiderkwast commented Nov 28, 2024 •

edited

Loading

codecov bot commented Nov 28, 2024 •

edited

Loading

zuiderkwast commented Jan 23, 2025 •

edited

Loading