Automated regression tests #187
Comments
Basically something similar to what ASV does (though ASV tracks timings rather than results)? But in general I am not sure this is a viable approach:
Should this be moved to https://github.com/scipp/ess_template, as it is ESS-specific?
If we make minor changes in a workflow, I think we should know about it. See for example scipp/scippneutron#514. But yes, if we often make changes that modify the results, we do need a mechanism to easily ignore failed tests (or say "I accept this as the new reference solution").
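As a rough sketch of what such an "accept as new reference" switch could look like with pytest (the flag name and fixture below are hypothetical, not an existing feature of any plugin):

```python
# conftest.py -- sketch of a hypothetical --accept-reference flag (not an existing feature)
import pytest


def pytest_addoption(parser):
    parser.addoption(
        '--accept-reference',
        action='store_true',
        default=False,
        help='Store new results as the accepted reference instead of failing the comparison.',
    )


@pytest.fixture
def accept_reference(request) -> bool:
    # Comparison helpers can check this flag: if True, overwrite the stored reference
    # with the new result; if False, compare against the existing reference as usual.
    return request.config.getoption('--accept-reference')
```

Running `pytest --accept-reference` would then update all stored baselines in one go, similar in spirit to the regeneration flags some pytest plugins provide.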
Sounds similar to something like https://github.com/matplotlib/pytest-mpl, which adds baseline tests to compare matplotlib plots?
But this one seems to save plots as a baseline and compare the new results with those existing files.
Correct.
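For context, the pytest-mpl baseline pattern looks roughly like this (the decorator and command-line flags are pytest-mpl's; the test itself is just an illustration):

```python
import matplotlib.pyplot as plt
import pytest


@pytest.mark.mpl_image_compare  # compares the returned figure against a stored baseline image
def test_plot():
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3], [4, 5, 6])
    return fig
```

Baselines are generated once with `pytest --mpl-generate-path=baseline` and compared on later runs with `pytest --mpl`.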
We need a way to test whether our workflows still produce the accepted 'correct' results after we make some changes. E.g. in scipp/esssans#135 and scipp/esssans#143. However, there are changes that should change the result, such as adding a new correction or tuning a parameter. In Mantid, those accepted results are written to file and loaded to compare them to results from a new version of the code. This needs extra infrastructure to store and provide the files and extra work to update them. Here is a potential alternative.
Have a test script that does this procedure on each PR:
1. Check out `main` and run the tests, writing the results to `results_main`.
2. Check out the PR branch and run the same tests, writing the results to `results_branch`.
3. For each file in `results_main` and `results_branch`, load the file and compare with `sc.testing.assert_identical` and `sc.testing.assert_allclose`.

The tests run this way can contain assertions to, e.g., make sure that the result has the expected shape. But the main purpose of these tests is writing data. That data can be any scipp object, e.g., the result of running a workflow. See the sketch below for how such a test and the comparison step could look.
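A minimal sketch of one such test plus the comparison step. `run_workflow`, the file name, and the `RESULTS_DIR` environment variable are made-up placeholders, and the HDF5 functions are assumed to be scipp's (`sc.io.save_hdf5` / `sc.io.load_hdf5`); exact names may differ:

```python
# Sketch only, not an actual implementation.
import os
import pathlib

import scipp as sc


def test_sans_workflow_result():
    # The main job of this test is to write the workflow result to RESULTS_DIR.
    result = run_workflow()  # hypothetical call into the real reduction workflow
    assert result.dims == ('Q',)  # optional extra assertion, e.g. on the expected shape
    out = pathlib.Path(os.environ['RESULTS_DIR']) / 'sans_workflow.h5'
    sc.io.save_hdf5(result, str(out))  # assumes scipp's HDF5 IO


def compare_results(results_main: str, results_branch: str) -> None:
    """Compare every stored result from main against the corresponding branch result."""
    for ref_file in sorted(pathlib.Path(results_main).glob('*.h5')):
        reference = sc.io.load_hdf5(str(ref_file))
        candidate = sc.io.load_hdf5(str(pathlib.Path(results_branch) / ref_file.name))
        # Use sc.testing.assert_allclose instead where small numerical changes are acceptable.
        sc.testing.assert_identical(candidate, reference)
```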
This procedure would perform regression tests against `main`, which we assume has the accepted 'correct' code. But it does not require storing result files in a public location.

What do you think? Does this make sense?
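For concreteness, a rough sketch of the per-PR driver implementing the procedure above. The script name, test directory, and `RESULTS_DIR` variable are assumptions, and `compare_results` is the comparison helper sketched earlier:

```python
# run_regression.py -- hypothetical per-PR driver (names and layout are assumptions)
import os
import subprocess

from regression_compare import compare_results  # hypothetical module with the helper sketched above


def run_tests(ref: str, results_dir: str) -> None:
    """Check out `ref` and run the regression tests, writing results to `results_dir`."""
    os.makedirs(results_dir, exist_ok=True)
    subprocess.run(['git', 'checkout', ref], check=True)
    subprocess.run(
        ['python', '-m', 'pytest', 'tests/regression'],
        env={**os.environ, 'RESULTS_DIR': results_dir},
        check=True,
    )


if __name__ == '__main__':
    # Remember the PR commit so we can return to it after testing main.
    pr_sha = subprocess.run(
        ['git', 'rev-parse', 'HEAD'], capture_output=True, text=True, check=True
    ).stdout.strip()
    run_tests('main', 'results_main')
    run_tests(pr_sha, 'results_branch')
    compare_results('results_main', 'results_branch')
```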