Creating test_utils.py #113
base: dev
Provides a DataFrame with random values for logistic regression testing. Ensures that a scatter plot has been added by checking the number of collections in the axis. Makes sure that the axis limits and labels are set correctly.
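The collection-count check described above could look roughly like the sketch below. `draw_scatter` is a hypothetical stand-in for the plotting code under test (it is not the project's `plot_contour_map`), and the axis labels are only illustrative.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the test needs no display

import matplotlib.pyplot as plt
import numpy as np


def draw_scatter(ax, x, y, xlabel, ylabel):
    """Hypothetical stand-in for the plotting function under test."""
    ax.scatter(x, y)
    ax.set_xlim(x.min(), x.max())
    ax.set_ylim(y.min(), y.max())
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)


def test_draw_scatter():
    rng = np.random.default_rng(0)
    x = rng.random(20)
    y = rng.random(20)

    fig, ax = plt.subplots()
    n_before = len(ax.collections)

    draw_scatter(ax, x, y, "Predictor1", "Predictor2")

    # ax.scatter adds exactly one collection to the axis
    assert len(ax.collections) == n_before + 1

    # Axis limits and labels match what was requested
    assert np.allclose(ax.get_xlim(), (x.min(), x.max()))
    assert np.allclose(ax.get_ylim(), (y.min(), y.max()))
    assert ax.get_xlabel() == "Predictor1"
    assert ax.get_ylabel() == "Predictor2"

    plt.close(fig)
```

Comparing against the count before the call, rather than against a fixed number, keeps the check valid if the axis already holds other collections.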
…dated the fit() method
The removal of disp=False in utils.py ensures that no warnings are raised during the tests when running with the current version of statsmodels
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #113       +/-   ##
==========================================
+ Coverage    2.07%   73.63%   +71.55%
==========================================
  Files           9       12        +3
  Lines         529      880      +351
==========================================
+ Hits           11      648      +637
+ Misses        518      232      -286
Double-check that you have pre-commit running; my computer reported that two files were reformatted by the black hook.
msdbook/tests/test_utils.py
Outdated
def test_fit_logit(sample_data):
    """Test the fit_logit function."""
    predictors = ['Predictor1', 'Predictor2']
you should define this in sample_data()
msdbook/tests/test_utils.py
Outdated
# Call the plot function
contourset = plot_contour_map(
    ax, result, sample_data, contour_cmap, dot_cmap, levels, xgrid, ygrid, 'Predictor1', 'Predictor2', base=0
you should avoid hardcoding values as much as possible
I should have been clearer here. For tests, you want hardcoded values as inputs and outputs so that you can make sure the function is doing the right thing, but in general, if you're using the same value over and over, you should put it into a variable and use that variable. For example, you use 'Predictor1' and 'Predictor2' a lot, so you should put those into variables.
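A minimal sketch of that suggestion: the repeated column names become module-level constants that the sample data builder and every test can share. The `Success` outcome column, the interaction being a simple product, and the `make_sample_data` helper are assumptions for illustration; in the test file the helper would typically be wrapped as the `sample_data()` pytest fixture.

```python
import numpy as np
import pandas as pd

# Column names used throughout the tests, defined once
PREDICTOR_1 = "Predictor1"
PREDICTOR_2 = "Predictor2"
INTERACTION = "Interaction"
SUCCESS = "Success"  # hypothetical outcome column name


def make_sample_data(n_rows=50, seed=0):
    """Build a small random DataFrame for logistic regression tests."""
    rng = np.random.default_rng(seed)  # fixed seed keeps the data reproducible
    df = pd.DataFrame(
        {
            PREDICTOR_1: rng.random(n_rows),
            PREDICTOR_2: rng.random(n_rows),
        }
    )
    # Assumed definition of the interaction term: product of the predictors
    df[INTERACTION] = df[PREDICTOR_1] * df[PREDICTOR_2]
    df[SUCCESS] = rng.integers(0, 2, size=n_rows)
    return df
```

Tests can then refer to `PREDICTOR_1` and `PREDICTOR_2` instead of repeating the string literals, so renaming a column means changing one line.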
Added pytest and pytest-mock under dependencies since both are required
The sample data is now set up within the sample_data() fixture. test_plot_contour_map no longer hardcodes values; xgrid, ygrid, and levels are generated dynamically instead.
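One way the grids and levels could be derived from the data rather than typed in by hand is sketched below. The `make_grids` helper, the grid sizes, and the choice of levels spanning 0 to 1 (on the assumption that the contours show predicted probabilities) are illustrative, not taken from the PR.

```python
import numpy as np


def make_grids(x, y, n_points=50, n_levels=11):
    """Derive plotting grids from the data range (hypothetical helper)."""
    xgrid = np.linspace(np.min(x), np.max(x), n_points)
    ygrid = np.linspace(np.min(y), np.max(y), n_points)
    # Assumption: contour levels are probabilities, so they span [0, 1]
    levels = np.linspace(0.0, 1.0, n_levels)
    return xgrid, ygrid, levels
```

Because the grids follow the minimum and maximum of the inputs, the test keeps working if the fixture data changes.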
msdbook/tests/test_utils.py
# Check that parameters (coefficients) are not empty
assert result.params is not None
assert result.pvalues is not None
Is it possible to check that the parameters and values are correct?
No, the tests only confirm that the parameters and p-values are not empty or None. They do not ensure that the values are correct in a meaningful way, such as being logically reasonable or statistically valid.
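For reference, a correctness check of this kind is usually possible with synthetic data: generate outcomes from known coefficients with a fixed seed, fit the model, and assert that the estimates land near the true values. The sketch below uses a small NumPy Newton-Raphson fit (`fit_logit_newton`, a hypothetical stand-in) so that it runs on its own; in the real test the project's `fit_logit` would be called instead, and the sample size and tolerance would need tuning.

```python
import numpy as np


def fit_logit_newton(X, y, n_iter=25):
    """Minimal Newton-Raphson logistic regression (stand-in, not statsmodels)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta


def test_recovers_known_coefficients():
    rng = np.random.default_rng(42)  # fixed seed makes the test deterministic
    n = 5000
    true_beta = np.array([-0.5, 1.5, -2.0])  # intercept and two slopes

    # Design matrix: intercept column plus two standard-normal predictors
    X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(size=n)])
    prob = 1.0 / (1.0 + np.exp(-X @ true_beta))
    y = (rng.random(n) < prob).astype(float)

    estimated = fit_logit_newton(X, y)

    # Loose tolerance: estimates carry sampling noise even with n = 5000
    assert np.allclose(estimated, true_beta, atol=0.3)
```

The tolerance is deliberately loose; the goal is to catch a wrong sign or a swapped column, not to test the optimizer's precision.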
Put pytest and pytest-mock in the dev section
The column names Predictor1, Predictor2, and Interaction are now defined as variables at the start of the file: PREDICTOR_1, PREDICTOR_2, and INTERACTION. The test_empty_data function was made clearer: it first checks whether the dataframe is empty before plotting or fitting the model. The empty_df.empty check helps prevent plotting when there is no data. The test also expects a ValueError to be raised if you attempt to fit the model on an empty dataset. np.all(np.isfinite(result.pvalues)) checks that all p-values are valid numbers. assert np.any(result.pvalues < 0.05) checks that at least one of the coefficients has a p-value below 0.05. This approach makes more sense because p-values can vary in regression analysis.
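The p-value checks described above can be gathered into one helper, sketched below. `check_pvalues` and the extra range check are illustrative additions, and the example values are made up rather than taken from a fitted model.

```python
import numpy as np
import pandas as pd


def check_pvalues(pvalues, alpha=0.05):
    """Sanity checks on model p-values (hypothetical helper)."""
    pvalues = np.asarray(pvalues, dtype=float)
    # All p-values are real numbers (no NaN or inf)
    assert np.all(np.isfinite(pvalues))
    # Extra check, not in the PR: p-values are probabilities
    assert np.all((pvalues >= 0.0) & (pvalues <= 1.0))
    # At least one coefficient is significant at the chosen level
    assert np.any(pvalues < alpha)


# Made-up p-values standing in for result.pvalues
check_pvalues(pd.Series([0.001, 0.20, 0.73]))

# An empty DataFrame reports itself as empty, which guards the plotting path
empty_df = pd.DataFrame(columns=["Predictor1", "Predictor2"])
assert empty_df.empty
```

Note that the significance assertion only holds when the data carries real signal or the random seed is fixed; with purely random outcomes, it is possible for no p-value to fall below 0.05.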