-
-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add residuals_ attribute to pg.anova output #260
base: main
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## master #260 +/- ##
=======================================
Coverage 98.99% 99.00%
=======================================
Files 19 19
Lines 3290 3318 +28
Branches 527 528 +1
=======================================
+ Hits 3257 3285 +28
Misses 17 17
Partials 16 16
Continue to review full report at Codecov.
|
Hi @lahdjirayhan, This is great — thank you so much for the clean PR! Two minor requests before I merge:
Thanks again for your help, |
This looks great, thanks! I saw that you added the residuals for the Also, thanks for adding the unit testing! Can you please make sure to have at least 3 decimals in the expected (R) output? Raphael |
Some of the tests currently compare up to 2 decimal places only because of the dataset used. I'll use different dataset so it will produce 3 or more decimal places. Regarding |
Never mind, I think I found a way to add residuals to |
Unfortunately the What do you recommend me to do? @raphaelvallat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lahdjirayhan I have added a few more requests. Thanks again!
@@ -532,7 +532,8 @@ def rm_anova(data=None, dv=None, within=None, subject=None, correction='auto', | |||
|
|||
# Calculate sums of squares | |||
ss_with = ((grp_with.mean() - grandmean)**2 * grp_with.count()).sum() | |||
ss_resall = grp_with.apply(lambda x: (x - x.mean())**2).sum() | |||
resid = grp_with.apply(lambda x: (x - x.mean())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any documentation for this definition of the residuals? If so, can you add as a comment
@@ -938,9 +956,9 @@ def anova(data=None, dv=None, between=None, ss_type=2, detailed=False, | |||
ssbetween = ((grp.mean() - data[dv].mean())**2 * grp.count()).sum() | |||
# Within effect (= error between) | |||
# = (grp.var(ddof=0) * grp.count()).sum() | |||
sserror = grp.apply(lambda x: (x - x.mean())**2).sum() | |||
error = grp.apply(lambda x: x - x.mean()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, do you have any documentation / reference implementation for this method of calculating the residuals?
return _postprocess_dataframe(aov) | ||
aov = _postprocess_dataframe(aov) | ||
|
||
aov.residuals_ = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, we could use pandas.DataFrame.attrs, e.g:
df.attrs = dict(residuals=resid)
although I have never tried it so I don't know if it works well
pingouin/tests/test_parametric.py
Outdated
dv='Cholesterol', between=['Risk'] | ||
).residuals_ | ||
array_equal( | ||
resid[0:5].round(3), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed that you are always testing only the first 5 values of the residuals. We need to test all values, i.e. array_equal on all the residuals
Ping @lahdjirayhan. I'd love to release a new version of Pingouin in July. Do you think you'll have time to answer my comments before then? Thank you! |
Are we still waiting for an update on this for residuals? |
I think there is some work required on this PR before we can merge it (see my comments). |
#118