---
title: "The sleepstudy: Speed - for a change ..."
jupyter: julia-1.9
---
## Background
@Belenky2003 reported effects of sleep deprivation across a 14-day study of 30-to-40-year-old men and women holding commercial vehicle driving licenses. Their analyses are based on a subset of tasks and ratings from a very large and comprehensive test and questionnaire battery [@Balkin2000].
Initially, 66 subjects were assigned to one of four time-in-bed (TIB) groups with 9 hours (22:00-07:00) of sleep augmentation or 7 hours (24:00-07:00), 5 hours (02:00-07:00), or 3 hours (04:00-07:00) of sleep restriction per night, respectively. The final sample comprised 56 subjects. The Psychomotor Vigilance Test (PVT) measures simple reaction time to a visual stimulus, presented approximately 10 times per minute (the interstimulus interval varied from 2 to 10 s in 2-s increments) for 10 min and implemented in a thumb-operated, hand-held device [@Dinges1985].
**Design**
The study comprised 2 training days (T1, T2), one day with baseline measures (B), seven days with sleep deprivation (E1 to E7), and four recovery days (R1 to R4). T1 and T2 were devoted to training on the performance tests and familiarization with study procedures. PVT baseline testing commenced on the morning of the third day (B) and testing continued for the duration of the study (E1–E7, R1–R3; no measures were taken on R4). Bed times during T, B, and R days were 8 hours (23:00-07:00).
**Test schedule within days**
The PVT (along with the Stanford Sleepiness Scale) was administered as a battery four times per day (09:00, 12:00, 15:00, and 21:00 h); the battery included other tests not reported here [see @Balkin2000]. The sleep latency test was administered at 09:40 and 15:30 h for all groups. Subjects in the 3- and 5-h TIB groups performed an additional battery at 00:00 h and 02:00 h to occupy their additional time awake. The PVT and SSS were administered in this battery; however, as data from the 00:00 and 02:00 h sessions were not common to all TIB groups, these data were not included in the statistical analyses reported in the paper.
**Statistical analyses**
The authors analyzed response speed, that is (1/RT)*1000 -- a transformation completely warranted according to a Box-Cox check of the current data -- with mixed-model ANOVAs using group as a between-subject and day as a within-subject factor. The ANOVA was followed up with simple tests of the design effects over days for each of the four groups.
**Current data**
The current data distributed with the _RData_ collection are attributed to the 3-hour TIB group, but the means do not agree at all with those reported for this group in [@Belenky2003 Figure 3], where the 3-hour TIB group is also based on only 13 (not 18) subjects. Specifically, the current data show a much smaller slow-down of response speed across E1 to E7 and do not reflect the recovery during R1 to R3. The current data also cover only 10, not 11, days, but it looks like only R3 is missing. The closest match to the current means was the average of the 3-hour and 7-hour TIB groups; if only males were included, this would amount to 18 subjects. (This conjecture is based only on visual inspection of graphs.)
## Setup
First we attach the packages needed, set up the plotting backend, read the data, and get everything in the desired shape.
```{julia}
#| code-fold: true
using CairoMakie       # device driver for static (SVG, PDF, PNG) plots
using Chain            # the @chain macro: like pipes, but cleaner
using DataFrameMacros  # @transform! and related macros for DataFrames
using DataFrames       # DataFrame, groupby, combine, describe
using MixedModels      # mixed-model fitting; also provides the sleepstudy dataset
using MixedModelsMakie # plots specific to mixed-effects models using Makie
using ProgressMeter    # progress display during model fitting

CairoMakie.activate!(; type="svg")      # render figures as SVG
ProgressMeter.ijulia_behavior(:clear);  # avoid stacking progress bars in notebooks
```
## Preprocessing
The `sleepstudy` data are one of the datasets available with recent versions of the `MixedModels` package. We carry out some preprocessing to have the dataframe in the desired shape:
- Rename the grouping factor to `Subj` (capitalized) and `days` to `day`.
- Compute `speed` as an alternative dependent variable from `reaction`, warranted by a Box-Cox check of residuals (see the sketch after the preprocessing code below).
- Create a `GroupedDataFrame` by levels of `Subj` (the original dataframe is available as `gdf.parent`, which we name `df`).
```{julia}
gdf = @chain MixedModels.dataset(:sleepstudy) begin
DataFrame
rename!(:subj => :Subj, :days => :day)
@transform!(:speed = 1000 / :reaction)
groupby(:Subj)
end
```
```{julia}
df = gdf.parent
describe(df)
```
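For the record, here is a minimal sketch of such a Box-Cox check -- not the authors' original analysis. It profiles the log-likelihood of the transformation parameter λ for a least-squares fit of the transformed `reaction` on `day`, pooling the data and ignoring the repeated-measures structure. The helper `boxcox_loglik` is defined here, not taken from any of the attached packages; a maximizing λ near -1 supports the reciprocal (speed) transformation.
```{julia}
#| code-fold: true
using Statistics: mean

# Profiled log-likelihood of the Box-Cox parameter λ for a least-squares
# fit of the (geometric-mean normalized) transformed response on X.
function boxcox_loglik(λ, y, X)
    n = length(y)
    gm = exp(mean(log, y))  # geometric mean of y normalizes the transform
    z = λ == 0 ? gm .* log.(y) : (y .^ λ .- 1) ./ (λ * gm^(λ - 1))
    r = z .- X * (X \ z)    # residuals from the least-squares fit
    return -n / 2 * log(sum(abs2, r) / n)
end

X = hcat(ones(nrow(df)), df.day)  # design matrix: intercept and day
λs = -2.0:0.1:2.0
λs[argmax([boxcox_loglik(λ, df.reaction, X) for λ in λs])]
```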
## Estimates for pooled data
In the first analysis we ignore the dependency of observations due to repeated measures from the same subjects. We pool all the data and estimate the regression of the 180 speed scores on the ten days (0 to 9) of the experiment.
```{julia}
pooledcoef = simplelinreg(df.day, df.speed) # produces a Tuple
```
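As a cross-check, and assuming `simplelinreg` returns the pair (intercept, slope), the same coefficients can be reproduced with ordinary least squares via the backslash operator:
```{julia}
#| code-fold: true
let X = hcat(ones(nrow(df)), df.day)  # design matrix: intercept and day
    X \ df.speed                      # least-squares estimates: intercept, slope
end
```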
## Within-subject effects
In the second analysis we estimate coefficients for each `Subj` without regard to the information available from the complete set of data. We do not "borrow strength" to adjust for differences due to between-`Subj` variability or for being far from the population mean.
### Within-subject simple regressions
Applying `combine` to a grouped data frame like `gdf` produces a `DataFrame` with a row for each group.
The permutation `ord`, computed in the plotting code below, orders the groups by increasing intercept (the predicted response speed at day 0).
```{julia}
within = combine(gdf, [:day, :speed] => simplelinreg => :coef)
```
@fig-xyplotsleepspeed shows the reaction speed versus days of sleep deprivation by subject.
The panels are arranged by increasing initial reaction speed starting at the lower left and proceeding across rows.
```{julia}
#| code-fold: true
#| label: fig-xyplotsleepspeed
#| fig-cap: "Reaction speed (s⁻¹) versus days of sleep deprivation by subject"
let
ord = sortperm(first.(within.coef))
labs = values(only.(keys(gdf)))[ord] # labels for panels
f = clevelandaxes!(Figure(; resolution=(1000, 750)), labs, (2, 9))
for (axs, sdf) in zip(f.content, gdf[ord]) # iterate over the panels and groups
scatter!(axs, sdf.day, sdf.speed) # add the points
coef = simplelinreg(sdf.day, sdf.speed)
abline!(axs, first(coef), last(coef)) # add the regression line
end
f
end
```
## Basic LMM
```{julia}
contrasts = Dict(:Subj => Grouping())
m1 = let
form = @formula speed ~ 1 + day + (1 + day | Subj)
fit(MixedModel, form, df; contrasts)
end
```
This model includes fixed effects for the intercept, which estimates the average speed on the baseline day of the experiment prior to sleep deprivation, and for the slope, which estimates the change in speed per day of sleep deprivation -- in this case about -0.11/s, that is, a slowing.
The random effects represent shifts from the typical behavior for each subject. The shift in the intercept has a standard deviation of about 0.42/s.
The within-subject correlation of the random effects for intercept and slope is small, -0.18, indicating that a simpler model with the correlation parameter (CP) constrained to zero may be sufficient.
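The quantities cited above can be read off the fitted model directly:
```{julia}
#| code-fold: true
let
    @show coef(m1)  # fixed-effects estimates: intercept and slope
    VarCorr(m1)     # standard deviations and CP of the random effects
end
```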
## No correlation parameter: zcp LMM
Applying the `zerocorr` function to the random-effects term yields a model with one parameter fewer than LMM `m1` -- the CP is now fixed at zero.
```{julia}
m2 = let
form = @formula speed ~ 1 + day + zerocorr(1 + day | Subj)
fit(MixedModel, form, df; contrasts)
end
```
LMM `m2` has a slightly lower log-likelihood than LMM `m1`, but also one fewer parameter.
A likelihood-ratio test is used to compare these nested models.
```{julia}
#| code-fold: true
MixedModels.likelihoodratiotest(m2, m1)
```
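The same statistic can also be computed by hand as twice the difference in log-likelihoods, referred to a χ² distribution with one degree of freedom. This sketch assumes the Distributions package is available in the project environment:
```{julia}
#| code-fold: true
using Distributions: Chisq, ccdf  # assumed to be in the project environment

dev = 2 * (loglikelihood(m1) - loglikelihood(m2))  # LRT statistic
(statistic=dev, pvalue=ccdf(Chisq(dof(m1) - dof(m2)), dev))
```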
Alternatively, the AIC, AICc, and BIC values can be compared. They are on a scale where "smaller is better". All three model-fit statistics prefer the zcpLMM `m2`.
```{julia}
#| code-fold: true
let
mods = [m2, m1]
DataFrame(;
dof=dof.(mods),
deviance=deviance.(mods),
AIC=aic.(mods),
AICc=aicc.(mods),
BIC=bic.(mods),
)
end
```
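As a cross-check on the definitions, the AIC and BIC values in the table can be reproduced from the log-likelihood ℓ, the number of parameters k, and the number of observations n:
```{julia}
#| code-fold: true
let m = m2, k = dof(m), n = nobs(m), ℓ = loglikelihood(m)
    (AIC=-2ℓ + 2k, BIC=-2ℓ + k * log(n))  # should match the table row for m2
end
```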
## Conditional modes of the random effects
The third set of estimates of the per-subject effects are the conditional modes of the random effects.
They represent a compromise between each subject's own data and the model parameters. When the distributional assumptions hold, predictions based on these estimates are more accurate than either the pooled or the within-subject estimates.
Here we "borrow strength" to improve the accuracy of prediction.
## Caterpillar plots (effect profiles)
```{julia}
#| code-fold: true
#| fig-cap: "Prediction intervals on the random effects in model m2"
#| label: fig-caterpillarm2
caterpillar(m2)
```
## Shrinkage plot
```{julia}
#| code-fold: true
#| fig-cap: "Shrinkage plot of the means of the random effects in model m2"
#| label: fig-shrinkagem2
shrinkageplot!(Figure(; resolution=(500, 500)), m2)
```
# References
::: {#refs}
:::