10.Rmd

# Multicategorical Focal Antecedents and Moderators

```{r set-options_10, echo = FALSE, cache = FALSE}
options(width = 100)
```

> In this chapter, [Hayes] extend[ed] the principles of moderation analysis described in Chapters 7 and 8 to testing interaction involving a multicategorical focal antecedent variable or moderator. As you will see, the principles discussed in those chapters generalize quite readily, although the model necessarily requires more than one product to capture an interaction between two variables. This makes the formulas a bit more complex, and the visualizing and probing process a bit more involved. But with comfort with the fundamentals described so far, you should not find it difficult to master this extension of multiple regression analysis. [@hayesIntroductionMediationModeration2018, p. 350]

## Moderation of the effect of a multicategorical antecedent variable

Take the case of a continuous or dichotomous moderator $W$ and a multicategorical $X$ "with $g$ groups, include $g − 1$ variables coding membership in the groups, the moderator variable $W$, and $g − 1$ products between the $g − 1$ group codes and moderator $W$ in a regression model" (p. 351) following the form

$$
Y = i_Y + \sum_{i = 1}^{g - 1} b_i D_i + b_g W + \sum_{j = g + 1}^{2g - 1} b_j D_{j - g} W + e_Y,
$$

where $D_i$ denotes the $i$^th^ dummy variable. Given the case where $g = 4$, that formula can be re-expressed as

\begin{align*}
Y & = i_Y + b_1 D_1 + b_2 D_2 + b_3 D_3 + b_4 W + 
b_5 D_1 W + b_6 D_2 W + b_7 D_3 W + e_Y, \;\;\;\text{or} \\ 
& = i_Y + (b_1 + b_5 W) D_1 + (b_2 + b_6 W) D_2 + (b_3 + b_7 W) D_3 + b_4 W + e_Y.
\end{align*}

## An example from the sex disrimination in the workplace study

Here we load a couple necessary packages, load the data, and take a `glimpse()`.

```{r, warning = F, message = F}
library(tidyverse)

protest <- read_csv("data/protest/protest.csv")

glimpse(protest)
```

With a little `if_else()`, computing the dummies `d1` and `d2` is easy enough.

```{r}
protest <- protest %>% 
  mutate(d1 = if_else(protest == 1, 1, 0),
         d2 = if_else(protest == 2, 1, 0))
```

Load **brms**.

```{r, message = F, warning = F}
library(brms)
```

With `model10.1` and `model10.2` we fit the multicategorical multivariable model and the multicategorical moderation models, respectively.

```{r model10.1}
model10.1 <- brm(
  data = protest, 
  family = gaussian,
  liking ~ 1 + d1 + d2 + sexism,
  cores = 4,
  file = "fits/model10.01")

model10.2 <- update(
  model10.1,
  newdata = protest,
  liking ~ 1 + d1 + d2 + sexism + d1:sexism + d2:sexism,
  cores = 4,
  file = "fits/model10.02")
```

Behold the $R^2$ summaries.

```{r, message = F}
r2 <- tibble(`Model 10.1` = bayes_R2(model10.1, summary = F)[, 1],
             `Model 10.2` = bayes_R2(model10.2, summary = F)[, 1]) %>% 
  mutate(`The R2 difference` = `Model 10.2` - `Model 10.1`)
  
r2 %>% 
  pivot_longer(everything())  %>% 
  # this line isn't necessary, but it sets the order the summaries appear in
  mutate(name = factor(name, levels = c("Model 10.1", "Model 10.2", "The R2 difference"))) %>% 
  group_by(name) %>% 
  summarize(mean   = mean(value),
            median = median(value),
            ll     = quantile(value, probs = .025),
            ul     = quantile(value,  probs = .975)) %>% 
  mutate_if(is.double, round, digits = 3)
```

Interestingly, even though our posterior means and medians for the model-specific $R^2$ values differed some from the OLS estimates in the text, their difference corresponded quite nicely to the one in the text. Let's take a look at their distributions.

```{r, fig.width = 8, fig.height = 2.25}
r2 %>% 
  pivot_longer(everything()) %>% 
  ggplot(aes(x = value)) +
  geom_density(linewidth = 0, fill = "grey33") +
  scale_y_continuous(NULL, breaks = NULL) +
  facet_wrap(~ name, scales = "free_y") +
  theme_minimal()
```

The model coefficient summaries cohere well with those in Table 10.1.

```{r}
print(model10.1, digits = 3)
print(model10.2, digits = 3)
```

## Visualizing the model

To get our version of the values in Table 10.2, we'll first recreate columns for $d_1$ through $W$ (SEXISM) and save then as a tibble, `nd`.

```{r}
(
  nd <- tibble(d1 = c(0, 1, 0),
               d2 = c(0, 0, 1)) %>% 
    expand_grid(sexism = quantile(protest$sexism, probs = c(.16, .5, .84)))
)
```

With `nd` in hand, we'll feed the predictor values into `fitted()` for the typical posterior summaries.

```{r}
fitted(model10.2, newdata = nd) %>% round(digits = 3)
```

The values in our `Estimate` column correspond to those in the $\hat Y$ column in the table. We, of course, add summaries of uncertainty to the point estimates.

If we want to make a decent line plot for our version of Figure 10.3, we'll need many more values for `sexism`, which will appear on the $x$-axis.

```{r}
nd <- tibble(d1 = c(0, 1, 0),
             d2 = c(0, 0, 1)) %>% 
  expand_grid(sexism = seq(from = 3.5, to = 6.5, length.out = 30))
```

This time we'll save the results from `fitted()` as a tlbble and wrangle a bit to get ready for the plot.

```{r}
f <- fitted(model10.2, 
            newdata = nd, 
            probs = c(.025, .25, .75, .975)) %>% 
  data.frame() %>% 
  bind_cols(nd) %>% 
  mutate(condition = if_else(d1 == 1, "Individual Protest",
                             if_else(d2 == 1, "Collective Protest", "No Protest"))) %>% 
  # this line is not necessary, but it will help order the facets of the plot
  mutate(condition = factor(condition, levels = c("No Protest", "Individual Protest", "Collective Protest"))) 

glimpse(f)
```

For Figure 10.3 and many to follow for this chapter, we'll superimpose 50% intervals on top of 95% intervals.

```{r, fig.width = 10, fig.height = 3.5}
# this will help us add the original data points to the plot
protest <- protest %>% 
  mutate(condition = ifelse(protest == 0, "No Protest",
                            ifelse(protest == 1, "Individual Protest", 
                                   "Collective Protest"))) %>% 
  mutate(condition = factor(condition, levels = c("No Protest", "Individual Protest", "Collective Protest"))) 
  

# this will help us with the x-axis
breaks <- tibble(values = quantile(protest$sexism, probs = c(.16, .5, .84))) %>% 
  mutate(labels = values %>% round(digits = 2) %>% as.character())

# Here we plot
f %>% 
  ggplot(aes(x = sexism)) +
  geom_ribbon(aes(ymin = Q2.5, ymax = Q97.5),
              alpha = 1/3) +
  geom_ribbon(aes(ymin = Q25, ymax = Q75),
              alpha = 1/3) +
  geom_line(aes(y = Estimate)) +
  geom_point(data = protest,
             aes(y = liking),
             size = 2/3) +
  scale_x_continuous(breaks = breaks$values,
                     labels = breaks$labels) +
  coord_cartesian(xlim = c(4, 6),
                  ylim = c(2.5, 7.2)) +
  labs(x = expression(paste("Perceived Pervasiveness of Sex Discrimination in Society (", italic(W), ")")),
       y = "Evaluation of the Attorney") +
  theme_minimal() +
  facet_wrap(~ condition)
```

By adding the data to the plots, they are both more informative and now serve as a posterior predictive check.

## Probing the interaction

These will involve both omnibus tests and pairwise comparisons.

### The pick-a-point approach.

"The pick-a-point approach requires you to choose values of the moderator $W$ and then estimate the conditional effect of $X$ on $Y$ at those values and ~~conduct an inferential test~~" [evaluate the posterior distribution] (p. 368).

#### Omnibus inference.

Hayes used the omnibus testing framework to assess how important coefficients $b_1$ and $b_2$ were to our interaction model, `model1`. Before fitting the models, he discussed why he preferred to fit models after centering `sexism` (i.e., $W$) to 4.25. Here we'll call our centered variable `sexism_p`, where `_p` stands in for "prime".

```{r}
protest <- protest %>% 
  mutate(sexism_p = sexism - 4.25)
```

From here on, `model10.3` is the moderation model without the lower-order `d1` and `d2` terms; `model10.4` is the full moderation model. But we're going to be fitting both these models three different ways, based on how we center` sexism`. So for this first set where we centered `sexism` on 4.25, we'll give them the suffix `a`.

```{r model10.3a}
# the model without d1 + d2
model10.3a <- update(
  model10.2,
  newdata = protest,
  liking ~ 1 + sexism_p + d1:sexism_p + d2:sexism_p,
  cores = 4,
  file = "fits/model10.03a")

# the full model with d1 + d2
model10.4a <- update(
  model10.2,
  newdata = protest,
  liking ~ 1 + d1 + d2 + sexism_p + d1:sexism_p + d2:sexism_p,
  cores = 4,
  file = "fits/model10.04a")
```

The coefficient summaries for `model10.4a` correspond to the top section of Table 10.3 (p. 373).

```{r}
fixef(model10.4a) %>% round(digits = 3)
```

We can compare their Bayesian $R^2$ distributions like we usually do.

```{r, message = F, warning = F}
library(tidybayes)

r2 <- tibble(`Model without d1 + d2` = bayes_R2(model10.3a, summary = F)[, 1],
             `Model with d1 + d2`    = bayes_R2(model10.4a, summary = F)[, 1]) %>% 
  mutate(`The R2 difference` = `Model with d1 + d2` - `Model without d1 + d2`)
  
r2 %>% 
  pivot_longer(everything()) %>% 
  mutate(name = factor(name, levels = c("Model without d1 + d2", "Model with d1 + d2", "The R2 difference"))) %>% 
  group_by(name) %>% 
  median_qi(value) %>% 
  mutate_if(is.double, round, digits = 3)
```

Our results differ a bit from those in the text (p. 370), but the substantive interpretation is the same. The `d1` and `d2` parameters added little predictive power to the model in terms of $R^2$. We can also use information criteria to compare the models. Here are the results from using the LOO-CV.

```{r}
model10.3a <- add_criterion(model10.3a, criterion = "loo")
model10.4a <- add_criterion(model10.4a, criterion = "loo")

loo_compare(model10.3a, model10.4a) %>% 
  print(simplify = F)
```

The LOO-CV difference between the two models was pretty small. Thus, the LOO-CV gives the same general message as the $R^2$. The `d1` and `d2` parameters were sufficiently small and uncertain enough that constraining them to zero did little in terms of reducing the explanatory power of the statistical model.

Here's the same thing all over again, but this time after centering `sexism` on 5.120.

```{r}
protest <- protest %>% 
  mutate(sexism_p = sexism - 5.120)
```

Now fit the models.

```{r model10.3b}
# the model without d1 + d2
model10.3b <- update(
  model10.2,
  newdata = protest,
  liking ~ 1 + sexism_p + d1:sexism_p + d2:sexism_p,
  cores = 4,
  file = "fits/model10.03b")

# the full model with d1 + d2
model10.4b <- update(
  model10.2,
  newdata = protest,
  liking ~ 1 + d1 + d2 + sexism_p + d1:sexism_p + d2:sexism_p,
  cores = 4,
  file = "fits/model10.04b")
```

These coefficient summaries correspond to the middle section of Table 10.3 (p. 373).

```{r}
fixef(model10.4b) %>% round(digits = 3)
```

Here are the Bayesian $R^2$ summaries and the summary for their difference.

```{r, warning = F}
r2 <- tibble(`Model without d1 + d2` = bayes_R2(model10.3b, summary = F)[, 1],
             `Model with d1 + d2`    = bayes_R2(model10.4b, summary = F)[, 1]) %>%
  mutate(`The R2 difference` = `Model with d1 + d2` - `Model without d1 + d2`)

r2 %>% 
  pivot_longer(everything()) %>% 
  mutate(name = factor(name, levels = c("Model without d1 + d2", "Model with d1 + d2", "The R2 difference"))) %>% 
  group_by(name) %>% 
  median_qi(value) %>% 
  mutate_if(is.double, round, digits = 3)
```

This time, our $\Delta R^2$ distribution was more similar to the results Hayes reported in the text (p. 370, toward the bottom).

Here's the updated LOO-CV.

```{r}
model10.3b <- add_criterion(model10.3b, criterion = "loo")
model10.4b <- add_criterion(model10.4b, criterion = "loo")

loo_compare(model10.3b, model10.4b) %>% 
  print(simplify = F)
```

Here again our Bayesian $R^2$ and `loo()` results cohere, both suggesting the `d1` and `d2` parameters were of little predictive utility. Note how this differs a little from the second $F$-test on page 370.

Here's what happens when we center `sexism` on 5.896. First center.

```{r}
protest <- protest %>% 
  mutate(sexism_p = sexism - 5.896)
```

Fit the models.

```{r model10.3c}
# the model without d1 + d2
model10.3c <- update(
  model10.2,
  newdata = protest,
  liking ~ 1 + sexism_p + d1:sexism_p + d2:sexism_p,
  cores = 4,
  file = "fits/model10.03c")

# the full model with d1 + d2
model10.4c <- update(
  model10.2,
  newdata = protest,
  liking ~ 1 + d1 + d2 + sexism_p + d1:sexism_p + d2:sexism_p,
  cores = 4,
  file = "fits/model10.04c")
```

These coefficient summaries correspond to the lower section of Table 10.3 (p. 373).

```{r}
fixef(model10.4c) %>% round(digits = 3)
```

Again, compute the $R^2$ distributions and their difference-score distribution.

```{r, warning = F}
r2 <- tibble(`Model without d1 + d2` = bayes_R2(model10.3c, summary = F)[, 1],
             `Model with d1 + d2`    = bayes_R2(model10.4c, summary = F)[, 1]) %>%
  mutate(`The R2 difference` = `Model with d1 + d2` - `Model without d1 + d2`)

r2 %>% 
  pivot_longer(everything()) %>% 
  mutate(name = factor(name, levels = c("Model without d1 + d2", "Model with d1 + d2", "The R2 difference"))) %>% 
  group_by(name) %>% 
  median_qi(value) %>% 
  mutate_if(is.double, round, digits = 3)
```

That $\Delta R^2$ distribution matches up nicely with the one Hayes reported at the bottom of page 370. Now compare the models with the LOO.

```{r}
model10.3c <- add_criterion(model10.3c, "loo")
model10.4c <- add_criterion(model10.4c, "loo")

loo_compare(model10.3c, model10.4c) %>% 
  print(simplify = F)
```

Although our Bayesian $R^2$ difference is now predominantly positive, the LOO-CV difference for the two models remains uncertain. Here's a look at the two parameters in question using a handmade coefficient plot.

```{r, fig.width = 6, fig.height = 1, warning = F}
as_draws_df(model10.4c) %>% 
  pivot_longer(b_d1:b_d2) %>% 
  mutate(name = str_remove(name, "b_")) %>% 
  
  ggplot(aes(x = value, y = name)) +
  stat_summary(fun = median,
               fun.min = function(i) quantile(i, probs = .025),
               fun.max = function(i) quantile(i, probs = .975),
               color = "grey33") +
  stat_summary(geom = "linerange",
               fun.min = function(i) quantile(i, probs = .25),
               fun.max = function(i) quantile(i, probs = .75),
               color = "grey33",
               linewidth = 1.25) +
  ylab(NULL) +
  coord_cartesian(xlim = c(0, 2)) +
  theme_minimal()
```

For Figure 10.4, we'll drop our faceting approach and just make one big plot. Heads up: I'm going to drop the 50% intervals from this plot. They'd just make it too busy.

```{r, fig.width = 6, fig.height = 5}
f %>% 
  ggplot(aes(x = sexism, y = Estimate, ymin = Q2.5, ymax = Q97.5, alpha = condition)) +
  geom_ribbon() +
  geom_line() +
  scale_alpha_manual(values = c(.2, .5, .8)) +
  scale_x_continuous(breaks = breaks$values,
                     labels = breaks$labels) +
  coord_cartesian(xlim = c(4, 6),
                  ylim = c(4.5, 6.7)) +
  labs(x = expression("Perceived Pervasiveness of Sex Discrimination in Society "*(italic(W))),
       y = "Evaluation of the Attorney") +
  theme_minimal() +
  theme(legend.direction = "vertical",
        legend.position = "top",
        legend.title = element_blank())
```

#### Pairwise inference.

Hayes continues to reference Table 10.3. In the last subsection, we reproduced those results one model at a time. Why not practice doing it altogether? There are a lot of ways you could do this. A good first try is to extend the `fixef()` approach from before with a little help from `bind_rows()`.

```{r}
bind_rows(
  # start with `model4a`
  fixef(model10.4a) %>% 
    data.frame() %>% 
    rownames_to_column("parameter"),
  # add `model4b`
  fixef(model10.4b) %>% 
    data.frame() %>% 
    rownames_to_column("parameter"),
  # add `model4c`
  fixef(model10.4c) %>% 
    data.frame() %>% 
    rownames_to_column("parameter")
) %>% 
  # wrangle a bit
  mutate(`w'` = str_c("w - ", c(4.25, 5.12, 5.896)) %>% rep(., each = 6)) %>% 
  select(`w'`, everything()) %>% 
  mutate_if(is.double, round, digits = 3)
```

This code works okay, but it's redundant. Here's a streamlined approach where we use a combination of nested tibbles and the `purrr::map()` function to work with our three model fits--`model10.4a`, `model10.4b`, and `model10.4c`--in bulk.

```{r}
t <- tibble(`w'` = str_c("w - ", c(4.25, 5.12, 5.896)),
            name = str_c("model10.4", letters[1:3])) %>% 
  mutate(fit = map(name, get)) %>% 
  mutate(s = map(fit, ~fixef(.) %>% 
                   data.frame() %>% 
                   rownames_to_column("parameter"))) %>% 
  unnest(s) %>% 
  select(`w'`, parameter:Q97.5)

t %>% 
  mutate_if(is.double, round, digits = 3)
```

Summary tables like this are precise and very common in the literature. But you can get lost in all those numbers. A coefficient plot can be better. This first version is pretty close to the Table 10.3 format.

```{r, fig.width = 8, fig.height = 1.75}
t %>% 
  # this will help us order our y-axis
  mutate(parameter = factor(parameter,
                            levels = c("d2:sexism_p", "d1:sexism_p", 
                                       "sexism_p", "d2", "d1", "Intercept")),
         # this is just for aesthetics
         `w'` = str_c("w' = ", `w'`)) %>% 
  
  # plot!
  ggplot(aes(x = Estimate, xmin = Q2.5, xmax = Q97.5, y = parameter)) +
  geom_pointrange() +
  labs(x = NULL,
       y = NULL) +
  theme_minimal() +
  theme(axis.text.y = element_text(hjust = 0)) +
  facet_wrap(~ `w'`, nrow = 1)
```

Notice how this arrangement makes it easiest to compare coefficients within models. If we wanted to make it easier to compare coefficients across models, we might arrange the plot like so.

```{r, fig.width = 6, fig.height = 5}
t %>% 
  # this will help us order our y-axis
  mutate(parameter = factor(parameter,
                            levels = c("Intercept", "d1", "d2", "sexism_p", "d1:sexism_p", "d2:sexism_p"))) %>% 
  
  # plot!
  ggplot(aes(x = Estimate, xmin = Q2.5, xmax = Q97.5, y = `w'`)) +
  geom_pointrange() +
  labs(x = NULL,
       y = NULL) +
  theme_minimal() +
  theme(axis.text.y = element_text(hjust = 0)) +
  facet_wrap(~ parameter, ncol = 1)
```

Oh man--with sweet plots like these, who needs tables! This makes it much easier to see what happened as we changed values we centered `sexism` on. In the middle paragraph on page 374, Hayes pointed out "that $b_1$ and $b_2$ differ in these analyses, but $b_3$, $b_4$, and $b_5$ are unaffected by the centering". Our coefficient plot clarified that in a way I don't think a table ever could. But before we move on, let's back up a little in the text.

"To make this more concrete, consider the effect of Catherine's behavior on how she is perceived among people who are relatively high in their perceptions of the pervasiveness of sex discrimination in society" (p. 372). For this, Hayes defined "relatively high" as $W = 5.896$. To get those estimates for each condition, we'll use `fitted()`. Since the number of unique predictor values is small for this example, we'll just plug them directly into the `newdata` argument rather than first saving them as a `nd` object.

```{r}
fitted(model10.2,
       newdata = tibble(d1 = c(0, 1, 0),
                        d2 = c(0, 0, 1),
                        sexism = 5.896)) %>% 
  round(digits = 3)
```

Those posterior summaries match up nicely with the point estimates Hayes presented at the bottom of page 372. Hayes further expounded:

> So by using the regression centering strategy described earlier in the context of an omnibus test of equality of values of $\hat Y$, the regression coefficients $b_1$ and $b_2$ provide pairwise inferences consistent with the coding system used to represent the three groups, conditioned on the value that $W$ is centered around.

In the next few sentences, he focused on what happened when $W = 4.250$ (i.e., in `model4a`). Recall that the two coefficients in question, $b_1$ and $b_2$, are named `d1` and `d2` when we pull their summaries with `fixef()`.

```{r}
fixef(model10.4a)[c("d1", "d2"), ] %>% 
  round(digits = 3)
```

Hayes then clarified that in this model

\begin{align*}
b_1 & = \theta_{D_1 \rightarrow Y} | (W = 4.250) = 5.400 - 5.698 = -0.299 \;\;\; \text{ and} \\
b_2 & = \theta_{D_2 \rightarrow Y} | (W = 4.250) = 5.513 - 5.698 = -0.185.
\end{align*}

That is, it is the same as a difference score of each of the experimental conditions minus the "No protest" condition. To further show the difference-score quality of these coefficients, we can continue using `fitted()` in conjunction with the original `model10.2` to get the group comparisons for when $W = 4.250$. Since these involve computing difference scores, we'll have to use `summary = F` and do some wrangling.

```{r, warning = F}
fitted(model10.2,
       newdata = tibble(d1     = c(0, 1, 0),
                        d2     = c(0, 0, 1),
                        sexism = 4.25),
       summary = F) %>% 
  data.frame() %>% 
  set_names("No Protest", "Individual Protest", "Collective Protest") %>% 
  mutate(difference_a = `Individual Protest` - `No Protest`,
         difference_b = `Collective Protest` - `No Protest`) %>% 
  pivot_longer(everything()) %>% 
  mutate(name = factor(name, levels = c("No Protest", "Individual Protest", "Collective Protest", 
                                        "difference_a", "difference_b"))) %>% 
  group_by(name) %>% 
  mean_qi(value) %>%
  select(name:.upper) %>% 
  mutate_if(is.double, round, digits = 3)
```

Within simulation variance, `difference_a` is the same as $b_{1 | \text{model10.4a}}$ and `difference_b` is the same as $b_{2 | \text{model10.4a}}$. Here's the same thing for when $W = 5.120$.

```{r, warning = F}
fitted(model10.2,
       newdata = tibble(d1     = c(0, 1, 0),
                        d2     = c(0, 0, 1),
                        sexism = 5.120),
       summary = F) %>% 
  data.frame() %>% 
  set_names("No Protest", "Individual Protest", "Collective Protest") %>% 
  mutate(difference_a = `Individual Protest` - `No Protest`,
         difference_b = `Collective Protest` - `No Protest`) %>% 
  pivot_longer(everything()) %>% 
  mutate(name = factor(name, levels = c("No Protest", "Individual Protest", "Collective Protest",
                                        "difference_a", "difference_b"))) %>% 
  group_by(name) %>% 
  mean_qi(value) %>%
  select(name:.upper) %>% 
  mutate_if(is.double, round, digits = 3)
```

Finally, here it is for when $W = 5.986$.

```{r, warning = F}
fitted(model10.2,
       newdata = tibble(d1     = c(0, 1, 0),
                        d2     = c(0, 0, 1),
                        sexism = 5.986),
       summary = F) %>% 
  data.frame() %>% 
  set_names("No Protest", "Individual Protest", "Collective Protest") %>% 
  mutate(difference_a = `Individual Protest` - `No Protest`,
         difference_b = `Collective Protest` - `No Protest`) %>% 
  pivot_longer(everything()) %>% 
  mutate(name = factor(name, levels = c("No Protest", "Individual Protest", "Collective Protest", 
                                        "difference_a", "difference_b"))) %>% 
  group_by(name) %>% 
  mean_qi(value) %>%
  select(name:.upper) %>% 
  mutate_if(is.double, round, digits = 3)
```

### The Johnson-Neyman technique.

> As discussed in section 7.4, a problem with the pick-a-point approach to probing an interaction is having to choose values of the moderator. When the moderator is a continuum, you may not have any basis for choosing some values rather than others, and the choice you make will certainly influence the results of the probing exercise to some extent... Actively choosing a different system or con- vention, such as using the sample mean of $W$, a standard deviation below the mean, and a standard deviation above the mean also does not eliminate the problem. But the Johnson–Neyman (JN) technique avoids this problem entirely. (p. 376)

#### Omnibus inference.

Consider the first sentence of the section:

> Applied to probing an interaction between a multicategorical $X$ and a continuous $W$, an omnibus version of the JM technique involves finding the value or values of $W$ where their $F$-ratio comparing the $g$ estimated values of $Y$ is just statistically significant. (p. 376)

Since we're not using $F$-tests with our approach to Bayesian modeling, the closest we might have is a series of $R^2$ difference tests, which would require refitting the model multiple times over many ways of centering the $W$-variable, `sexism`. I suppose you could do this if you wanted, but it just seems silly, to me. I'll leave this one up to the interested reader.

#### Pairwise inference.

Hayes didn't make plots for this section, but if you're careful constructing your `nd` and with the subsequent wrangling, you can make the usual plots. Since we have two conditions we'd like to compare with *No Protest*, we'll make two plots. Here's the comparison using *Individual Protest*, first.

```{r, fig.width = 6, fig.height = 4}
# the transition value Hayes identified in the text
Hayes_value <- 5.065

# we need some new data
nd <- tibble(d1 = 0:1,
             d2 = 0) %>% 
  expand_grid(sexism = seq(from = 3.5, to = 6.5, length.out = 30)) %>% 
  mutate(row = 1:n())

# plug those data into `fitted()`
fitted(model10.2,
       newdata = nd,
       summary = F) %>% 
  # wrangle
  data.frame() %>% 
  set_names(pull(nd, row)) %>%
  mutate(draw = 1:n()) %>% 
  pivot_longer(-draw, values_to = "estimate") %>% 
  mutate(row = as.double(name)) %>%
  left_join(nd, by = "row") %>% 
  mutate(condition = if_else(d1 == 0, "No Protest", "Individual Protest")) %>% 
  select(estimate, sexism, draw, condition) %>% 
  pivot_wider(names_from = condition, values_from = estimate) %>% 
  mutate(difference = `Individual Protest` - `No Protest`)%>% 
  
  # plot!
  ggplot(aes(x = sexism, y = difference)) +
  stat_summary(geom = "ribbon",
               fun.min = function(i) quantile(i, probs = .025),
               fun.max = function(i) quantile(i, probs = .975),
               alpha = 1/3) +
  stat_summary(geom = "ribbon",
               fun.min = function(i) quantile(i, probs = .25),
               fun.max = function(i) quantile(i, probs = .75),
               alpha = 1/3) +
  stat_summary(geom = "line",
               fun = median) +
  scale_x_continuous(breaks = c(4, Hayes_value, 6),
                     labels = c("4", Hayes_value, "6")) +
  coord_cartesian(xlim = c(4, 6)) +
  labs(subtitle = expression("Our JN-technique plot for "*italic(Individual~Protest)*" compared with "*italic(No~Protest))) +
  theme_minimal()
```

Now we're ready to compare *No Protest* to *Collective Protest*. The main data difference is which values we assigned to the `d1` and `d2` columns in `nd`. For kicks, we should practice another way to get the median line and interval ribbons. The `stat_summary()` approach from above works great, but it's verbose. The `tidybayes::stat_lineribbon()` function will give us the same results with fewer lines of code.

```{r, fig.width = 6, fig.height = 4, message = F}
# the transition value Hayes identified in the text
Hayes_value <- 5.036

# new data
nd <- tibble(d1 = 0,
             d2 = 0:1) %>% 
  expand_grid(sexism = seq(from = 3.5, to = 6.5, length.out = 30)) %>% 
  mutate(row = 1:n())

# this part is the same as before
fitted(model10.2,
       newdata = nd,
       summary = F) %>% 
  data.frame() %>% 
  set_names(pull(nd, row)) %>%
  mutate(draw = 1:n()) %>% 
  pivot_longer(-draw, values_to = "estimate") %>% 
  mutate(row = as.double(name)) %>%
  left_join(nd, by = "row") %>% 
  # there are some mild differences in the two mutate() lines
  mutate(condition = if_else(d2 == 0, "No Protest", "Collective Protest")) %>% 
  select(estimate, sexism, draw, condition) %>% 
  pivot_wider(names_from = condition, values_from = estimate) %>% 
  mutate(difference = `Collective Protest` - `No Protest`) %>% 

  # plot!
  ggplot(aes(x = sexism, y = difference)) +
  # look how compact this is!
  stat_lineribbon(.width = c(0.5, 0.95),
                  alpha = 1/3, fill = "black") +
  scale_x_continuous(breaks = c(4, Hayes_value, 6),
                     labels = c("4", Hayes_value, "6")) +
  coord_cartesian(xlim = c(4, 6)) +
  labs(subtitle = expression("Our JN-technique plot for "*italic(Collective~Protest)*" compared with "*italic(No~Protest))) +
  theme_minimal()
```

And here we do it one last time between the two active protest conditions. For good measure, we will continue experimenting with different ways of plotting the results. This time well first summarize the posterior median and intervals with `tidybayes::median_qi()` before plotting. We'll then feed those results into our plot with the aid of `tidybayes::geom_lineribbon()` and a follow-up `scale_fill_manual()` line.

```{r, fig.width = 6, fig.height = 4, message = F}
nd <- tibble(d1 = 1:0,
             d2 = 0:1) %>%
  expand_grid(sexism = seq(from = 3.5, to = 6.5, length.out = 30)) %>% 
  mutate(row = 1:n())

fitted(model10.2,
       newdata = nd,
       summary = F) %>% 
  data.frame() %>% 
  set_names(pull(nd, row)) %>%
  mutate(draw = 1:n()) %>% 
  pivot_longer(-draw, values_to = "estimate") %>% 
  mutate(row = as.double(name)) %>%
  left_join(nd, by = "row") %>% 
  # there are some mild differences in the two mutate() lines
  mutate(condition = if_else(d1 == 0, "Individual Protest", "Collective Protest")) %>% 
  select(estimate, sexism, draw, condition) %>% 
  pivot_wider(names_from = condition, values_from = estimate) %>% 
  mutate(difference = `Collective Protest` - `Individual Protest`) %>% 
  # group and summarise, here
  group_by(sexism) %>%
  median_qi(difference, .width = c(.5, .95))  %>% 
  
  # plot!
  ggplot(aes(x = sexism, y = difference, ymin = .lower, ymax = .upper)) +
  # look how simple these two lines are
  geom_lineribbon(show.legend = F) +
  scale_fill_manual(values = c("grey75", "grey50")) +
  coord_cartesian(xlim = c(4, 6)) +
  labs(subtitle = expression("Our JN-technique plot for "*italic(Collective~Protest)*" compared with "*italic(Individual~Protest))) +
  theme_minimal()
```

Little difference between those conditions.

## When the moderator is multicategorical

From a substantive standpoint the combination of

* a multicategorical variable $X$ and a dichotomous or continuous moderator $W$ versus
* a dichotomous or continuous variable $X$ and a multicategorical moderator $W$

might seem different. From a modeling perspective, the difference is trivial. As Hayes pointed out, "when we claim from a statistical test of moderation that $X$'s effect is moderated by $W$, then it is also true that $W$'s effect is moderated by $X$. This is the symmetry property of interactions" (p. 381). This symmetry holds when we're not using the hypothesis-testing framework, too.

### An example.

Just as a refresher, here's the `print()` output for `model2`.

```{r}
print(model10.2, digits = 3)
```

The Bayesian $R^2$:

```{r}
bayes_R2(model10.2) %>% round(digits = 3)
```

And the $R^2$ difference between this and the model excluding the interaction terms:

```{r}
tibble(`Model 10.1` = bayes_R2(model10.1, summary = F)[, 1],
       `Model 10.2` = bayes_R2(model10.2, summary = F)[, 1])  %>% 
  transmute(difference = `Model 10.2` - `Model 10.1`) %>% 
  mean_qi(difference) %>% 
  mutate_if(is.double, round, digits = 3)
```

Much like in the text, our Figure 10.7 is just a little different from what we did with Figure 10.3.

```{r, fig.width = 10, fig.height = 3.5}
# this will help us with the `geom_text()` annotation
slopes <- tibble(
  slope = c(fixef(model10.2)["sexism", "Estimate"] + fixef(model10.2)["d1:sexism", "Estimate"],
            fixef(model10.2)["sexism", "Estimate"] + fixef(model10.2)["d2:sexism", "Estimate"],
            fixef(model10.2)["sexism", "Estimate"]),
  x = c(4.8, 4.6, 5),
  y = c(6.37, 6.25, 4.5),
  condition = factor(c("Individual Protest", "Collective Protest", "No Protest"),
                     levels = c("No Protest", "Individual Protest", "Collective Protest"))) %>% 
  mutate(label = str_c("This slope is about ", slope %>% round(digits = 3)))

# now we plot
f %>% 
  ggplot(aes(x = sexism)) +
  geom_ribbon(aes(ymin = Q2.5, ymax = Q97.5),
              alpha = 1/3) +
  geom_ribbon(aes(ymin = Q25, ymax = Q75),
              alpha = 1/3) +
  geom_line(aes(y = Estimate)) +
  geom_text(data = slopes,
            aes(x = x, y = y, label = label)) +
  coord_cartesian(xlim = c(4, 6)) +
  labs(x = expression(paste("Perceived Pervasiveness of Sex Discrimination in Society (", italic(X), ")")),
       y = "Evaluation of the Attorney") +
  facet_wrap(~ condition) +
  theme_minimal()
```

### Probing the interaction and interpreting the regression coefficients.

We computed the posterior means for the slopes when prepping for the figure, above. Here's how we might get more complete posterior summaries. Much like in the text, our Figure 10.7 is just a little different from what we did with Figure 10.3.

```{r, warning = F}
draws <- as_draws_df(model10.2) %>% 
  transmute(`No Protest`         = b_sexism + `b_d1:sexism` * 0 + `b_d2:sexism` * 0,
            `Individual Protest` = b_sexism + `b_d1:sexism` * 1 + `b_d2:sexism` * 0,
            `Collective Protest` = b_sexism + `b_d1:sexism` * 0 + `b_d2:sexism` * 1)

draws %>% 
  pivot_longer(everything()) %>% 
  mutate(name = factor(name, levels = c("No Protest", "Individual Protest", "Collective Protest"))) %>% 
  group_by(name) %>% 
  mean_qi(value) %>% 
  mutate_if(is.double, round, digits = 3)
```

Here are the differences among the three protest groups.

```{r, warning = F}
draws %>% 
  transmute(`Individual Protest - No Protest`         = `Individual Protest` - `No Protest`,
            `Collective Protest - No Protest`         = `Collective Protest` - `No Protest`,
            `Individual Protest - Collective Protest` = `Individual Protest` - `Collective Protest`) %>% 
  pivot_longer(everything()) %>% 
  # again, not necessary, but useful for reordering the summaries
  mutate(name = factor(name, levels = c("Individual Protest - No Protest", "Collective Protest - No Protest", "Individual Protest - Collective Protest"))) %>% 
  group_by(name) %>% 
  mean_qi(value) %>% 
  mutate_if(is.double, round, digits = 3)
```

## Session info {-}

```{r}
sessionInfo()
```

```{r, echo = F, message = F, warning = F, results = "hide"}
pacman::p_unload(pacman::p_loaded(), character.only = TRUE)
```