Multivariate and multilevel models

brms workshop >> 2020-10-22

Introduction

Multivariate models

Previously, we looked at models with one outcome

\[ y_n \sim N(\mu, \sigma^2) \]

However, models can have multiple outcomes
- Residual correlations
- Parameters can be shared between models
Some outcomes may be hypothesized to be predictors of other outcomes
- e.g. path analysis, mediation

Multilevel models

Previously, we treated the regression coefficients as fixed…
Data is clustered by some factor(s) (e.g. subject, country, …)
Parameters can vary between clusters
Cluster-specific parameters share a prior distribution
- Partial pooling of information across clusters
Prior distribution’s parameters indicate averages and (co)variances of cluster-specific parameters

This session

Multilevel mediation models can be seen as multilevel multivariate models

Mediation

What is mediation?

Mediation is a hypothesized causal model, whereby effect of an IV to a DV is transmitted through an intermediary variable M

Assessing mediation

Experimental approach

Experiment 1: manipulate X and measure M
Experiment 2: manipulate M and measure Y
Establishing a causal chain: Why experiments are often more effective than mediational analyses in examining psychological processes (Spencer, Zanna, and Fong 2005)

Assessing mediation

Statistical modeling approach

Disclaimer
Experiment: manipulate X, measure M and Y
Regress M on X; Y on X and M
Assume that
- Y does not affect M
- No 3rd variable on M to Y relationship
- M is measured without error
- Y and M residuals are not correlated

Assessing mediation

Statistical modeling approach

\begin{align*} Y_i &\sim N(d_Y + c’X_i + bM_i, \sigma^{2}_Y) &\mbox{[Y model]} \\ M_i &\sim N(d_M + aX_i, \sigma^{2}_M) &\mbox{[M model]} \end{align*}

\begin{align*} me &= a \times b &\mbox{[mediated effect]} \\ c &= c’ + me &\mbox{[total effect]} \end{align*}

Hallucinogens, Transformative experiences, and mood

Past research suggests that use of psychedelic substances such as LSD or psilocybin may have positive effects on mood and feelings of social connectedness. These psychological effects are thought to be highly sensitive to context, but robust and direct evidence for them in a naturalistic setting is scarce. In a series of field studies involving over 1,200 participants across six multiday mass gatherings in the United States and the United Kingdom, we investigated the effects of psychedelic substance use on transformative experience, social connectedness, and positive mood. […] We found that psychedelic substance use was significantly associated with positive mood—an effect sequentially mediated byself-reported transformative experience and increased social connectedness. […] Overall, this research provides robustevidence for positive affective and social consequences of psyche-delic substance use in naturalistic settings.

Transformative experience and social connectedness mediate the mood-enhancing effects of psychedelic use in naturalistic settings (Forstmann et al. 2020) https://www.pnas.org/content/117/5/2338

Hypothesized causal model

For this tutorial, we simplify the authors’ model

Hallucinogen data

head(dat)

## # A tibble: 6 x 5
##    Mood    TE Gender Survey H24  
##   <dbl> <dbl> <fct>  <fct>  <fct>
## 1     5     3 1      Event3 0    
## 2     5     2 2      Event5 0    
## 3     6     7 1      Event4 0    
## 4     6     7 2      Event2 0    
## 5     5     1 1      Event3 0    
## 6     5     7 2      Event4 0

Model estimation

We first estimate a single-level ordinary mediation model

path_m <- bf(TE ~ H24)
path_y <- bf(Mood ~ H24 + TE)
get_prior(path_m + path_y + set_rescor(FALSE), data = dat)

##                 prior     class coef group resp dpar nlpar bound       source
##                (flat)         b                                       default
##                (flat) Intercept                                       default
##                (flat)         b            Mood                  (vectorized)
##                (flat)         b H241       Mood                  (vectorized)
##                (flat)         b   TE       Mood                  (vectorized)
##  student_t(3, 5, 2.5) Intercept            Mood                       default
##  student_t(3, 0, 2.5)     sigma            Mood                       default
##                (flat)         b              TE                  (vectorized)
##                (flat)         b H241         TE                  (vectorized)
##    student_t(3, 4, 3) Intercept              TE                       default
##    student_t(3, 0, 3)     sigma              TE                       default

Model estimation

fit0 <- brm(
  path_m + path_y + set_rescor(FALSE),
  data = dat,
  chains = 4,
  cores = 4,
  control = list(adapt_delta = .95),
  file = here("models/mediation-0")
)

Model summary

summary(fit0)

##  Family: MV(gaussian, gaussian) 
##   Links: mu = identity; sigma = identity
##          mu = identity; sigma = identity 
## Formula: TE ~ H24 
##          Mood ~ H24 + TE 
##    Data: dat (Number of observations: 1195) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Population-Level Effects: 
##                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## TE_Intercept       3.87      0.06     3.74     3.99 1.00     5228     3303
## Mood_Intercept     4.71      0.05     4.62     4.81 1.00     5225     2930
## TE_H241            1.35      0.16     1.05     1.66 1.00     6432     3187
## Mood_H241          0.17      0.06     0.05     0.29 1.00     6274     3028
## Mood_TE            0.08      0.01     0.06     0.10 1.00     4979     3326
## 
## Family Specific Parameters: 
##            Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma_TE       2.01      0.04     1.93     2.09 1.00     5456     2483
## sigma_Mood     0.77      0.02     0.74     0.80 1.00     6364     2944
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Model summary

Where’s my mediation?

\begin{align*} me &= a \times b &\mbox{[mediated effect]} \\ c &= c’ + me &\mbox{[total effect]} \end{align*}

h <- c(
  a = "TE_H241 = 0",
  b = "Mood_TE = 0",
  cp = "Mood_H241 = 0",
  me = "TE_H241 * Mood_TE = 0",
  c = "TE_H241 * Mood_TE + Mood_H241 = 0"
  )
hypothesis(fit0, h)

## Hypothesis Tests for class b:
##   Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
## 1          a     1.35      0.16     1.05     1.66         NA        NA    *
## 2          b     0.08      0.01     0.06     0.10         NA        NA    *
## 3         cp     0.17      0.06     0.05     0.29         NA        NA    *
## 4         me     0.10      0.02     0.07     0.14         NA        NA    *
## 5          c     0.28      0.06     0.16     0.39         NA        NA    *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.

Model summary

post <- posterior_samples(fit0)
post <- post %>% 
  transmute(
    a = b_TE_H241,
    b = b_Mood_TE,
    cp = b_Mood_H241,
    me = a * b,
    c = cp + me,
    pme = me / c
  )

##       Estimate  Est.Error       Q2.5     Q97.5
## a   1.35243007 0.15639185 1.04652372 1.6621135
## b   0.07698683 0.01093411 0.05618260 0.0986256
## cp  0.17281850 0.06225339 0.05468629 0.2942805
## me  0.10420115 0.01957125 0.06917373 0.1437277
## c   0.27701965 0.06126432 0.15526924 0.3940304
## pme 0.39593471 0.12206173 0.22232368 0.6808670

Model summary

mcmc_intervals(post)

Figure

conditional_effects(fit0)

Multilevel Mediation

Between- vs. within-cluster causal models

Cluster: subject, school, festival, …
Mediation models often address between-subject processes
- Individuals measured once, causal process between individuals
We are interested in within-person causal processes
- Individuals measured repeatedly, causal process within individuals
Multilevel model
- Average person’s within-person causal process
- Causal effects’ heterogeneity
- Hierarchical Bayes estimates for individuals in current sample
Generally, applicable to any clustering (countries, schools, …) but we often talk about subjects
In the current example, may be heterogeneity between festivals?

Multilevel mediation

Cluster-specific parameters (e.g. \(a_1\))
Parameters’ prior distribution is estimated from data
\(\sigma_{a_jb_j}\) can indicate an omitted moderator (Tofighi, West, and MacKinnon 2013)

Multilevel mediation

\begin{align*} Y_{ij} &\sim N(d_{Yj} + {c’_j}X_{ij} + b_{j}M_{ij}, \sigma^{2}_Y) &\mbox{[Y model]} \\ M_{ij} &\sim N(d_{Mj} + {a_j}X_{ij}, \sigma^{2}_M) &\mbox{[M model]} \end{align*}

\[ \begin{pmatrix} d_{Mj} \\ d_{Yj} \\ a_j \\ b_j \\ c'_j \end{pmatrix} \sim N \begin{bmatrix} \begin{pmatrix} d_M \\ d_Y \\ a \\ b \\ c' \end{pmatrix}, \begin{pmatrix} \sigma^2_{d_{Mj}} & & & & \\ \sigma_{d_{Mj}d_{Yj}} & \sigma^2_{d_{Y_j}} & & & \\ \sigma_{d_{Mj}a_j} & \sigma_{d_{Yj}a_j} & \sigma^2_{a_j} & & \\ \sigma_{d_{Mj}b_j} & \sigma_{d_{Yj}b_j} & \sigma_{{a_j}{b_j}} & \sigma^2_{b_j} & \\ \sigma_{d_{Mj}c'_j} & \sigma_{d_{Yj}c'_j} & \sigma_{{a_j}{c'_j}} & \sigma_{{b_j}{c'_j}} & \sigma^2_{c'_j} \end{pmatrix} \end{bmatrix} \]

\begin{align*} me &= a \times b + \sigma_{a_{j}b_{j}} &\mbox{[mediated effect]} \\ c &= c’ + me &\mbox{[total effect]} \end{align*}

Multilevel mediation

Practical implementation

We developed software for Bayesian estimation of multilevel mediation models (Vuorre 2017; Vuorre and Bolger 2017)

bmlm: Bayesian Multi-Level Mediation

R package
Bayesian inference
Data preprocessing, model estimation, summarizing, and visualization
Continuous and binary Y
https://mvuorre.github.io/bmlm/

install.packages("bmlm")

Multilevel mediation

Practical implementation

I wrote the bmlm package before brms had multivariate capabilities
We can go through the paper to learn more about bmlm
Here, we will focus on a more general solution

brms

Bayesian Regression Models using Stan (Bürkner 2017, 2018)

R package
Bayesian inference
Extremely flexible
A bit more post-processing required with mediation vs. bmlm

Data

It is possible that there is heterogeneity between events, and thus we model parameters as varying between events.

First, we remove between-event variability from mediator:

dat <- isolate(dat, by = "Survey", value = "TE")

## # A tibble: 6 x 6
##    Mood    TE Gender Survey H24    TE_cw
##   <dbl> <dbl> <fct>  <fct>  <fct>  <dbl>
## 1     5     3 1      Event3 0     -1.76 
## 2     5     2 2      Event5 0     -0.554
## 3     6     7 1      Event4 0      3.63 
## 4     6     7 2      Event2 0      2.54 
## 5     5     1 1      Event3 0     -3.76 
## 6     5     7 2      Event4 0      3.63

Model estimation

Then, extend the model to a multilevel model with Surveys

path_m <- bf(
  TE ~ H24 + 
    (H24 |p| Survey)
)
path_y <- bf(
  Mood ~ H24 + TE_cw + 
    (H24 + TE_cw |p| Survey)
)
fit1 <- brm(
  path_m + path_y + set_rescor(FALSE),
  data = dat,
  chains = 4,
  cores = 4,
  control = list(adapt_delta = .99),
  file = here("models/mediation-1")
)

|p| indicates shared covariance matrix, p is arbitrary

Model summary

summary(fit1)

##  Family: MV(gaussian, gaussian) 
##   Links: mu = identity; sigma = identity
##          mu = identity; sigma = identity 
## Formula: TE ~ H24 + (H24 | p | Survey) 
##          Mood ~ H24 + TE_cw + (H24 + TE_cw | p | Survey) 
##    Data: dat (Number of observations: 1195) 
## Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
##          total post-warmup samples = 4000
## 
## Group-Level Effects: 
## ~Survey (Number of levels: 6) 
##                                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(TE_Intercept)                     0.94      0.39     0.48     1.98 1.00     1433     1944
## sd(TE_H241)                          0.48      0.44     0.01     1.64 1.00     1655     1821
## sd(Mood_Intercept)                   0.27      0.14     0.12     0.60 1.00     1320     1229
## sd(Mood_H241)                        0.20      0.20     0.01     0.73 1.00     1696     2118
## sd(Mood_TE_cw)                       0.03      0.03     0.00     0.12 1.01     1255     1984
## cor(TE_Intercept,TE_H241)           -0.22      0.39    -0.86     0.59 1.00     3369     3000
## cor(TE_Intercept,Mood_Intercept)     0.42      0.32    -0.33     0.89 1.00     2262     2524
## cor(TE_H241,Mood_Intercept)         -0.18      0.41    -0.84     0.65 1.00     1973     2983
## cor(TE_Intercept,Mood_H241)          0.08      0.41    -0.72     0.80 1.00     5142     2804
## cor(TE_H241,Mood_H241)              -0.03      0.41    -0.77     0.74 1.00     3309     3114
## cor(Mood_Intercept,Mood_H241)       -0.03      0.41    -0.77     0.75 1.00     4150     3338
## cor(TE_Intercept,Mood_TE_cw)        -0.18      0.37    -0.84     0.58 1.00     4041     2920
## cor(TE_H241,Mood_TE_cw)              0.07      0.40    -0.72     0.81 1.00     3089     2774
## cor(Mood_Intercept,Mood_TE_cw)      -0.13      0.39    -0.81     0.63 1.00     3900     2836
## cor(Mood_H241,Mood_TE_cw)            0.01      0.41    -0.75     0.76 1.00     2824     3454
## 
## Population-Level Effects: 
##                Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## TE_Intercept       3.71      0.41     2.90     4.55 1.01      757     1493
## Mood_Intercept     5.02      0.13     4.77     5.27 1.01     1199     1118
## TE_H241            1.07      0.34     0.46     1.86 1.00     1962     2239
## Mood_H241          0.11      0.14    -0.18     0.38 1.00     1876     1824
## Mood_TE_cw         0.06      0.02     0.02     0.11 1.00     2249     1777
## 
## Family Specific Parameters: 
##            Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma_TE       1.90      0.04     1.83     1.98 1.00     5943     2972
## sigma_Mood     0.76      0.02     0.73     0.79 1.00     5749     2298
## 
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).

Model summary

post <- posterior_samples(fit1)
covab <- VarCorr(fit1, summary = FALSE)$Survey$cov %>% 
  .[,"TE_H241","Mood_TE_cw"]
post <- post %>% 
  transmute(
    a = b_TE_H241,
    b = b_Mood_TE_cw,
    cp = b_Mood_H241,
    me = a * b + covab,
    c = cp + me,
    pme = me / c
  )

##       Estimate  Est.Error        Q2.5     Q97.5
## a   1.07311535 0.34013759  0.46026372 1.8629118
## b   0.06325264 0.02213318  0.01892453 0.1089930
## cp  0.11084096 0.14444963 -0.17772653 0.3845805
## me  0.06958872 0.03812499  0.01025625 0.1617054
## c   0.18042968 0.14888641 -0.10801953 0.4718210
## pme 0.56056236 8.84003895 -1.55520637 2.5623303

Model summary

mcmc_intervals(post)

Model summary

conditional_effects(fit1)

Heterogeneity

Posterior predictive check

pp_check(
  fit1, resp = "TE", nsamples = 2,
  type = "freqpoly_grouped", group = "Survey" 
  )

Posterior predictive check

pp_check(
  fit1, resp = "Mood", nsamples = 2,
  type = "freqpoly_grouped", group = "Survey" 
  )

Posterior predictive check

Any ideas?

Model comparison

ll <- loo(fit0, fit1)
ll

## Output of model 'fit0':
## 
## Computed from 4000 by 1195 log-likelihood matrix
## 
##          Estimate   SE
## elpd_loo  -3916.7 44.4
## p_loo         7.7  0.8
## looic      7833.3 88.7
## ------
## Monte Carlo SE of elpd_loo is 0.0.
## 
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.
## 
## Output of model 'fit1':
## 
## Computed from 4000 by 1195 log-likelihood matrix
## 
##          Estimate   SE
## elpd_loo  -3842.5 43.0
## p_loo        22.9  1.5
## looic      7685.0 85.9
## ------
## Monte Carlo SE of elpd_loo is 0.1.
## 
## Pareto k diagnostic values:
##                          Count Pct.    Min. n_eff
## (-Inf, 0.5]   (good)     1194  99.9%   1198      
##  (0.5, 0.7]   (ok)          1   0.1%   2390      
##    (0.7, 1]   (bad)         0   0.0%   <NA>      
##    (1, Inf)   (very bad)    0   0.0%   <NA>      
## 
## All Pareto k estimates are ok (k < 0.7).
## See help('pareto-k-diagnostic') for details.
## 
## Model comparisons:
##      elpd_diff se_diff
## fit1   0.0       0.0  
## fit0 -74.2      12.5

References

Bürkner, Paul-Christian. 2017. “Brms: An R Package for Bayesian Multilevel Models Using Stan.” Journal of Statistical Software 80 (1): 128. https://doi.org/10.18637/jss.v080.i01.

———. 2018. “Advanced Bayesian Multilevel Modeling with the R Package Brms.” The R Journal 10 (1): 395–411. https://journal.r-project.org/archive/2018/RJ-2018-017/index.html.

Forstmann, Matthias, Daniel A. Yudkin, Annayah M. B. Prosser, S. Megan Heller, and Molly J. Crockett. 2020. “Transformative Experience and Social Connectedness Mediate the Mood-Enhancing Effects of Psychedelic Use in Naturalistic Settings.” Proceedings of the National Academy of Sciences 117 (5): 2338–46. https://doi.org/10.1073/pnas.1918477117.

Spencer, Steven J., Mark P. Zanna, and Geoffrey T. Fong. 2005. “Establishing a Causal Chain: Why Experiments Are Often More Effective Than Mediational Analyses in Examining Psychological Processes.” Journal of Personality and Social Psychology 89 (6): 845–51. https://doi.org/10.1037/0022-3514.89.6.845.

Tofighi, Davood, Stephen G. West, and David P. MacKinnon. 2013. “Multilevel Mediation Analysis: The Effects of Omitted Variables in the 111 Model.” British Journal of Mathematical and Statistical Psychology 66 (2): 290–307. https://doi.org/10.1111/j.2044-8317.2012.02051.x.

Vuorre, Matti. 2017. Bmlm: Bayesian Multilevel Mediation. https://cran.r-project.org/package=bmlm.

Vuorre, Matti, and Niall Bolger. 2017. “Within-Subject Mediation Analysis for Experimental Data in Cognitive Psychology and Neuroscience.” Behavior Research Methods, December, 1–19. https://doi.org/10.3758/s13428-017-0980-9.