Load packages …

## fitting
library(lme4)
library(glmmTMB)
library(lattice)
## diagnostics etc.
library(broom.mixed)
library(DHARMa)
library(performance)
library(sjPlot)
library(dotwhisker)
##
library(ggplot2); theme_set(theme_bw())
library(cowplot)

Examples

Formulae

Formula                          Meaning
y ~ x                            No random effects
y ~ x + (1|g)                    The intercept is a random effect
y ~ x + (1|site/block)           Nested random effects (block within site)
y ~ x + (1|site) + (1|year)      Crossed random effects
y ~ x + (1|site:block)           Interaction (only block within site)
y ~ x + site + (1|site:block)    Fixed site, random block within site
y ~ x + (x|g)                    Intercept and slope are random effects
y ~ (x|g)                        Zero slope on average (weird!)
y ~ x + (1|g) + (0+x|g)          Independent slope and intercept
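The shorthands in this table can be checked directly: lme4::findbars() extracts the random-effects terms from a formula after expanding the / and || notations (a quick sketch, using placeholder variable names):

```r
library(lme4)

## nesting shorthand: (1|site/block) expands to (1|site) + (1|site:block)
findbars(y ~ x + (1 | site/block))

## double-bar shorthand: (x||g) expands to (1|g) + (0 + x|g), i.e. no correlation
findbars(y ~ x + (x || g))
```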

Look at the data

  • Spaghetti plot; alternatively +facet_wrap(~Subject)
print((q0 <- ggplot(sleepstudy, aes(Days, Reaction, colour = Subject))
    + geom_point()) + geom_line())

(Alternatives: connect points but don’t colour them; suppress the legend; separate individuals into separate facets … for scatterplots, try ggalt::geom_encircle().)
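One such alternative, as a sketch: connect each subject’s points without colouring, and split individuals into facets:

```r
library(ggplot2)
library(lme4)  ## for the sleepstudy data

q1 <- ggplot(sleepstudy, aes(Days, Reaction, group = Subject)) +
    geom_point() +
    geom_line(colour = "grey50") +          ## connect, don't colour
    facet_wrap(~Subject) +                  ## one panel per individual
    theme(legend.position = "none")         ## suppress legend
print(q1)
```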

Basic model fits

library(lme4)
## per-group fit (fixed)
lm1 <- lmList(Reaction~Days|Subject, data=sleepstudy)
## fixed effect
lm1B <- lm(Reaction~Days*Subject, data=sleepstudy)
## random intercept
lm2 <- lmer(Reaction~Days+(1|Subject), data=sleepstudy)
## random slopes
lm3 <- lmer(Reaction~Days+(1+Days|Subject), data=sleepstudy) 

Use random slopes when feasible, i.e. when individuals are measured on more than one day (Schielzeth and Forstmeier 2009)

Compute predictions

pp <- expand.grid(Days=0:9, Subject=levels(sleepstudy$Subject))
pp <- (pp                  
  |> cbind(pred_int = predict(lm2, newdata=pp))
  |> cbind(pred_int_slope = predict(lm3, newdata=pp))
  |> cbind(pred_fixed = predict(lm1B, newdata=pp))
  |> tidyr::pivot_longer(starts_with("pred"))
)
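predict() on a merMod fit also takes a re.form argument; re.form = NA gives population-level predictions (random effects set to zero), which can be worth plotting alongside the group-level curves. A sketch:

```r
library(lme4)
lm3 <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
pp <- expand.grid(Days = 0:9, Subject = levels(sleepstudy$Subject))
## group-level (conditional) predictions
pp$pred_group <- predict(lm3, newdata = pp)
## population-level predictions: all random effects set to zero
pp$pred_pop <- predict(lm3, newdata = pp, re.form = NA)
```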

plot predictions

theme_set(theme_classic()+theme(legend.position="none"))
q0+geom_line(data=pp, aes(y=value)) + facet_wrap(~name)

diagnostics

  • DHARMa::simulateResiduals()
  • performance::check_model()
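A minimal sketch of both, applied to the random-slope fit:

```r
library(lme4)
library(DHARMa)
library(performance)
lm3 <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
## simulation-based scaled residuals (approximately uniform if the model is OK)
sr <- simulateResiduals(lm3)
plot(sr)
## panel of standard diagnostic plots
check_model(lm3)
```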

coefficient plots

Use broom.mixed (in conjunction with dotwhisker::dwplot)

  • use tidy directly for more control (?tidy.merMod)
  • may want to specify effects="fixed", conf.int=TRUE, conf.method="profile" (for CIs on random effects parameters)
tt <- tidy(lm3, conf.int=TRUE, conf.method="profile")

dwplot(tt)

random effects plots

lattice::dotplot(ranef(lm3))
## $Subject

Or sjPlot::plot_model()

post-hoc stuff

emmeans, effects, car::Anova(), drop1, etc. all work …
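For example (a sketch; since Days is continuous here, emmeans::emtrends() is the relevant flavour rather than emmeans() itself):

```r
library(lme4)
library(car)
library(emmeans)
lm3 <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
## Wald test of the fixed effect
Anova(lm3)
## likelihood-based term deletion
drop1(lm3, test = "Chisq")
## estimated slope of Reaction on Days, with confidence interval
emtrends(lm3, ~1, var = "Days")
```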

model choice considerations

  • what is the maximal model?
  • Which effects vary within which groups?
  • If effects don’t vary within groups, then we can’t estimate among-group variation in the effect
    • convenient
    • maybe less powerful (among-group variation is lumped into residual variation)
  • e.g. female rats exposed to different doses of radiation, multiple pups per mother, multiple measurements per pup (labeled by time). Maximal model … ?
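One answer, as a hedged sketch (variable names dose, time, mother, pup are hypothetical): dose varies only among mothers, so it cannot get a random slope at the mother level, while time varies within pups:

```r
## hypothetical variable names for the rat example
f_max <- y ~ dose * time + (time | mother/pup)
```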

Maximal model often won’t actually work

e.g.

  • Culcita (coral-reef) example: randomized-block design, so each treatment (none/crabs/shrimp/both) is repeated in every block; thus (treat|block) is maximal
  • CBPP data: each herd is measured in every period, so in principle we could use (period|herd), not just (1|herd)

Random-slopes models

  • what does (x|g) (or, equivalently, (1+x|g)) really do?
  • both intercept (baseline) and slope vary across groups
  • estimates bivariate zero-centered distribution:

\[ (\textrm{intercept}, \textrm{slope}) \sim \textrm{MVN}\left(\boldsymbol 0, \left[ \begin{array}{cc} \sigma^2_{\textrm{int}} & \sigma_{\textrm{int},\textrm{slope}} \\ \sigma_{\textrm{int},\textrm{slope}} & \sigma^2_{\textrm{slope}} \end{array} \right] \right) \]

equatiomatic package

equatiomatic::extract_eq(lm3)

\[ \begin{aligned} \operatorname{Reaction}_{i} &\sim N \left(\alpha_{j[i]} + \beta_{1j[i]}(\operatorname{Days}), \sigma^2 \right) \\ \left( \begin{array}{c} \begin{aligned} &\alpha_{j} \\ &\beta_{1j} \end{aligned} \end{array} \right) &\sim N \left( \left( \begin{array}{c} \begin{aligned} &\mu_{\alpha_{j}} \\ &\mu_{\beta_{1j}} \end{aligned} \end{array} \right) , \left( \begin{array}{cc} \sigma^2_{\alpha_{j}} & \rho_{\alpha_{j}\beta_{1j}} \\ \rho_{\beta_{1j}\alpha_{j}} & \sigma^2_{\beta_{1j}} \end{array} \right) \right) \text{, for Subject j = 1,} \dots \text{,J} \end{aligned} \]

What is a practical model?

  • Avoid singular fits
  • singular = some variance estimated as zero, and/or some correlation estimated as ±1
  • More subtle for larger models: use isSingular()
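The classic small example: the Dyestuff2 data (shipped with lme4) give a between-batch variance estimated at exactly zero, i.e. a singular fit:

```r
library(lme4)
fm <- lmer(Yield ~ 1 + (1 | Batch), data = Dyestuff2)
isSingular(fm)   ## TRUE: the batch variance is estimated as zero
VarCorr(fm)
```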

How to simplify?

  • remove random effects
  • simplify random effects (Bates et al. 2015; Matuschek et al. 2017; Barr et al. 2013)
    • drop slopes
    • force correlations to zero (use (1+x||g) to fit intercept and slope independently)
    • structured random effects (diagonal, compound symmetric …)
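A sketch of the zero-correlation simplification on the sleepstudy fit: the double-bar model estimates two variances but no covariance (vs. three random-effects parameters in the full random-slope model), and the two fits can be compared directly:

```r
library(lme4)
## full model: intercept and slope variances plus their correlation
m_full <- lmer(Reaction ~ Days + (1 + Days | Subject), data = sleepstudy)
## correlation forced to zero
m_zc <- lmer(Reaction ~ Days + (1 + Days || Subject), data = sleepstudy)
## does the correlation earn its keep?
anova(m_zc, m_full)
```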

Convergence failures

  • convergence failures are common
  • what do they really mean? how to fix them? when can they be ignored?
  • based on approximate tests that the gradient is zero and that the curvature (Hessian) is positive definite
  • scale and center predictors; simplify model
  • use allFit() (see ?allFit) to check whether different optimizers give sufficiently similar answers
    • $fixef, etc.: are answers sufficiently similar?
    • $llik: how similar is goodness-of-fit?

allFit

m1 <- lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
aa <- allFit(m1)
## bobyqa : [OK]
## Nelder_Mead : [OK]
## nlminbwrap : [OK]
## nmkbw : [OK]
## optimx.L-BFGS-B : [OK]
## nloptwrap.NLOPT_LN_NELDERMEAD : [OK]
## nloptwrap.NLOPT_LN_BOBYQA : [OK]
ss <- summary(aa)
names(ss)
## [1] "which.OK" "msgs"     "fixef"    "llik"     "sdcor"    "theta"    "times"   
## [8] "feval"
ss$fixef
##                               (Intercept)     Days
## bobyqa                           251.4051 10.46729
## Nelder_Mead                      251.4051 10.46729
## nlminbwrap                       251.4051 10.46729
## nmkbw                            251.4051 10.46729
## optimx.L-BFGS-B                  251.4051 10.46729
## nloptwrap.NLOPT_LN_NELDERMEAD    251.4051 10.46729
## nloptwrap.NLOPT_LN_BOBYQA        251.4051 10.46729

allFit plot

source("https://raw.githubusercontent.com/lme4/lme4/refs/heads/master/misc/plot_allFit.R")
par(las=1); tpar(fmar = c(1,12,1,1))
plot(aa, flip = TRUE)

GLMMs: integration techniques

  • PQL, Laplace, adaptive Gauss-Hermite quadrature (AGHQ)
  • fast/flexible/inaccurate \(\to\) slow/inflexible/accurate
  • amount of data per group is what matters
    • number of points, info per point (binary vs count)
  • set nAGQ= to a larger value (1=Laplace; 25 should be more than enough)
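With the cbpp data (shipped with lme4), which have a single scalar random effect — the only case where glmer allows nAGQ > 1 — switching from Laplace to AGQ is one argument (a sketch):

```r
library(lme4)
## Laplace approximation (the default, nAGQ = 1)
g1 <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
            data = cbpp, family = binomial)
## adaptive Gauss-Hermite quadrature with 25 points
g25 <- update(g1, nAGQ = 25)
## compare fixed-effect estimates under the two approximations
cbind(Laplace = fixef(g1), AGQ25 = fixef(g25))
```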

GLMMs: more details

  • glmer pretty reliable
  • glmmTMB more flexible, faster on extended models (NB etc.)

Other resources

References

Barr, Dale J., Roger Levy, Christoph Scheepers, and Harry J. Tily. 2013. “Random Effects Structure for Confirmatory Hypothesis Testing: Keep It Maximal.” Journal of Memory and Language 68 (3): 255–78. https://doi.org/10.1016/j.jml.2012.11.001.

Bates, Douglas, Reinhold Kliegl, Shravan Vasishth, and Harald Baayen. 2015. “Parsimonious Mixed Models.” arXiv:1506.04967 [Stat], June. http://arxiv.org/abs/1506.04967.

Matuschek, Hannes, Reinhold Kliegl, Shravan Vasishth, Harald Baayen, and Douglas Bates. 2017. “Balancing Type I Error and Power in Linear Mixed Models.” Journal of Memory and Language 94: 305–15. https://doi.org/10.1016/j.jml.2017.01.001.

Schielzeth, Holger, and Wolfgang Forstmeier. 2009. “Conclusions Beyond Support: Overconfident Estimates in Mixed Models.” Behavioral Ecology 20 (2): 416–20. https://doi.org/10.1093/beheco/arn145.