The general linear model (`lm` in R), so called to distinguish it from the generalized linear model (`glm`).

Assumptions:
- Independence
- Response (dependent) variables are linear functions of predictor (derived) variables, in turn based on input (observed, “independent”) variables
- One or more predictor variables per input variable
- Estimate a parameter for the effect of each predictor variable
- Errors or residuals have constant variance and are Normally distributed
- In other words, the difference between our model predictions and our observations is Normal
- not assuming the marginal distribution is Normal
- (Predictor variables measured without error)
- \(y = a + b x\)
- \(y = a + b x + c x^2 + d x^3\)
- \(y = \frac{a}{b+x}\)
- \(y = a + b \log(x)\)
- \(y = \log(a + b x)\)
- \(y = a x^b\)
- \(\log(y) = a + b \log(x)\)
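Several of these are linear models even though they are curved in \(x\): “linear” means linear in the *parameters* \(a, b, \ldots\). A minimal sketch (simulated data, names invented) fitting the cubic above as a linear model:

## simulated data: a cubic in x is still a *linear* model,
## because it is linear in the coefficients
set.seed(101)
x <- runif(100)
y <- 1 + 2*x - 3*x^2 + 0.5*x^3 + rnorm(100, sd = 0.1)
m_poly <- lm(y ~ x + I(x^2) + I(x^3))  ## or lm(y ~ poly(x, 3, raw = TRUE))
coef(m_poly)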
- Continuous predictor: straight line, one parameter
- \(Y = a+bX\): \(Y\) is the response, \(X\) is the input variable (\(b\) is the slope: the expected change in \(Y\) per unit change in \(X\))
- Categorical predictor with two categories: one parameter
- difference in predicted value between levels, or code the categories as 0, 1 … (dummy variables)
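A quick sketch (toy factor, my own example) of the 0/1 dummy coding R uses for a two-level categorical predictor:

f <- factor(c("control", "treat", "treat", "control"))
model.matrix(~ f)   ## column "ftreat" is 1 for "treat", 0 for "control"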
- Parameters are (usually) easy to interpret
- can get \(p\)-values, CIs for each parameter
- Often have several predictor variables (parameters) associated with each input variable
- Non-linear response to an input variable
- Linear models can work for non-linear responses! e.g. \(Y = a + bX + cX^2\) (predictor vars \(\{1, X, X^2 \}\))
- For complex smooth curves it’s better to use splines (`splines::ns()`, or the `mgcv` package)
- Categorical variables with \(>2\) levels
- Why can’t we just code them, for example as 0, 1, 2? (That would force the categories to be ordered, with equal effects between successive levels)
- dummy variables or contrasts
- If you’re just trying to control for it, include it in the model and then ignore it!
- Otherwise, you can (and should) get a \(p\) value for the variable as a whole (it’s not in the `summary()` output)
- But we can only get CIs on the parameters, and there are different ways to parameterize (contrasts)
- If you do have (a) clear scientific question(s), you can set up contrasts that allow you to test it/them (we can help)
- If you must, do pairwise comparisons: test each pair of levels for differences and make an “aabbc” list (more later)
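A sketch (toy factor) of why a three-level categorical variable gets two parameters: R expands it into two dummy columns rather than a single 0/1/2 code:

f3 <- factor(c("low", "mid", "high", "low", "mid", "high"))
model.matrix(~ f3)   ## two dummy columns (treatment contrasts by default),
                     ## not one column coded 0/1/2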
- Interactions allow the value of one predictor to affect the relationship between another predictor and the response variable
- Interpreting main effects in the presence of interactions is tricky (principle of marginality)
- Estimated effect of \(A\) depends on baseline value of \(B\): pick a fixed point (Schielzeth 2010), or average in some way
- Example: \(Y = a + b_1 X_1 + b_2 X_2 + b_{12} X_1 X_2\)
- Rearranging: \(Y = (a+b_2 X_2) + (b_1+b_{12}X_2) X_1\), so the response to \(X_1\) depends on the value of \(X_2\)
Example: interaction in the `nlme::Orthodont` growth data
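A minimal sketch (my own code, not the original figure’s) fitting an age × sex interaction to these data; note that a plain `lm` ignores the repeated measures on each child, which a mixed model would handle properly:

library(nlme)           ## provides the Orthodont data
m_orth <- lm(distance ~ age * Sex, data = Orthodont)
coef(summary(m_orth))   ## "age:SexFemale" is the difference in growth slopes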
- You want to know whether (how!) a drug treatment changes the metabolism of some rabbits
- You’re using adult, prime-aged rabbits and keeping them under controlled conditions, so you don’t expect their metabolism to change without the drug.
- Not good enough!
- You also introduce some control rabbits and treat them exactly the same, including giving them fake pills. You find no significant change in the metabolism of the control rabbits through time
- Still not good enough! Why? (“Significant change in the treated rabbits, no significant change in the controls” is not itself a test of the treatment effect; you need to compare the two groups directly, i.e. test the group \(\times\) time interaction)
- In simple cases (2×2), the interaction only uses one parameter
- We can use CIs, and coefficient plots, and understand what’s going on
- In complicated cases, interaction terms have many parameters
- These have all the interpretation problems of other multi-parameter variables, or more
- Think about “differences in differences”
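A toy sketch (hypothetical cell means) showing that the single 2×2 interaction parameter is exactly a difference in differences:

d <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2"))
d$y <- c(1, 3, 2, 7)                ## hypothetical cell means
m2x2 <- lm(y ~ A * B, data = d)
coef(m2x2)["Aa2:Bb2"]               ## (7 - 2) - (3 - 1) = 3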
- Bear movement behaviour (Karelus et al. 2017)
- Predictor variables: sex (categorical), road type (categorical: major/minor), road length (continuous)
- Two-way interactions
- sex \(\times\) road length: “are females more or less sensitive to amount of road than males?”
- sex \(\times\) road type: “do females vary behaviour between road type more or less than males?”
- road type \(\times\) road length: “how differently does amount of road affect crossings for different road types?”
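A hedged sketch of what a model with these two-way interactions might look like (variable names and data are invented, pure noise, just to show the formula structure; see Karelus et al. 2017 for the real analysis):

set.seed(42)
## hypothetical variable names, simulated placeholder data
bear_moves <- data.frame(
  sex = factor(sample(c("F", "M"), 100, replace = TRUE)),
  road_type = factor(sample(c("major", "minor"), 100, replace = TRUE)),
  road_length = runif(100, 0, 10))
bear_moves$crossings <- rnorm(100)
m_bear <- lm(crossings ~ sex * road_length + sex * road_type +
               road_type * road_length, data = bear_moves)
car::Anova(m_bear)   ## one p-value per two-way interaction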
- Don’t accept the null hypothesis
- Don’t throw out predictors you wanted to test because they’re not significant
- Don’t throw out interactions you wanted to test because they’re not significant
- This may make your life harder, but it’s worth it for the karma
- There are techniques to deal with ‘too many’ predictors (e.g., ridge or lasso regression)
- good ways to estimate sensible main effects in the presence of interactions (centering variables (Schielzeth 2010); “sum-to-zero contrasts”)
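A minimal sketch (simulated data) of centering (Schielzeth 2010): the interaction estimate is unchanged, but the main-effect estimates become effects at the average value of the other predictor:

set.seed(1)
d <- data.frame(x1 = rnorm(100, mean = 10), x2 = rnorm(100, mean = 5))
d$y <- 1 + 2 * d$x1 + 3 * d$x2 + 0.5 * d$x1 * d$x2 + rnorm(100)
m_raw <- lm(y ~ x1 * x2, data = d)
m_ctr <- lm(y ~ scale(x1, scale = FALSE) * scale(x2, scale = FALSE), data = d)
cbind(raw = coef(m_raw), centered = coef(m_ctr))  ## same interaction estimate;
                                                  ## main effects evaluated at the means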
- Because the linear model depends (somewhat!) on assumptions, it is good to evaluate them
- Concerns (in order):
- Bias/linearity: did you miss a nonlinear relationship?
- Heteroscedasticity (does variance change across the data set, e.g. increasing variance with increasing mean?)
- Independence is hard to test (unless your observations are ordered, e.g. in time or space)
- Normality (assuming no overall problems, do your residuals look Normal?) [the least important of these assumptions]
Data from Paczolt et al. (2015)
skewdata <- read.csv("../data/skewdat.csv")
m1 <- lm(skew~Size, data=skewdata)
plot(m1, id.n=4)
performance::check_model(m1)
library(DHARMa); plot(simulateResiduals(m1))
If the diagnostics suggest a problematic response distribution, `MASS::boxcox()` or `car::powerTransform()` can suggest a transformation.

The basic `lm` machinery:

- `lm` fits a linear model
- `summary` prints statistics associated with the fitted parameters
- `dotwhisker::dwplot()` plots a coefficient table (Gelman, Pasarica, and Dodhia 2002); `by_2sd=TRUE` automatically scales continuous predictors
- `car::Anova()` and `drop1()` find \(p\) values for multi-parameter effects
- `anova(model1, model2)` tests the difference between two specific models
- `broom::tidy()` gives you tidy coefficient tables; consider `conf.int = TRUE`
- `plot.lm()` (S3 method): diagnostic plots
- `predict` to predict values (`newdata`, `se.fit` args)
- `simulate` to simulate values from the fitted model
- `methods(class="lm")` shows lots of possibilities
- ggplot: `geom_smooth(method="lm")` fits a linear model to each group of data (i.e. by facet/colour/etc.) and plots predictions with CIs
- downstream effect-plotting packages: `effects`, `emmeans`, `sjPlot`, `marginaleffects`, `ggpredict`, …

Example: lizard perching data (Schoener 1970)
library(tidyverse)
lizards <- read.csv("../data/lizards.csv", stringsAsFactors=TRUE) |>
mutate(across(time, ~ factor(., levels=c("early","midday","late"))))
lm_int <- lm(grahami~light*time, data=lizards)
lm_add <- update(lm_int, . ~ light + time)
lm_light <- update(lm_add, . ~ . - time)
lm_time <- update(lm_add, . ~ . - light)
summary(lm_add)
##
## Call:
## lm(formula = grahami ~ light + time, data = lizards)
##
## Residuals:
## Min 1Q Median 3Q Max
## -19.4250 -11.6937 0.2125 6.6250 28.5750
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 27.288 5.631 4.846 0.000112 ***
## lightsunny -19.325 5.734 -3.370 0.003213 **
## timemidday 13.137 7.106 1.849 0.080099 .
## timelate -9.500 6.853 -1.386 0.181743
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 13.71 on 19 degrees of freedom
## Multiple R-squared: 0.5466, Adjusted R-squared: 0.475
## F-statistic: 7.634 on 3 and 19 DF, p-value: 0.001513
anova(lm_add, lm_light)
## Analysis of Variance Table
##
## Model 1: grahami ~ light + time
## Model 2: grahami ~ light
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 19 3569.6
## 2 21 5481.9 -2 -1912.3 5.0895 0.01698 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Because `light` has only two levels (sunny/shady), this gives the same \(p\)-value as shown in `summary()` for `lightsunny`. `drop1` and `car::Anova` give the same \(p\)-value too …
anova(lm_add, lm_time)
## Analysis of Variance Table
##
## Model 1: grahami ~ light + time
## Model 2: grahami ~ time
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 19 3569.6
## 2 20 5703.6 -1 -2134 11.359 0.003213 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
drop1(lm_add, test="F")
## Single term deletions
##
## Model:
## grahami ~ light + time
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 3569.6 124.03
## light 1 2134.0 5703.6 132.81 11.3589 0.003213 **
## time 2 1912.3 5481.9 129.90 5.0895 0.016983 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
car::Anova(lm_add)
## Anova Table (Type II tests)
##
## Response: grahami
## Sum Sq Df F value Pr(>F)
## light 2134.0 1 11.3589 0.003213 **
## time 1912.3 2 5.0895 0.016983 *
## Residuals 3569.6 19
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print(summary(lm_int))
##
## Call:
## lm(formula = grahami ~ light * time, data = lizards)
##
## Residuals:
## Min 1Q Median 3Q Max
## -30.250 -4.125 1.250 6.875 17.750
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 23.500 5.375 4.372 0.000416 ***
## lightsunny -11.750 7.602 -1.546 0.140606
## timemidday 27.750 7.602 3.650 0.001980 **
## timelate -12.750 7.602 -1.677 0.111798
## lightsunny:timemidday -32.833 11.190 -2.934 0.009266 **
## lightsunny:timelate 6.500 10.751 0.605 0.553432
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.75 on 17 degrees of freedom
## Multiple R-squared: 0.7504, Adjusted R-squared: 0.677
## F-statistic: 10.22 on 5 and 17 DF, p-value: 0.0001179
`drop1` for the interaction model drops only the interaction term (respecting marginality):
drop1(lm_int, test = "F")
## Single term deletions
##
## Model:
## grahami ~ light * time
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 1964.9 114.30
## light:time 2 1604.7 3569.6 124.03 6.9416 0.006254 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
From ?car::Anova [emphasis added]:
Type-II tests are calculated according to the principle of marginality, testing each term after all others, except ignoring the term’s higher-order relatives; so-called type-III tests violate marginality, testing each term in the model after all of the others. … Be very careful in formulating the model for type-III tests, or the hypotheses tested will not make sense.
Probably best to set `options(contrasts = c("contr.sum", "contr.poly"))` if you want to use type-III ANOVA!
Three different questions, three different answers …
car::Anova(lm_int, type = "II")
car::Anova(lm_int, type = "III")
## alternative to setting contrasts globally
lm_int_sum <- update(lm_int,
                     contrasts = list(light = contr.sum,
                                      time = contr.sum))
car::Anova(lm_int_sum, type = "III")
Contrasts allow you to ask specific questions (rather than “is the overall term significant?” or “all pairwise contrasts”)
help("contrast-methods", package = "emmeans")
library(emmeans)
em_time <- emmeans(lm_int, ~ time)
## NOTE: Results may be misleading due to involvement in interactions
## default does a multiple comparisons correction
print(em_contr <- contrast(em_time, method = "eff", adjust = "none"))
## contrast estimate SE df t.ratio p.value
## early effect -0.611 3.15 17 -0.194 0.8483
## midday effect 10.722 3.27 17 3.278 0.0044
## late effect -10.111 3.15 17 -3.214 0.0051
##
## Results are averaged over the levels of: light
plot(em_contr) + geom_vline(xintercept = 0, lty = 2)
There is a tradition of figuring out which pairs of treatments are significantly different and assigning a letter to each such “not-clearly-not-equivalent” group. If you need to do this to make your supervisor happy, you can use `ggpubr`. The `multcomp` package can do the basic computations too.
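A hedged sketch of the basic computation, applying `multcomp::cld()` to an `emmeans` object (assumes the `multcompView` package is also installed):

library(emmeans)
library(multcomp)   ## provides the cld() generic
cld(emmeans(lm_add, ~ time), Letters = letters)   ## "aabbc"-style letter groups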
From Russ Lenth (emmeans author):
These displays promote visually the idea that two means that are “not significantly different” are to be judged as being equal; and that is a very wrong interpretation. In addition, they draw an artificial “bright line” between P values on either side of alpha, even ones that are very close.
See emmeans vignette on comparisons
library(emmeans)
library(ggplot2); theme_set(theme_bw())
e1 <- emmeans(lm_add, spec = ~ time)
plot(pairs(e1)) + geom_vline(xintercept = 0, lty = 2)
library(effects)
## Loading required package: carData
## lattice theme set by effectsTheme()
## See ?effectsTheme for details.
plot(allEffects(lm_int))
`by_2sd` standardization is available:

library(dotwhisker)
dotwhisker::dwplot(
  list(interaction = lm_int,
       additive = lm_add)) + geom_vline(xintercept = 0, lty = 2)
- Diagnostics: `plot()`, `performance::check_model()`, `DHARMa`
- Effect/prediction plots: `emmeans`, `effects`, `ggpredict`

The `emmeans` and `effects` packages were originally designed for categorical predictors:

plot(emmeans(lm_int, ~ time*light))
plot(emmeans(lm_int, ~ time), comparisons = TRUE)
## NOTE: Results may be misleading due to involvement in interactions

`comparisons = TRUE` adds 83.4% “non-overlap” arrows (Knol, Pestman, and Grobbee 2011)