library(mlmRev) ## for contraception data
## Warning: package 'lme4' was built under R version 4.3.2
library(ggplot2)
Adjust some graphical parameters:
theme_set(theme_bw()+
theme(panel.spacing=grid::unit(0,"lines")))
g1 <- (ggplot(Contraception,aes(x=age,y=use))+
labs(x="Centered age",y="Contraceptive use")
)
print(g1+geom_point())
Not very useful!
Use a fancy component (stat_sum
) that calculates the
proportion of the data in a given category that fall at exactly the same
(x,y) location, and by default changes the size of the symbol
proportionally:
print(g1+stat_sum())
Use transparency – not a mapping (which would be inside an
aes()
function) but a hard-coded value:
print(g1+stat_sum(alpha=0.25))
Distinguish by number of live children:
g2 <- (ggplot(Contraception,aes(x=age,y=use,colour=livch))
+ labs(x="Centered age",y="Contraceptive use"))
g3 <- g2 + stat_sum(alpha=0.25)
print(g3)
Add smooth lines:
g4 <- (g3
+ geom_smooth(aes(y=as.numeric(use)))
)
print(g4)
Adjust the axis limits.
g5 <- (g4
+ coord_cartesian(ylim=c(0.9,2.1))
)
print(g5)
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Technical details:
coord_cartesian(ylim=...)
rather than just
saying +ylim(0.9,2.1)
, which will erase the
confidence-interval ribbons that go below the lower limits of the graph
(although oob=scales::squish
is useful here)Facet by urban/rural: the syntax is rows~columns
, and
.
here means “nothing”, or “only a single row”
(cf. facet_wrap
)
g4+facet_grid(.~urban)
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Separate urban/rural by line type instead (and turn off confidence-interval ribbons):
g3 + geom_smooth(aes(y=as.numeric(use),lty=urban),se=FALSE)
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
as.numeric(use)
because
use
is a factor.Or make the confidence-interval ribbons light
(alpha=0.1
) instead of turning them off completely:
print(g3 + geom_smooth(aes(y=as.numeric(use),
lty=urban),alpha=0.1))
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
Advanced topic: smooth by fitting a polynomial logistic regression.
g6 <- ggplot(Contraception,aes(x=age,
y=as.numeric(use)-1,
colour=livch))+
stat_sum(alpha=0.25)+labs(x="Centered age",y="Contraceptive use")
print(g6 + geom_smooth(aes(lty=urban),method="glm",
method.args=list(family="binomial"),
formula=y~poly(x,2),alpha=0.1))
Technical details:
A more common use would be to use method="lm"
to get
linear regression lines for each group.
Two-way faceting:
print(g1 +
facet_grid(livch~urban)+geom_smooth(aes(y=as.numeric(use)))
)
This is nice, although a plot without data is always a little scary.
Use a generalized additive model (takes some more tweaking, see
k
parameter!)
print(g3
+ aes(y=as.numeric(use)-1)
+ geom_smooth(aes(lty=urban,y=as.numeric(use)-1),
method="gam",
method.args=list(family="binomial"),
formula=y~s(x,k=5),
alpha=0.1)
)