tables vs graphs


load packages

library(tidyverse) ## readr, tidyr, dplyr, ggplot2
## graphics
theme_set(theme_bw() + theme(panel.spacing = grid::unit(0, "lines")))
library(ggh4x) ## for nested facets

turning tables into graphs

why graphs instead of tables?

tables are best suited for looking up specific information, and graphs are better for perceiving trends and making comparisons and predictions

why not tables instead of graphs?


example: Wei (2017) Table 5.5

rearranged data

dd <- read_table("../data/wei_tab5.5.txt")
head(dd)

## # A tibble: 6 × 11
##   dataset     r type  MGHD.ERR MGHD.ARI MST.ERR MST.ARI `MI/MGHD.ERR`
##   <chr>   <dbl> <chr>    <dbl>    <dbl>   <dbl>   <dbl>         <dbl>
## 1 sim1     0.05 est     0.0608   0.774   0.0688  0.771         0.121 
## 2 sim1     0.05 sd      0.0292   0.0925  0.0557  0.0998        0.0302
## 3 sim1     0.1  est     0.0578   0.782   0.277   0.456         0.188 
## 4 sim1     0.1  sd      0.0116   0.0412  0.0895  0.215         0.0392
## 5 sim1     0.2  est     0.0674   0.752   0.231   0.562         0.311 
## 6 sim1     0.2  sd      0.0335   0.108   0.0604  0.105         0.0552
## # … with 3 more variables: MI/MGHD.ARI <dbl>, MI/MST.ERR <dbl>,
## #   MI/MST.ARI <dbl>


dd2 <- (dd
  ## one row per dataset/r/type/column combination
  %>% pivot_longer(names_to = "model", values_to = "val",
                   cols = -c(dataset, r, type))
  ## split e.g. "MGHD.ERR" into model = "MGHD", stat = "ERR"
  %>% separate(model, into = c("model", "stat"), sep = "\\.")
  ## est + sd in a single row
  %>% pivot_wider(names_from = type, values_from = val)
)
head(dd2, 4)

## # A tibble: 4 × 6
##   dataset     r model stat     est     sd
##   <chr>   <dbl> <chr> <chr>  <dbl>  <dbl>
## 1 sim1     0.05 MGHD  ERR   0.0608 0.0292
## 2 sim1     0.05 MGHD  ARI   0.774  0.0925
## 3 sim1     0.05 MST   ERR   0.0688 0.0557
## 4 sim1     0.05 MST   ARI   0.771  0.0998
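
The same reshaping steps can be tried without the data file. This is a minimal sketch on a made-up table: `toy`, the model names `A` and `B`, and all numbers are hypothetical and chosen only to mimic the layout of the real table (one row per `type`, one column per model/statistic pair).

```r
library(tidyr)
library(dplyr)

## hypothetical toy table (values invented for illustration)
toy <- tibble::tribble(
  ~dataset, ~type, ~A.ERR, ~A.ARI, ~B.ERR, ~B.ARI,
  "sim1",   "est",  0.06,   0.77,   0.07,   0.75,
  "sim1",   "sd",   0.03,   0.09,   0.06,   0.10
)

toy_long <- (toy
  ## stack all model/statistic columns into name-value pairs
  %>% pivot_longer(cols = -c(dataset, type),
                   names_to = "model", values_to = "val")
  ## split "A.ERR" into model = "A", stat = "ERR"
  %>% separate(model, into = c("model", "stat"), sep = "\\.")
  ## spread est and sd into their own columns
  %>% pivot_wider(names_from = type, values_from = val)
)
toy_long
```

Each model/statistic combination ends up as one row with `est` and `sd` side by side, which is the shape `geom_ribbon()` needs for `ymin` and `ymax`.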

add auxiliary information

simtab <- read.table(header=TRUE,text="
dataset distribution covstruc separation
sim1 MGHD VEE well-separated
sim2 MGHD VEE overlapping
sim3 MST VEI well-separated
sim4 MST VEI overlapping
sim5 GMM VEE well-separated
sim6 GMM VEE overlapping
")
dd3 <- dd2 %>% full_join(simtab,by="dataset")
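
Because `full_join()` keeps unmatched rows from both sides and fills the gaps with `NA`, it can be worth checking that every dataset has metadata and vice versa. A sketch of one way to do this with `anti_join()`; `results` and `meta` are hypothetical stand-ins for `dd2` and `simtab`, with a deliberate mismatch (`sim7`, `sim3`).

```r
library(dplyr)

## hypothetical stand-ins for dd2 and simtab (values invented)
results <- tibble::tibble(dataset = c("sim1", "sim2", "sim7"),
                          est = c(0.1, 0.2, 0.3))
meta <- tibble::tibble(dataset = c("sim1", "sim2", "sim3"),
                       separation = c("well-separated", "overlapping",
                                      "well-separated"))

## results with no matching metadata
no_meta <- anti_join(results, meta, by = "dataset")
## metadata with no matching results
no_results <- anti_join(meta, results, by = "dataset")

## full_join keeps both kinds of unmatched rows, padded with NA
joined <- full_join(results, meta, by = "dataset")
```

If both `anti_join()` results are empty, `full_join()`, `left_join()` and `inner_join()` all return the same rows, so the choice of join does not matter.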


gg1 <- (ggplot(dd3,aes(factor(r),est,colour=model)) 
  + geom_point()+geom_line(aes(group=model))   ## points and lines
  ## transparent ribbons, +/- 1 SD
  ## (alpha value is an assumption; the original argument is missing)
  + geom_ribbon(aes(ymin=est-sd,ymax=est+sd,group=model,fill=model),
                alpha=0.3)
  ## limit y-axis, compress out-of-bounds values
  + scale_y_continuous(limits=c(0,1),oob=scales::squish)
  + ggh4x::facet_nested(stat~distribution+covstruc+separation)
  + labs(x="r (proportion missing)",y="")
  + scale_colour_brewer(palette="Dark2")
  + scale_fill_brewer(palette="Dark2"))
print(gg1)
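
The `oob=scales::squish` argument matters whenever `est-sd` or `est+sd` falls outside `c(0,1)`. By default, continuous scales censor out-of-bounds values (replace them with `NA`), which would leave gaps in the ribbons; `squish` clamps them to the nearest limit instead. A small sketch of the difference on a made-up vector `x`:

```r
library(scales)

## hypothetical values, two of them outside the [0, 1] limits
x <- c(-0.2, 0.5, 1.3)

## default behaviour of continuous scales: out-of-bounds values become NA
censored <- censor(x, range = c(0, 1))
## squish: out-of-bounds values are moved to the nearest limit
squished <- squish(x, range = c(0, 1))

censored  ## NA 0.5 NA
squished  ## 0.0 0.5 1.0
```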


possible improvements?

table-to-graph tricks


Gelman, Andrew. 2011. “Why Tables Are Really Much Better Than Graphs.” Journal of Computational and Graphical Statistics 20 (1): 3–7.

Gelman, Andrew, Cristian Pasarica, and Rahul Dodhia. 2002. “Let’s Practice What We Preach: Turning Tables into Graphs.” The American Statistician 56 (2): 121–30.

Wei, Yuhong. 2017. “Extending Growth Mixture Models and Handling Missing Values via Mixtures of Non-Elliptical Distributions.” Thesis.