class: inverse middle
background-image: url(images/wattle_bee.jpg)
background-position: 99% 99%
background-size: 55%

# Uncertainty in data visualisation through the lens of statistical inference

## Di Cook <br> Monash University

### Visualising Uncertainty <br> Rostock Retreat <br> June 21, 2021

<br>
.tiny[[https://dicook.org/files/Rostock2021/slides.html](https://dicook.org/files/Rostock2021/slides.html)]
<br>
.footnote[Image credit: Di Cook, 2018]

---
class: middle center inverse

# Hello 👋🏻
--

<center><img src="http://dicook.org/img/dicook-2019.png" style="height: 300px; border-radius: 50%;"> </center>
--

Professor, Monash University, Melbourne, Australia <br>
--

<br> [👉🏻 https://dicook.org/files/Rostock2021/slides.html](https://dicook.org/files/Rostock2021/slides.html)

---
class: inverse middle center

# Motivation

---
background-color: white

.panelset[
.panel[.panel-name[map]

.yellow[What do you read from each display?]

These images were motivated by work on the Australian Cancer Atlas. They show thyroid cancer incidence as a choropleth map (left) and as a new type of display, a hexagon tile map (right).

.tiny[Kobakian and Cook (unpublished) https://github.com/srkobakian/experiment]

]
.panel[.panel-name[explanation]

.yellow[What do you read from each display?]

.pull-left[

High thyroid cancer incidence is mostly located on the east coast.

]
.pull-right[

High thyroid cancer incidence is evident around Brisbane, Sydney, Perth and in some inner-city Melbourne areas.
]

]
]

---
class: center
background-color: white
background-image: url(images/wide_open_australia.jpg)
background-position: 50% 50%
background-size: cover

# .white["Land doesn’t get cancer, people do"]

.white[We need to establish a new display for Australia that allows us to read the spatial distribution of values measured on people]

---
class: middle center

To test which design is better we are going to use the .yellow[lineup protocol]. Each slide shows 12 plots, numbered .yellow[1 through 12] at the top of each plot. One of the 12 is a data plot and the remaining 11 are null plots. 
--

Pick the plot that is .yellow[most different] from the others. <br><br> Write your choices down just for yourself, .yellow[without sharing], for now.<br><br>
--

.yellow[## Ready?]
--

---
background-color: white
background-image: url(images/aus_nwse_6_geo.png)
background-size: 68%
background-position: 15% 15%

---
background-color: white
background-image: url(images/aus_nwse_3_hex.png)
background-size: 68%
background-position: 15% 15%

---
background-color: white
background-image: url(images/aus_three_8_geo.png)
background-size: 68%
background-position: 15% 15%

---
background-color: white
background-image: url(images/aus_three_5_hex.png)
background-size: 68%
background-position: 15% 15%

---
background-color: white
background-image: url(images/aus_cities_9_geo.png)
background-size: 68%
background-position: 15% 15%

---
background-color: white
background-image: url(images/aus_cities_3_hex.png)
background-size: 68%
background-position: 15% 15%

---
# Check your choices

<br>
<center>
The data plot is in these positions

<table class="table" style='font-family: "Courier New", courier; margin-left: auto; margin-right: auto;'>
 <thead>
  <tr>
   <th style="text-align:right;background-color: #fdf6e3 !important;"> page </th>
   <th style="text-align:right;background-color: #fdf6e3 !important;"> location </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 8 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;font-weight: bold;"> 6 </td>
  </tr>
  <tr>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 9 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;font-weight: bold;"> 3 </td>
  </tr>
  <tr>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 10 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;font-weight: bold;"> 8 </td>
  </tr>
  <tr>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 11 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;font-weight: bold;"> 5 </td>
  </tr>
  <tr>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 12 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;font-weight: bold;"> 9 </td>
  </tr>
  <tr>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 14 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;font-weight: bold;"> 3 </td>
  </tr>
</tbody>
</table>

</center>

---
class: inverse middle center

# Conducting visual inference

---
# About the lineup protocol

Based on the statistical justice system

- Compare the data plot (.yellow[accused]) with a population of null plots (.yellow[innocents])
- If the data plot (.yellow[accused]) is picked from the lineup, then reject the null hypothesis because it looks different (.yellow[guilty]). 
- The `\(p\)`-value is the probability that the accused would look this different (.yellow[guilty]) if they actually were not really different (.yellow[innocent]).

.tiny[[Wickham et al (2010) IEEE TVCG](https://vita.had.co.nz/papers/inference-infovis.html)]

---

<table class="table" style="font-size: 24px; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;background-color: #fdf6e3 !important;"> plot </th>
   <th style="text-align:left;background-color: #fdf6e3 !important;"> question </th>
   <th style="text-align:left;background-color: #fdf6e3 !important;"> null </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Choropleth maps </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Is there a spatial trend? </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> No relationship between location and statistic value </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Tag cloud </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Is this document the same as that document? </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> No difference in word counts </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Treemap </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Is the distribution within higher-level categories the same? </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> No difference in proportions within categories </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Histogram </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Is the underlying distribution smooth? </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Distribution is smooth </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Histogram </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Is the underlying distribution bell-shaped? </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Distribution is normal </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Residual plot </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Are residuals normally distributed? </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Distribution is normal </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Scatterplot </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Are the two variables associated? </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> No association </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Scatterplot, coloured </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Are points clustered by colour? </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> No difference between coloured groups </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Time series </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Does the mean change over time? </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Same mean over time </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Time series </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Does the variability change over time? </td>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> Same variability over time </td>
  </tr>
</tbody>
</table>

---
background-color: white
# A visual t-test

Nulls: samples .yellow[simulated] from same normal distribution <br>
Null hypothesis: Both groups are samples from the same distribution
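
A minimal base R sketch of how nulls of this kind could be simulated. The group size (15), the standard normal distribution, the seed and the column names are assumptions for illustration, not the settings used for the lineup on this slide.

```r
# Assumed setup: two groups (A, B) of 15 observations, 11 null panels
set.seed(2021)
n_per_group <- 15
n_nulls <- 11

# Every null panel draws BOTH groups from the same N(0, 1), so any
# apparent difference between A and B is due to sampling alone
nulls <- do.call(rbind, lapply(seq_len(n_nulls), function(i) {
  data.frame(
    .sample = i,
    x = rep(c("A", "B"), each = n_per_group),
    y = rnorm(2 * n_per_group)
  )
}))

# Difference in group means (B minus A) within each null panel
diffs <- sapply(split(nulls, nulls$.sample), function(d) {
  mean(d$y[d$x == "B"]) - mean(d$y[d$x == "A"])
})
round(diffs, 2)
```

The spread of `diffs` gives a feel for how large a difference can look when there is none.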

---
background-color: white
# A visual t-test: take 2

Nulls: samples generated by .yellow[permuting labels A, B] <br>
Null hypothesis: Both groups are samples from the same distribution
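
A minimal base R sketch of a permutation null. The data are simulated here purely for illustration (group B is shifted by an assumed amount), and `permute_labels()` is a hypothetical helper, not a `nullabor` function.

```r
set.seed(2021)

# Illustrative data (assumed): group B shifted upwards by 1
dat <- data.frame(
  x = rep(c("A", "B"), each = 15),
  y = c(rnorm(15, mean = 0), rnorm(15, mean = 1))
)

# One null data set: shuffle the group labels and leave y untouched,
# which breaks any association between label and value
permute_labels <- function(d) {
  d$x <- sample(d$x)
  d
}
null1 <- permute_labels(dat)

# The values and the group sizes are unchanged; only the pairing differs
identical(null1$y, dat$y)
table(null1$x)
```

Unlike simulating from a normal distribution, permutation makes no assumption about the shape of the distribution, only that the labels are exchangeable under the null.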

---
# Procedure (1/5)

.panelset[
.panel[.panel-name[tidy data]

]
.panel[.panel-name[is it tidy?]

<table class="table" style='font-size: 18px; font-family: "Courier New", courier; margin-left: auto; margin-right: auto;'>
 <thead>
  <tr>
   <th style="text-align:left;background-color: #fdf6e3 !important;"> ID </th>
   <th style="text-align:right;background-color: #fdf6e3 !important;"> koala_NSW </th>
   <th style="text-align:right;background-color: #fdf6e3 !important;"> koala_VIC </th>
   <th style="text-align:right;background-color: #fdf6e3 !important;"> bilby_NSW </th>
   <th style="text-align:right;background-color: #fdf6e3 !important;"> bilby_VIC </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> grey </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 23 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 43 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 11 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 8 </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> cream </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 56 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 89 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 22 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 17 </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> white </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 35 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 72 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 13 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 6 </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> black </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 28 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 44 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 19 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 16 </td>
  </tr>
  <tr>
   <td style="text-align:left;background-color: #fdf6e3 !important;"> taupe </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 25 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 37 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 21 </td>
   <td style="text-align:right;background-color: #fdf6e3 !important;"> 12 </td>
  </tr>
</tbody>
</table>

]
.panel[.panel-name[what is tidy?]

<br>
<br>
- Variables in columns
- Observations in rows
<br>
<br>

.tiny[https://vita.had.co.nz/papers/tidy-data.pdf]
]
]
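
A sketch of how the table in the "is it tidy?" panel could be reshaped, assuming the `tidyr` package is installed. The new column names `animal`, `state` and `count` are guesses at what the column headers encode.

```r
library(tidyr)

# The wide table from the "is it tidy?" panel
wide <- data.frame(
  ID = c("grey", "cream", "white", "black", "taupe"),
  koala_NSW = c(23, 56, 35, 28, 25),
  koala_VIC = c(43, 89, 72, 44, 37),
  bilby_NSW = c(11, 22, 13, 19, 21),
  bilby_VIC = c(8, 17, 6, 16, 12)
)

# Each column name holds two variables (assumed: animal and state),
# so split the names on "_" and stack the values into one column
long <- pivot_longer(
  wide,
  cols = -ID,
  names_to = c("animal", "state"),
  names_sep = "_",
  values_to = "count"
)
long
```

In the long form every variable has its own column, so it can be mapped directly to an aesthetic in the plot definition.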

---

# Procedure (2/5)

Use the grammar of graphics to define a plot

```r
*ggplot(aes(x = x, y = y, colour = x)) +
  geom_point() +
  stat_summary(fun = "mean", 
               shape = 4, size=1) +
  scale_colour_brewer("", palette="Dark2") +
  some_nice_plot_styling
```

Based on this mapping, what would be considered null (not interesting)?

---
# Procedure (3/5)

Add data or null

```r
*data %>%
  ggplot(aes(x = x, y = y, colour = x)) +
  geom_point() +
  stat_summary(fun = "mean", 
               shape = 4, size=1) +
  scale_colour_brewer("", palette="Dark2") +
  some_nice_plot_styling
```

---
# Procedure (4/5)

Hide the data among a field of nulls

```r
*library(nullabor)
*lineup(null_permute("y"), n=10,
*           true = data) %>%
  ggplot(aes(x = x, y = y, colour = x)) +
  geom_point() +
  stat_summary(fun = "mean", 
               shape = 4, size=1) +
  scale_colour_brewer("", palette="Dark2") +
* facet_wrap(~.sample) +
  some_nice_plot_styling
```

---
# Procedure (5/5)

Ask uninvolved, independent observers to pick the plot in the lineup that is most different from the others. If the data plot is no different from the nulls, the chance that any one observer chooses it is 1/10 (or `\(1/m\)` generally, where `\(m\)` is the number of plots in the lineup).

Suppose you had 23 observers, and 8 of them choose the data plot as the most different.

```r
pvisual(x=8, K=23, m=10)
```

```
##      x simulated       binom
## [1,] 8    0.0099 0.001229827
```
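
The `binom` column can be checked with base R, treating the observers as independent and each as picking the data plot with probability `\(1/m\)` under the null. This sketch only reproduces that column; the `simulated` column is computed differently inside `pvisual()`.

```r
x <- 8   # observers who picked the data plot
K <- 23  # observers in total
m <- 10  # plots in the lineup

# P(X >= x) for X ~ Binomial(K, 1/m): the chance of at least x
# observers picking the data plot when it is just another null
p_binom <- pbinom(x - 1, size = K, prob = 1 / m, lower.tail = FALSE)
p_binom
```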

---
class: inverse middle center

# How do we know it works?

---
background-color: white

.pull-left[
# Validation experiment

[Majumder et al (2013)](https://www.tandfonline.com/doi/abs/10.1080/01621459.2013.808157) conducted a validation study comparing the performance of the lineup protocol, as assessed by human evaluators, with that of the classical test. Subjects were recruited through Amazon's Mechanical Turk.

.tiny[http://datascience.unomaha.edu/turk/exp2/index.html]
]
.pull-right[

Power analysis of human evaluation relative to the classical test.

]

---
class: inverse middle center

# How can it be used to compare plot designs?

---
background-color: white
.pull-left[
# Power to compare plot design

[Hofmann et al (2012)](https://ieeexplore.ieee.org/document/6327249) show how power, interpreted as .yellow[proportion of observers who detect] the data plot, from different plot designs, can be used to establish which is better.

The paper also explains how to make the power calculations and generate confidence intervals for the power of different designs.

]
.pull-right[

<br>
<center>
<img src="images/power.jpg" width="60%" />
</center>
]
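
A rough base R sketch of estimating power as the proportion of observers who detect the data plot, with an interval for each design. The counts are hypothetical, and the exact binomial (Clopper-Pearson) interval from `binom.test()` is a simple stand-in; the paper's own calculations may differ, for example by modelling observer effects.

```r
# Hypothetical counts (NOT from the paper): observers who detected the
# data plot, out of those shown each design
detected <- c(design_A = 30, design_B = 18)
shown    <- c(design_A = 50, design_B = 50)

# Estimated power and an exact 95% binomial interval per design
power_ci <- t(sapply(names(detected), function(d) {
  bt <- binom.test(detected[[d]], shown[[d]])
  c(power = unname(bt$estimate),
    lower = bt$conf.int[1],
    upper = bt$conf.int[2])
}))
round(power_ci, 2)
```

The design with the higher proportion is the one in which observers detect the structure more readily; the overlap of the intervals indicates how much evidence there is for a difference.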

---
class: inverse middle center
# Back to the maps experiment

---
background-color: white
# Thyroid cancer incidence in Australia

---
# Experimental design

.yellow[Factor]: choropleth, hexagon tile map<br>
.yellow[Structure]: NW to SE trend, hotspot in 3 cities, hotspot in all cities <br><br>
.yellow[Null]: Simulated from a variogram model of spatial dependence, using the `gstat` package (144 null sets) <br>
.yellow[Replicates]: Four <br>
.yellow[Lineups]: 12 plots in a lineup <br>
.yellow[Data plot]: Trend added to null, for one plot in a lineup<br><br>
.yellow[Displays]: Two sets, with coin flip for data displayed as choropleth or hexagon tile map, sets A and B <br>
.yellow[Subjects]: 42 subjects for set A, and 53 for set B
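
One way to read the numbers above, sketched in base R. Two things here are assumptions rather than stated facts: that each set holds one lineup per structure and replicate, and that set B shows each lineup in the display type not used in set A.

```r
# Assumed reading: 3 structures x 4 replicates gives the lineups in a set
lineups <- expand.grid(
  structure = c("NW to SE trend", "hotspot in 3 cities", "hotspot in all cities"),
  replicate = 1:4,
  stringsAsFactors = FALSE
)

# Coin flip for the display type in set A; set B is assumed to get the other
set.seed(1)
displays <- c("choropleth", "hexagon tile map")
lineups$set_A <- sample(displays, nrow(lineups), replace = TRUE)
lineups$set_B <- ifelse(lineups$set_A == "choropleth",
                        "hexagon tile map", "choropleth")

nrow(lineups)       # 12 lineups per set
nrow(lineups) * 12  # 144 simulated sets when each lineup has 12 plots
```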

---
background-color: white
background-image: url(images/detect-compare-1.jpg)
background-size: 68%
background-position: 80% 15%

# Results

---
background-image: url(http://dicook.github.io/nullabor/reference/figures/nullabor_hex.png)
background-position: 70% 1%
background-size: 15%

# How to do this yourself

Get a copy of the `nullabor` package

```r
install.packages("nullabor")
```

```r
# install.packages("remotes")
remotes::install_github("dicook/nullabor")
```

Look at the "Get started" documentation at http://dicook.github.io/nullabor/index.html

---
class: inverse middle center

# Statistical inference architecture

is built to measure uncertainty

---
background-color: white
background-image: url(images/vis_inf.png)
background-size: 95%
background-position: 100% 1%

Statistical <br> inference <br> architecture

---
background-color: white
# Group A1 Data vis challenge

Which display of uncertainty is better?

---
# Thanks for listening!

Here's what I hope you take away from this talk:

- Uncertainty means what might change if you had a different sample
- This is what statistical inference addresses
- Plots can be embedded into an inferential framework
- Crowd-sourcing can help with conducting inference with plots
- Visual inference might be helpful to test your new ideas for adding indications of uncertainty into your displays

---
### Additional reading

.tiny[
^ Buja et al (2009) Statistical Inference for Exploratory Data Analysis and Model Diagnostics, RSPT A <br>
^ Wickham et al (2010) Graphical Inference for Infovis, TVCG <br>
^ Hofmann et al (2012) Graphical Tests for Power Comparison of Competing Designs, TVCG <br>
^ Majumder et al (2013) Validation of Visual Statistical Inference, Applied to Linear Models, JASA <br>
^ Yin et al (2013) Visual Mining Methods for RNA-Seq data: Examining Data structure, Understanding Dispersion estimation and Significance Testing, JDMGP <br>
^ Zhao et al (2014) Mind Reading: Using An Eye-tracker To See How People Are Looking At Lineups, IJITA <br>
^ Lin et al (2015) Does host-plant diversity explain species richness in insects? Ecological Entomology <br>
^ Roy Chowdhury et al (2015) Using Visual Statistical Inference to Better Understand Random Class Separations in High Dimension, Low Sample Size Data, CS <br>
^ Loy et al (2017) Model Choice and Diagnostics for Linear Mixed-Effects Models Using Statistics on Street Corners, JCGS <br>
^ Roy Chowdhury et al (2018) Measuring Lineup Difficulty By Matching Distance Metrics with Subject Choices in Crowd-Sourced Data, JCGS <br>
^ Vanderplas et al (2020) Testing Statistical Charts: What Makes a Good Graph? ARSIA <br>
^ Vanderplas et al (2021) Statistical significance calculations for scenarios in visual inference. Stat. <br>
]

---
class: inverse middle
background-image: url(images/wattle_bee.jpg)
background-position: 99% 1%
background-size: 25%

# Acknowledgements

Slides created via the R package [**xaringan**](https://github.com/yihui/xaringan), with **wattle theme** created from [xaringanthemer](https://github.com/gadenbuie/xaringanthemer).

The chakra comes from [remark.js](https://remarkjs.com), [**knitr**](http://yihui.name/knitr), and [R Markdown](https://rmarkdown.rstudio.com).

Slides are available at [https://dicook.org/files/Rostock2021/slides.html](https://dicook.org/files/Rostock2021/slides.html) and supporting files at [https://github.com/dicook/Rostock2021](https://github.com/dicook/Rostock2021).

<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.

.footnote[Image credit: Di Cook, 2018]