class: inverse middle
background-image: url(images/wattle_bee.jpg)
background-position: 99% 99%
background-size: 55%

# Uncertainty in data visualisation through the lens of statistical inference

## Di Cook <br> Monash University

### Visualising Uncertainty <br> Rostock Retreat <br> June 21, 2021

<br>

.tiny[[https://dicook.org/files/Rostock2021/slides.html](https://dicook.org/files/Rostock2021/slides.html)]

<br>

.footnote[Image credit: Di Cook, 2018]

---
class: middle center inverse

# Hello 👋🏻

--

<center><img src="http://dicook.org/img/dicook-2019.png" style="height: 300px; border-radius: 50%;"> </center>

--

Professor, Monash University, Melbourne, Australia <br>

--

<br>

[https://dicook.org/files/Rostock2021/slides.html](https://dicook.org/files/Rostock2021/slides.html)

---
class: inverse middle center

# Motivation

---
background-color: white

.panelset[
.panel[.panel-name[map]

.yellow[What do you read from each display?]

<center>
<img src="images/choro-1.png" style="width: 45%; align: center" />
<img src="images/fullhexmap-1.png" style="width: 45%; align: center" />
</center>

These images were motivated by work on the Australian Cancer Atlas, and show thyroid cancer incidence as a choropleth map (left) and as a new type of display, a hexagon tile map (right).

.tiny[Kobakian and Cook (unpublished) https://github.com/srkobakian/experiment]

]
.panel[.panel-name[explanation]

.yellow[What do you read from each display?]

.pull-left[
High thyroid cancer incidence is mostly located on the east coast.
]

.pull-right[
High thyroid cancer incidence is evident around Brisbane, Sydney, Perth and in some inner city Melbourne areas.
]
]
]

---
class: center
background-color: white
background-image: url(images/wide_open_australia.jpg)
background-position: 50% 50%
background-size: cover

# .white["Land doesn’t get cancer, people do"]

.white[We need to establish a new display for Australia to allow us to read the spatial distribution of values measured on people]

---
class: middle center

To test which design is better we are going to use the .yellow[lineup protocol]. Each slide shows 12 plots, numbered .yellow[1 through 12] at the top of each plot. One of the 12 is a data plot and the remaining 11 are null plots.

--

Pick the plot that is .yellow[most different] from the others. <br><br>

Write your choices down just for yourself, .yellow[without sharing], for now.<br><br>

--

.yellow[## Ready?]

--

---
background-color: white
background-image: url(images/aus_nwse_6_geo.png)
background-size: 68%
background-position: 15% 15%

---
background-color: white
background-image: url(images/aus_nwse_3_hex.png)
background-size: 68%
background-position: 15% 15%

---
background-color: white
background-image: url(images/aus_three_8_geo.png)
background-size: 68%
background-position: 15% 15%

---
background-color: white
background-image: url(images/aus_three_5_hex.png)
background-size: 68%
background-position: 15% 15%

---
background-color: white
background-image: url(images/aus_cities_9_geo.png)
background-size: 68%
background-position: 15% 15%

---
background-color: white
background-image: url(images/aus_cities_3_hex.png)
background-size: 68%
background-position: 15% 15%

---

# Check your choices

<br>

<center>

The data plot is in these positions

<style>
table{
  border-spacing: unset; /* inherit, initial, unset, 0px */
}
</style>

<table class="table" style='font-family: "Courier New", courier; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:right;background-color: #fdf6e3 !important;"> page </th> <th style="text-align:right;background-color: #fdf6e3 !important;"> location </th> </tr> </thead> <tbody> <tr> <td
style="text-align:right;background-color: #fdf6e3 !important;"> 8 </td> <td style="text-align:right;background-color: #fdf6e3 !important;font-weight: bold;"> 6 </td> </tr> <tr> <td style="text-align:right;background-color: #fdf6e3 !important;"> 9 </td> <td style="text-align:right;background-color: #fdf6e3 !important;font-weight: bold;"> 3 </td> </tr> <tr> <td style="text-align:right;background-color: #fdf6e3 !important;"> 10 </td> <td style="text-align:right;background-color: #fdf6e3 !important;font-weight: bold;"> 8 </td> </tr> <tr> <td style="text-align:right;background-color: #fdf6e3 !important;"> 11 </td> <td style="text-align:right;background-color: #fdf6e3 !important;font-weight: bold;"> 5 </td> </tr> <tr> <td style="text-align:right;background-color: #fdf6e3 !important;"> 12 </td> <td style="text-align:right;background-color: #fdf6e3 !important;font-weight: bold;"> 9 </td> </tr> <tr> <td style="text-align:right;background-color: #fdf6e3 !important;"> 14 </td> <td style="text-align:right;background-color: #fdf6e3 !important;font-weight: bold;"> 3 </td> </tr> </tbody> </table>

</center>

---
class: inverse middle center

# Conducting visual inference

---

# About the lineup protocol

Based on the statistical justice system

- Compare the data plot (.yellow[accused]) with a population of null plots (.yellow[innocents]).
- If the data plot (.yellow[accused]) is picked from the lineup, then reject the null hypothesis because it looks different (.yellow[guilty]).
- The `\(p\)`-value is the probability that the accused would look this different (.yellow[guilty]) if they were not really different (.yellow[innocent]).
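
As a rough sketch of the last bullet, the `\(p\)`-value can be approximated by a binomial tail probability. This assumes that observers are independent and that, when the data plot is not really different, each of them picks it with probability `1/m`. The numbers (8 picks from 23 observers, 10 plots) are borrowed from the worked example later in these slides; the function name `lineup_pvalue_binom` is hypothetical, and this simple binomial model is only one of the calculations that `pvisual()` reports.

```r
# Binomial approximation to the visual p-value (a sketch, not nullabor's code).
# Assumption: observers are independent and, under the null, each picks
# the data plot with probability 1/m.
lineup_pvalue_binom <- function(x, K, m) {
  # P(X >= x) for X ~ Binomial(K, 1/m)
  pbinom(x - 1, size = K, prob = 1 / m, lower.tail = FALSE)
}

# x = picks of the data plot, K = observers, m = plots in the lineup
p <- lineup_pvalue_binom(x = 8, K = 23, m = 10)
print(p)
```

The result should agree with the `binom` column printed by `pvisual(x=8, K=23, m=10)` in the Procedure (5/5) slide.
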
.tiny[[Wickham et al (2010) IEEE TVCG](https://vita.had.co.nz/papers/inference-infovis.html)]

---

<table class="table" style="font-size: 24px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;background-color: #fdf6e3 !important;"> plot </th> <th style="text-align:left;background-color: #fdf6e3 !important;"> question </th> <th style="text-align:left;background-color: #fdf6e3 !important;"> null </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> Choropleth maps </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Is there a spatial trend? </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> No relationship between location and statistic value </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> Tag cloud </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Is this document the same as that document? </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> No difference in word counts </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> Treemap </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Is the distribution within higher-level categories the same? </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> No difference in proportions within categories </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> Histogram </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Is the underlying distribution smooth? </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Distribution is smooth </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> Histogram </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Is the underlying distribution bell-shaped?
</td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Distribution is normal </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> Residual plot </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Are residuals normally distributed? </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Distribution is normal </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> Scatterplot </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Are the two variables associated? </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> No association </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> Scatterplot, coloured </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Are points clustered by colour? </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> No difference between coloured groups </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> Time series </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Does the mean change over time? </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Same mean over time </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> Time series </td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Does the variability change over time? 
</td> <td style="text-align:left;background-color: #fdf6e3 !important;"> Same variability over time </td> </tr> </tbody> </table>

---
background-color: white

# A visual t-test

<img src="slides_files/figure-html/unnamed-chunk-3-1.png" width="100%" />

Nulls: samples .yellow[simulated] from the same normal distribution <br>
Null hypothesis: Both groups are samples from the same distribution

---
background-color: white

# A visual t-test: take 2

<img src="slides_files/figure-html/unnamed-chunk-4-1.png" width="100%" />

Nulls: samples generated by .yellow[permuting labels A, B] <br>
Null hypothesis: Both groups are samples from the same distribution

---

# Procedure (1/5)

.panelset[
.panel[.panel-name[tidy data]

<table class="table" style='font-size: 18px; font-family: "Courier New", courier; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;background-color: #fdf6e3 !important;"> x </th> <th style="text-align:right;background-color: #fdf6e3 !important;"> y </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> B </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 0.92 </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> B </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 2.18 </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> A </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> -0.27 </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> A </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> -3.92 </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> A </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> -2.12 </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> B </td> <td style="text-align:right;background-color: #fdf6e3 !important;">
1.73 </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> A </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> -1.65 </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> B </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 1.81 </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> B </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 3.51 </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> B </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 3.60 </td> </tr> </tbody> </table> ] .panel[.panel-name[is it tidy?] <br> <br> <table class="table" style='font-size: 18px; font-family: "Courier New", courier; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;background-color: #fdf6e3 !important;"> ID </th> <th style="text-align:right;background-color: #fdf6e3 !important;"> koala_NSW </th> <th style="text-align:right;background-color: #fdf6e3 !important;"> koala_VIC </th> <th style="text-align:right;background-color: #fdf6e3 !important;"> bilby_NSW </th> <th style="text-align:right;background-color: #fdf6e3 !important;"> bilby_VIC </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> grey </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 23 </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 43 </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 11 </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 8 </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> cream </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 56 </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 89 </td> <td style="text-align:right;background-color: #fdf6e3 
!important;"> 22 </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 17 </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> white </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 35 </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 72 </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 13 </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 6 </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> black </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 28 </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 44 </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 19 </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 16 </td> </tr> <tr> <td style="text-align:left;background-color: #fdf6e3 !important;"> taupe </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 25 </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 37 </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 21 </td> <td style="text-align:right;background-color: #fdf6e3 !important;"> 12 </td> </tr> </tbody> </table>

]
.panel[.panel-name[what is tidy?]

<br> <br>

- Variables in columns
- Observations in rows

<br> <br>

.tiny[https://vita.had.co.nz/papers/tidy-data.pdf]

]
]

---

# Procedure (2/5)

Use the grammar of graphics to define a plot

```r
*ggplot(aes(x = x, y = y, colour = x)) +
  geom_point() +
  stat_summary(fun = "mean", shape = 4, size=1) +
  scale_colour_brewer("", palette="Dark2") +
  some_nice_plot_styling
```

Based on this mapping, what would be considered null (not interesting)?
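
One possible answer, sketched in base R: with `x` mapped to both position and colour, a null plot is one where `y` has no relationship with the group label `x`, and such data can be generated by shuffling `y`. The data frame `d`, its simulated values, and the helper `permute_y` are all hypothetical, made up only for illustration.

```r
# Hypothetical two-group data, simulated only for illustration
set.seed(2021)
d <- data.frame(
  x = rep(c("A", "B"), each = 5),
  y = c(rnorm(5, mean = -2), rnorm(5, mean = 2))
)

# A null data set: shuffle y so that any link between x and y is broken
permute_y <- function(df) {
  df$y <- sample(df$y)
  df
}

null_d <- permute_y(d)

# Group labels and the set of y values are unchanged; only the pairing differs
print(aggregate(y ~ x, data = d, FUN = mean))       # observed group means
print(aggregate(y ~ x, data = null_d, FUN = mean))  # group means in one null draw
```

Each call to `permute_y(d)` gives another null data set, which can be drawn with the same plot definition as the data.
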
---

# Procedure (3/5)

Add data or null

```r
*data %>%
  ggplot(aes(x = x, y = y, colour = x)) +
  geom_point() +
  stat_summary(fun = "mean", shape = 4, size=1) +
  scale_colour_brewer("", palette="Dark2") +
  some_nice_plot_styling
```

---

# Procedure (4/5)

Hide the data among a field of nulls

```r
*library(nullabor)
*lineup(null_permute("y"), n=10,
*       true = data) %>%
  ggplot(aes(x = x, y = y, colour = x)) +
  geom_point() +
  stat_summary(fun = "mean", shape = 4, size=1) +
  scale_colour_brewer("", palette="Dark2") +
* facet_wrap(~.sample) +
  some_nice_plot_styling
```

---

# Procedure (5/5)

Ask uninvolved, independent observers to pick the plot in the lineup that is most different from the others.

Under the null hypothesis, we would expect the chance that any observer chooses the data plot to be 1/10 (or `\(1/m\)` generally, where `\(m\)` is the number of plots in the lineup).

Suppose you had 23 observers, and 8 of them chose the data plot as the most different.

```r
pvisual(x=8, K=23, m=10)
```

```
##      x simulated       binom
## [1,] 8    0.0099 0.001229827
```

---
class: inverse middle center

# How do we know it works?

---
background-color: white

.pull-left[

# Validation experiment

[Majumder et al (2013)](https://www.tandfonline.com/doi/abs/10.1080/01621459.2013.808157) conducted a validation study comparing the performance of the lineup protocol, assessed by human evaluators, with the classical test, using subjects recruited through Amazon's Mechanical Turk.

.tiny[http://datascience.unomaha.edu/turk/exp2/index.html]

]

.pull-right[

Power analysis of human evaluation relative to classical test.

<center><img src="images/majumder1.png" width="65%" /> <img src="images/majumder2.png" width="25%" /> </center>

]

---
class: inverse middle center

# How can it be used to compare plot designs?
---
background-color: white

.pull-left[

# Power to compare plot design

[Hofmann et al (2012)](https://ieeexplore.ieee.org/document/6327249) show how power, interpreted as the .yellow[proportion of observers who detect] the data plot, can be compared across plot designs to establish which is better.

The paper also describes how to make the power calculations and generate confidence intervals for the power of different designs.

]

.pull-right[

<center><img src="images/eucline.png" width="90%" /> <br> <img src="images/polar.png" width="90%" /> <br> <img src="images/power.jpg" width="60%" /> </center>

]

---
class: inverse middle center

# Back to the maps experiment

---
background-color: white

# Thyroid cancer incidence in Australia

<center>
<img src="images/choro-1.png" style="width: 45%; align: center" />
<img src="images/fullhexmap-1.png" style="width: 45%; align: center" />
</center>

---

# Experimental design

.yellow[Factor]: choropleth, hexagon tile map<br>
.yellow[Structure]: NW to SE trend, hotspot in 3 cities, hotspot in all cities <br><br>
.yellow[Null]: Simulation from a variogram model, to capture spatial dependence, using the `gstat` package (144 null sets) <br>
.yellow[Replicates]: Four <br>
.yellow[Lineups]: 12 plots in a lineup <br>
.yellow[Data plot]: Trend added to null, for one plot in a lineup<br><br>
.yellow[Displays]: Two sets, with coin flip for data displayed as choropleth or hexagon tile map, sets A and B <br>
.yellow[Subjects]: 42 subjects for set A, and 53 for set B

---
background-color: white
background-image: url(images/detect-compare-1.jpg)
background-size: 68%
background-position: 80% 15%

# Results

---
background-image: url(http://dicook.github.io/nullabor/reference/figures/nullabor_hex.png)
background-position: 70% 1%
background-size: 15%

# How to do this yourself

Get a copy of the `nullabor` package

```r
install.packages("nullabor")
```

or

```r
# install.packages("remotes")
remotes::install_github("dicook/nullabor")
```

Look at the "Get started" documentation at
http://dicook.github.io/nullabor/index.html

---
class: inverse middle center

# Statistical inference architecture is built to measure uncertainty

---
background-color: white
background-image: url(images/vis_inf.png)
background-size: 95%
background-position: 100% 1%

Statistical <br> inference <br> architecture

---
background-color: white

# Group A1 Data vis challenge

Which display of uncertainty is better?

<img src="slides_files/figure-html/unnamed-chunk-13-1.png" width="100%" />

---

# Thanks for listening!

Here's what I hope you take away from this talk:

- Uncertainty means what might change if you had a different sample
- This is what statistical inference addresses
- Plots can be embedded into an inferential framework
- Crowd-sourcing can help with conducting inference with plots
- Visual inference might be helpful to test your new ideas for adding indications of uncertainty into your displays

---

### Additional reading

.tiny[
^ Buja et al (2009) Statistical Inference for Exploratory Data Analysis and Model Diagnostics, RSPT A <br>
^ Wickham et al (2010) Graphical Inference for Infovis, TVCG <br>
^ Hofmann et al (2012) Graphical Tests for Power Comparison of Competing Designs, TVCG <br>
^ Majumder et al (2013) Validation of Visual Statistical Inference, Applied to Linear Models, JASA <br>
^ Yin et al (2013) Visual Mining Methods for RNA-Seq data: Examining Data structure, Understanding Dispersion estimation and Significance Testing, JDMGP <br>
^ Zhao et al (2014) Mind Reading: Using An Eye-tracker To See How People Are Looking At Lineups, IJITA <br>
^ Lin et al (2015) Does host-plant diversity explain species richness in insects?
Ecological Entomology <br>
^ Roy Chowdhury et al (2015) Using Visual Statistical Inference to Better Understand Random Class Separations in High Dimension, Low Sample Size Data, CS <br>
^ Loy et al (2017) Model Choice and Diagnostics for Linear Mixed-Effects Models Using Statistics on Street Corners, JCGS <br>
^ Roy Chowdhury et al (2018) Measuring Lineup Difficulty By Matching Distance Metrics with Subject Choices in Crowd-Sourced Data, JCGS <br>
^ Vanderplas et al (2020) Testing Statistical Charts: What Makes a Good Graph? ARSIA <br>
^ Vanderplas et al (2021) Statistical significance calculations for scenarios in visual inference. Stat. <br>
]

---
class: inverse middle
background-image: url(images/wattle_bee.jpg)
background-position: 99% 1%
background-size: 25%

# Acknowledgements

Slides created via the R package [**xaringan**](https://github.com/yihui/xaringan), with **wattle theme** created from [xaringanthemer](https://github.com/gadenbuie/xaringanthemer).

The chakra comes from [remark.js](https://remarkjs.com), [**knitr**](http://yihui.name/knitr), and [R Markdown](https://rmarkdown.rstudio.com).

Slides are available at [https://dicook.org/files/Rostock2021/slides.html](https://dicook.org/files/Rostock2021/slides.html) and supporting files at [https://github.com/dicook/Rostock2021](https://github.com/dicook/Rostock2021).

<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.

.footnote[Image credit: Di Cook, 2018]