7 Uncertainty

By the end of this chapter you should gain the following knowledge and practical skills.

Knowledge

Appreciate the main challenges and objectives of uncertainty representation.
Learn how visualization techniques can be used to support ‘frequency framing’.
Understand how parameter uncertainty due to random fluctuation can be estimated computationally.

Practical skills

Generate estimates of parameter uncertainty using bootstrap resampling.
Apply functional-style programming for working over bootstrap resamples.
Write ggplot2 code to create uncertainty visualizations: icon arrays, risk theatres, gradient bars, ensemble and hypothetical outcome plots.

7.1 Introduction

Uncertainty is a key preoccupation of those working in statistics and data analysis. A lot of time is spent providing estimates for it, reasoning about it and trying to take it into account when making evidence-based claims and decisions. There are many ways in which uncertainty can enter a data analysis and many ways in which it can be conceptually represented. This chapter focuses mainly on parameter uncertainty: quantifying and conveying the different possible values that a quantity of interest might take. It is straightforward to imagine how visualization can support this. We can use data graphics to represent different values and give greater emphasis to those for which we have more certainty – to communicate or imply levels of uncertainty in the background. Such representations are nevertheless quite challenging to execute. In Chapter 3 we learnt that there is often a gap between the visual encoding of data and its perception. There is a tendency in standard data graphics to imbue data with marks that over-imply precision. We will consider research in Cartography and Information Visualization on uncertainty representation, before exploring and applying techniques for visually encoding parameter uncertainty. We will do so using STATS19 road safety data, exploring how injury severity rates in pedestrian-vehicle crashes vary over time and by geographic area.

7.2 Concepts

7.2.1 Uncertainty visualization

Cartographers and Information Visualization researchers have been concerned for some time with visual variables, or visual channels (Munzner 2014), that might be used to encode uncertainty information. Figure 7.1 displays several of these. Ideally, visual variables should be intuitive, logically related to notions of precision and accuracy, while also allowing sufficient discriminative power when deployed in data dense visualizations.

Figure 7.1: Visual variables that can be used to represent levels of uncertainty information. Sketchy rendering is generated with `Rough.js`, an implementation of the work published in Wood et al. (2012).

Kinkeldey, MacEachren, and Schiewe (2014) provide an overview of empirical research into the effectiveness of proposed visual variables against these criteria. As intuitive signifiers of uncertainty, or lack of precision, fuzziness (not encoded in Figure 7.1) and location have been shown to work well. Slightly less intuitive, but nevertheless successful in terms of discrimination, are size, transparency and colour value. Sketchiness is another intuitive signifier proposed in Boukhelifa et al. (2012). As with many visual variables, sketchiness is probably best considered as an ordinal visual variable to the extent that there is a limited range of sketchiness levels that can be discriminated. An additional feature of sketchiness is its sense of informality. This may be desirable in certain contexts, less so in others (see Wood et al. 2012 for further discussion).

When thinking about uncertainty visualization, a key guideline is that:

Things that are not precise should not be encoded with symbols that look precise.

Much discussed in recent literature on uncertainty visualization (e.g. Padilla, Kay, and Hullman 2021) is the US National Weather Service’s (NWS) (NHC 2023) cone graphic (Figure 7.2 (a)). The cone starts at the storm’s current location and spreads out to represent the modelled projected path of the storm. The main problem is that the cone implies the storm is expanding as it moves away from its current location, when this is not the case. In fact there is more uncertainty in the areas that could be affected by the storm the further away those areas are from the storm’s current location. The second problem is that the cone uses strong lines that imply precision. The temptation is to think that anything contained by the cone is unsafe and anything outside of it is safe. This is of course not what is suggested by the model. Rather, the model suggests only that areas not contained by the cone are beyond some chosen threshold probability. You will notice that the graphic in Figure 7.2 (a) is annotated with a guidance note to discourage such false interpretation.

(a) National Hurricane Center cone design.

In Van Goethem et al.’s (2014) redesign, colour value is used to represent four binned categories of storm probability suggested by the model. Greater visual saliency is therefore conferred to locations where there is greater certainty. The state boundaries are also encoded somewhat differently. In Figure 7.2 (b) US states are symbolised using a single line generated via curve schematisation (Van Goethem et al. 2014). The thinking here is that hard lines in maps tend to induce binary judgements. If the cone is close to but not overlapping a state boundary, for example, should a state’s authorities prepare and organise a response any differently from a state whose boundary very slightly overlaps the cone? Context around states is therefore introduced in the redesign, but in a way that discourages binary thinking; precise inferences of location are not possible as the state areas and borders are very obviously not exact.

7.2.2 Frequency framing

For practical reasons the rest of the chapter considers how these general principles for uncertainty representation might be applied to a single aspect of uncertainty: quantifiable parameter uncertainty. Parameters of interest are often probabilities or relative frequencies – ratios and percentages describing the probability of some event happening. It is notoriously difficult to develop intuition around these sorts of relative frequencies, and so data graphics can usefully support their interpretation.

In our STATS19 road crash dataset, a parameter of interest is the pedestrian injury severity rate, or the proportion of all pedestrian crashes that result in serious or fatal injury (KSI). We might wish to compare the injury severity rate of crashes taking place between two local authority areas, say Bristol and Sheffield. There is in fact quite a difference in the injury severity rate between these two local authorities. In 2019, 35 out of 228 reported crashes (15%) in Bristol were KSI, while for Sheffield this figure was 124 out of 248 reported crashes (50%). This feels like quite a large difference, but it is difficult to imagine or experience these differences in probabilities when written down or encoded visually using relative bar length in standardised bar charts.
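As a quick check on these figures, the two rates can be recomputed from the counts quoted above. This is a minimal sketch in base R; the object names (crashes, rate_ratio) are illustrative and not part of the chapter's template.

```r
# Counts quoted in the text for 2019: KSI crashes and all reported crashes.
crashes <- data.frame(
  la = c("Bristol", "Sheffield"),
  ksi = c(35, 124),
  total = c(228, 248)
)

# KSI rate: the proportion of reported pedestrian crashes that were KSI.
crashes$ksi_rate <- crashes$ksi / crashes$total

# How many times larger is the Sheffield rate than the Bristol rate?
rate_ratio <- crashes$ksi_rate[2] / crashes$ksi_rate[1]

print(round(crashes$ksi_rate, 2))
print(round(rate_ratio, 1))
```

On these counts the Sheffield rate is a little over three times the Bristol rate, which is hard to sense from the two percentages alone.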

Icon arrays are used in public health communication and have been demonstrated to be effective at communicating probabilities of event outcomes. They offload the thinking that happens when evaluating ratios. The icon arrays in Figure 7.3 communicate the two injury severity rates for Bristol and Sheffield. Each crash is a square, and crashes are coloured according to whether they resulted in a serious injury or fatality (dark red) or slight injury (light red). In the bottom row, cells are given a non-random ordering to effect something similar to a standardised bar chart. While the standardised bars enable the two recorded proportions to be “read-off” (15% and 50% KSI), the random arrangement of cells in the icon array perhaps builds intuition around the differences in probabilities of a pedestrian crash resulting in serious injury.

There are compelling examples of icon arrays being used in data journalism, most obviously to communicate outcome probabilities in political polling. You might remember that at the time of the 2016 US Presidential election there was much criticism levelled at pollsters, even the excellent FiveThirtyEight (Silver 2016), for not correctly calling the result. Huffpost gave Trump a 2% chance of winning the election, The New York Times 15% and FiveThirtyEight 28%. Clearly the Huffpost estimate was really quite off, but thinking about FiveThirtyEight’s prediction, how surprised should we be if an outcome that is predicted to happen with a probability of almost a third, does in fact occur?

Figure 7.4: Risk theatre of different election eve forecasts, reimplemented in ggplot2 but based on data graphics appearing in Gross (2016).

The risk theatre (Figure 7.4) is a variant of an icon array. In this case it represents polling probabilities as seats of a theatre – a dark seat represents a Trump victory. If you imagine buying a theatre ticket and being randomly allocated to a seat, how confident would you be about not sitting in a “Trump” seat in the FiveThirtyEight image? The distribution of dark seats suggests that the 28% risk of a Trump victory according to the model is not negligible.

7.2.3 Quantifying uncertainty in frequencies

In the icon arrays above we made little of the fact that the sample size varies between the two recorded crash rates. This was because the differences were in fact reasonably small. When looking at injury severity rates across all local authorities in the country, however, there is substantial variation in the rates and sample sizes. Bromsgrove has a very low injury severity rate based on a small sample size (4%, or one out of 27 crashes resulting in KSI); Cotswold has a very high injury severity rate based on a small sample size (75%, or 14 out of 19 crashes resulting in KSI). With some prior knowledge of these areas one might expect the difference in KSI rates to be in this direction, but would we expect the difference to be of this order of magnitude? Just three more KSIs recorded in Bromsgrove takes its KSI rate up to that of Bristol’s.
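The sensitivity of small-sample rates can be made concrete with a little arithmetic. The sketch below uses the Bromsgrove and Sheffield counts quoted in the text and a hypothetical scenario in which three more crashes are recorded as KSI in each area, with the total number of crashes held fixed.

```r
# Counts quoted in the text: KSI crashes and all reported crashes.
ksi <- c(Bromsgrove = 1, Sheffield = 124)
total <- c(Bromsgrove = 27, Sheffield = 248)

observed <- ksi / total

# Hypothetical scenario: three additional crashes recorded as KSI,
# with the total number of crashes unchanged.
shifted <- (ksi + 3) / total

# Rates as percentages, before and after.
print(round(100 * cbind(observed, shifted), 1))
```

The same three crashes move Bromsgrove's rate by around eleven percentage points but Sheffield's by just over one, which is why the sample size behind each rate matters.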

Although STATS19 is a population dataset to the extent that it contains data on every crash recorded by the police, it makes sense that the more data on which our KSI rates are based, the more certainty we have in them being reliable estimates of injury severity – ones that might be used to predict injury severity in future years. So we can treat our observed injury severity (KSI) rates as being derived from samples of an (unobtainable) population. Our calculated KSI rates are estimates of the parameters that describe this population.

Although this formulation might seem unnecessary, from here we can apply some statistical concepts to quantify uncertainty around our KSI rates. We assume:

The variable of interest, KSI rate, has an unobtainable population mean and standard deviation.
That our data are one sample from this unobtainable population, but other samples could be drawn that will result in different outcomes, estimated KSI rates, simply by chance.
From any sample that is drawn we can calculate a mean and standard deviation in KSI rates.
And so we can derive a sampling distribution and obtain an array of estimated KSI rates and other parameters from resampling many times.
This sampling distribution could then be used to quantify how precise are our estimates of KSI rate. Generally the larger the sample, the narrower the sampling distribution and so the more precise, the less uncertain, the estimate.

In Chapter 6 we used Confidence Intervals to estimate the uncertainty around regression coefficients. From early stats courses you might have learnt how Confidence Intervals can be calculated using statistical theory, but we can derive them empirically via bootstrapping – the process enumerated above. So a bootstrap resample involves taking a random sample with replacement from the original data and of the same size as the original data. From this resample a parameter estimate can be derived, in this case the KSI rate. And this process can be repeated many times to generate an empirical sampling distribution for the parameter. The standard error can be calculated from the standard deviation of the sampling distribution. This non-parametric bootstrapping approach is especially useful in exploratory analysis (Beecham and Lovelace 2023): it can be applied to many sample statistics, makes no distributional assumptions and can work on quite complicated sampling designs.
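The bootstrap procedure described above can be sketched in a few lines of base R before turning to the tidymodels implementation in Section 7.3. The example assumes only the Bristol counts quoted earlier (35 KSI out of 228 crashes); the seed and object names are arbitrary choices for illustration.

```r
set.seed(7)  # arbitrary seed, only so that the sketch is repeatable

# Observed data: one logical value per crash, built from the Bristol counts.
is_ksi <- rep(c(TRUE, FALSE), times = c(35, 228 - 35))

n_resamples <- 1000

# Each resample is drawn with replacement and is the same size as the
# original data; the parameter estimate (KSI rate) is then recomputed.
boot_rates <- replicate(
  n_resamples,
  mean(sample(is_ksi, size = length(is_ksi), replace = TRUE))
)

# Bootstrap standard error: standard deviation of the sampling distribution.
boot_se <- sd(boot_rates)

# Textbook standard error of a proportion, for comparison.
theory_se <- sqrt(mean(is_ksi) * (1 - mean(is_ksi)) / length(is_ksi))

print(c(bootstrap = boot_se, theory = theory_se))
```

For a simple proportion like this the two standard errors should be close. The appeal of the bootstrap is that the same few lines work for statistics where no convenient formula exists.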

Presented in Figure 7.5 are KSI rates with error bars used to display 95% Confidence Intervals generated from a bootstrap procedure in which 1000 resamples were taken with replacement. Lower and upper limits were taken from the .025 and .975 quantiles of the bootstrap sampling distribution. Assuming that the observed data are drawn from a wider (unobtainable) population, the 95% Confidence Intervals demonstrate that while Cotswold recorded a very large KSI rate, sampling variation means that this figure could be much lower (or higher). For Bristol and Sheffield, where our KSI rate is derived from more data, the range of plausible values that the KSI rate might take due to sampling variation is much smaller – there is less uncertainty associated with their KSI rates.
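A rough base R sketch of how such percentile intervals might be derived is given below, using the four sets of counts quoted in this chapter. The helper boot_ci() is hypothetical and written for illustration; it is not the code used to produce Figure 7.5, and exact limits will vary with the random seed.

```r
set.seed(7)  # arbitrary seed for repeatability

# Counts quoted in the text: KSI crashes and all reported crashes in 2019.
counts <- data.frame(
  la = c("Bromsgrove", "Bristol", "Sheffield", "Cotswold"),
  ksi = c(1, 35, 124, 14),
  total = c(27, 228, 248, 19)
)

# Hypothetical helper: percentile bootstrap interval for a KSI rate.
boot_ci <- function(ksi, total, times = 1000) {
  is_ksi <- rep(c(TRUE, FALSE), times = c(ksi, total - ksi))
  rates <- replicate(times, mean(sample(is_ksi, replace = TRUE)))
  quantile(rates, probs = c(.025, .975), names = FALSE)
}

# One row of limits per local authority.
ci <- t(mapply(boot_ci, counts$ksi, counts$total))
counts$ksi_rate <- counts$ksi / counts$total
counts$lower <- ci[, 1]
counts$upper <- ci[, 2]
counts$width <- counts$upper - counts$lower

print(counts)
```

The interval widths make the pattern in the figure explicit: the areas with few crashes (Cotswold, Bromsgrove) have much wider intervals relative to their sample size than Bristol and Sheffield.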

Figure 7.5: KSI rates for pedestrian-vehicle crashes in selected local authorities with bootstrapped CIs (derived from 1000 resamples).

7.2.4 Visualizing uncertainty in frequencies

Error bars, like those in Figure 7.5, are a space-efficient way of conveying parameter uncertainty. However, remembering our main guideline for uncertainty visualization – that things that are not precise should not be encoded with symbols that look precise – they do have problems. The hard borders can lead to binary or categorical thinking (see Correll and Gleicher 2014). Certain values within a Confidence Interval are more probable than others, and so we should endeavour to use a visual encoding that reflects this. Matt Kay’s excellent ggdist package (Kay 2024) extends ggplot2 with a range of chart types for representing these sorts of intervals. In Figure 7.6 error bars are replaced by half eye plots and gradient bars, which give greater visual saliency to values of KSI that are more likely.

Figure 7.6: KSI rates for pedestrian-vehicle crashes in selected local authorities, with bootstrapped uncertainty estimates.

STATS19 road crash data are released annually. Given the wide uncertainty bands for some local authorities, it might be instructive to explore the stability of KSI rates year-on-year. In Figure 7.7 these KSI rates are represented with a bold line, and the faint lines are superimposed bootstrap resamples. The lines demonstrate volatility in the KSI rates for Cotswold and Bromsgrove due to small numbers. The observed increase in KSI rates for Sheffield since 2015 does appear to be a genuine one, although it may also be affected by uncertainty around data collection and how reliably injury severity is recorded in the dataset.

Figure 7.7: Year-on-year KSI rates for pedestrian-vehicle crashes in selected local authorities, with bootstrap resamples superimposed.

The superimposed lines in the figure above are a form of ensemble visualization. An alternative approach might have been to animate over the bootstrap resamples to generate a Hypothetical Outcome Plot (HOP) (Hullman, Resnick, and Adar 2015). HOPs convey a sense of uncertainty by animating over random draws of a distribution. As there is no single outcome to anchor to, HOPs force viewers to account for uncertainty, recognising that some less probable outcomes may also be possible – essentially to think distributionally.

Figure 7.8: Frames from hypothetical outcome plot of year-on-year KSI rates for pedestrian-vehicle crashes.

7.2.5 Multiple comparisons

In road safety monitoring, a common ambition is to compare crash rates across local authorities. This is in order to make inferences around patterns of high and low injury severity rate. We might represent injury severity rates as Risk Ratios (RR) comparing the observed injury severity rate in each local authority to a benchmark, say the injury severity rate we would expect to see nationally. RRs are an intuitive measure of effect size: RRs >1.0 indicate that the injury severity rate is greater than the national average; RRs <1.0 that it is less than the national average. As they are a ratio of ratios, and therefore agnostic to sample size, RRs can nevertheless be unreliable. Two ratios might be compared that have very different sample sizes, and no compensation is made for the one that contains more data.
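To make the calculation concrete, the sketch below computes Risk Ratios for the four local authorities discussed earlier. The national average is not quoted in this chapter, so as a loudly labelled assumption the benchmark here is the rate pooled across just these four areas; it stands in for, and is not, the national rate.

```r
# Counts quoted in the text: KSI crashes and all reported crashes in 2019.
counts <- data.frame(
  la = c("Bromsgrove", "Bristol", "Sheffield", "Cotswold"),
  ksi = c(1, 35, 124, 14),
  total = c(27, 228, 248, 19)
)

# ASSUMPTION: a stand-in benchmark pooled over these four areas only,
# used in place of the national average KSI rate.
benchmark <- sum(counts$ksi) / sum(counts$total)

# Risk Ratio: observed rate in each area relative to the benchmark rate.
counts$ksi_rate <- counts$ksi / counts$total
counts$rr <- counts$ksi_rate / benchmark

print(counts[order(counts$rr), c("la", "total", "rr")])
```

The most extreme ratios at both ends belong to the two areas with the fewest crashes, which is the unreliability the paragraph above describes: nothing in the ratio itself signals how little data sits behind it.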

We can use quantitative measures to adjust for this. In the example in Figure 7.9 we use hierarchical modelling to shrink local authority KSI rates towards the global mean (national average KSI rate) where they are based on small numbers of observations (see Beecham and Lovelace 2023). From here our effect sizes, called Bayesian Risk Ratios, are sensitive to uncertainty since they are made more conservative where they are based on fewer observations. The Bayesian Risk Ratio for each local authority is represented with a | icon: angled to the right / where the KSI rate is greater than expected, to the left \ where it is less than expected. Additionally, we use bootstrap resampling to derive confidence intervals for our Bayesian RRs. If this interval does not cross 1.0, the RR is judged statistically significant and is coloured according to whether estimated RRs are above (/) or below (\) expectation.
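The effect of shrinkage can be illustrated without fitting a full hierarchical model. The sketch below is a deliberately simplified stand-in, not the model used for Figure 7.9: each observed rate is blended with a benchmark rate, with the benchmark given the weight of a fixed number of pseudo-crashes. Both the pooled four-area benchmark and the prior_weight of 50 are assumptions chosen for illustration.

```r
# Counts quoted in the text: KSI crashes and all reported crashes in 2019.
counts <- data.frame(
  la = c("Bromsgrove", "Bristol", "Sheffield", "Cotswold"),
  ksi = c(1, 35, 124, 14),
  total = c(27, 228, 248, 19)
)

# ASSUMPTION: benchmark pooled over these four areas (not the national rate).
benchmark <- sum(counts$ksi) / sum(counts$total)

# ASSUMPTION: the benchmark is given the weight of 50 pseudo-crashes.
prior_weight <- 50

counts$raw_rr <- (counts$ksi / counts$total) / benchmark

# Blend observed counts with pseudo-crashes at the benchmark rate: areas
# with few crashes are pulled furthest towards the benchmark.
counts$shrunk_rate <- (counts$ksi + prior_weight * benchmark) /
  (counts$total + prior_weight)
counts$shrunk_rr <- counts$shrunk_rate / benchmark

# How far did each Risk Ratio move?
counts$shift <- abs(counts$raw_rr - counts$shrunk_rr)

print(counts[, c("la", "total", "raw_rr", "shrunk_rr", "shift")])
```

Every ratio moves towards 1.0, but Cotswold and Bromsgrove move much further than Bristol and Sheffield, mirroring the behaviour described above for estimates based on fewer observations.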

Figure 7.9: Bayesian Risk Ratios comparing pedestrian injury severity rates in English local authorities, coloured according to ‘statistical significance’: that is, whether the bootstrap confidence interval excludes 1.0.

From Figure 7.9 we make inferences around concentrations of high and low injury severity rate (in the annotations). A problem with this approach, and explicitly encoding ‘statistical significance’ values, is one familiar to statisticians but that is rarely addressed in visual data analysis: the multiple comparison problem. Whenever a statistical signal is identified, there is a chance that the result observed is in fact a false alarm. In the plot above which uses a 95% confidence level, the “false positive rate” is expected to be 5% or 1/20. When many tests are considered simultaneously, as in Figure 7.9, the number of these false alarms begins to accumulate. There are corrections that can be used to address this: test statistics can be adjusted and made more conservative. But these corrections have consequences. Too severe a correction can result in statistical tests that are underpowered and result in an elevated false negative rate, where a statistical test fails to detect an effect that truly exists. See Brunsdon and Charlton (2011) for an interesting discussion in the context of mapping crime rates.
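The way false alarms accumulate can be shown with some simple arithmetic. The sketch below assumes the tests are independent (which neighbouring areas on a map generally are not) and uses illustrative numbers of tests rather than the actual number of local authorities. The Bonferroni adjustment is included as one well-known example of a conservative correction, not as the approach taken in this chapter.

```r
alpha <- 0.05

# Illustrative numbers of simultaneous tests (not taken from the chapter).
n_tests <- c(1, 20, 100, 300)

# Expected number of false alarms if no true effects exist.
expected_false_alarms <- alpha * n_tests

# Probability of at least one false alarm, ASSUMING independent tests.
p_at_least_one <- 1 - (1 - alpha)^n_tests

# Bonferroni: test each comparison at alpha / n_tests instead.
bonferroni_alpha <- alpha / n_tests

print(data.frame(
  n_tests,
  expected_false_alarms,
  p_at_least_one = round(p_at_least_one, 3),
  bonferroni_alpha
))
```

With 20 tests we already expect one false alarm, and the per-test threshold under Bonferroni shrinks quickly as tests are added, which is where the loss of statistical power discussed above comes from.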

So there is no single solution to multiple testing, which happens often in visual data analysis, especially in Geography, where health and other outcomes are mapped visually. It is actually less of a problem in Figure 7.9 since our RRs are derived from a multilevel model in which estimates are partially pooled, or shrunk, to reflect the level of information we have (Gelman, Hill, and Yajima 2012). Presenting the RRs in their spatial context, and providing full information around RRs that are not significant (the oriented lines), also supports informal calibration. For example, depending on the phenomenon, we may wish to attach more certainty to RRs that are labelled statistically significant and whose direction is consistent with their neighbours than those that are exceptional from their neighbours. Additionally, constructing a graphical line-up test (Wickham et al. 2010) allows us to explore whether the sorts of spatial patterns in RR values in the observed data are genuine or might appear in random decoy maps. Although informal, this sort of visual test approximates to the type of question that transport analysts may ask when identifying priority areas for road safety intervention (Beecham and Lovelace 2023).

Task

Watch Matt Kay’s excellent talk to BostonCHI, Uncertainty Visualization as a Moral Imperative:

https://www.youtube.com/watch?v=mfQ3QVyw4N0

And Robert Kosara’s talk, Presentation and Audience, as part of his Advanced Visualization course for Observable, from 43:43 minutes in:

https://www.youtube.com/watch?v=Wb6xKQRtWig

7.3 Techniques

The technical element demonstrates how some of the uncertainty estimate examples in the chapter can be reproduced. We will again make use of functional programming approaches via the purrr package, mostly for generating and working over bootstrap resamples.

7.3.1 Import

Download the 07-template.qmd¹ file for this chapter and save it to your vis4sds project.
Open your vis4sds project in RStudio, and load the template file by clicking File > Open File ... > 07-template.qmd.

The template file lists the required packages: tidyverse, sf, tidymodels (for working with the bootstraps), ggdist and distributional for generating plots of parameter uncertainty and gganimate for the hypothetical outcome plot. Code for loading the STATS19 pedestrian crash data is in the 07-template.qmd file.

7.3.2 Plot icon arrays

Icon arrays can be generated reasonably easily in standard ggplot2 using geom_tile() and some data generation functions. The most straightforward approach is to place icon arrays in a regularly-sized grid. In the example, KSI rates in Fareham (41%) and Oxford (17%) are compared.

Figure 7.10: Icon arrays of pedestrian-vehicle crashes.

First we generate the array data: a data frame of array locations (candidate crashes) with values representing whether the crash is slight or KSI depending on the observed KSI rate. In the code below, we set up a 10x10 grid of row and column locations and populate these with values for the selected local authorities (Oxford and Fareham) using base R’s sample() function.

array_data <- tibble(
  row=rep(1:10, times=1, each=10),
  col=rep(1:10, times=10, each=1),
  Oxford=
    sample(
    c(rep(TRUE, times=1, each=17), rep(FALSE, times=1, each=83)),
    size=100, replace=FALSE),
  Fareham=
    sample(
    c(rep(TRUE, times=1, each=41), rep(FALSE, times=1, each=59)),
    size=100, replace=FALSE)
)

The plot code is straightforward:

array_data |>
  pivot_longer(
    cols=c(Oxford,Fareham), names_to="la", values_to="is_ksi"
    ) |>
  ggplot(aes(x=row,y=col, fill=is_ksi)) +
  geom_tile(colour="#ffffff", linewidth=1) +
  scale_fill_manual(values=c("#fee0d2","#de2d26"), guide="none") +
  facet_wrap(~la)

Plot specification:

Data: The array data, with pivot_longer() so that we can facet by local authority.
Encoding: x- and y-position according to the array locations and filled on whether the sampled crash is KSI or slight.
Marks: geom_tile() for drawing square icons.
Scale: scale_fill_manual() is supplied with values that are dark (KSI) and light (slight) red.
Facets: facet_wrap() for faceting on local authority.
Setting: Tiles are given large, white borders (geom_tile(colour="#ffffff", linewidth=1)).
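If icon arrays are needed for several rates, the data generation step could be wrapped in a small function. The helper below, make_icon_array(), is hypothetical and not part of the chapter's template. It uses base R data frames, which ggplot2 accepts in the same way as tibbles, and sets a seed so that the random cell arrangement is repeatable.

```r
# Hypothetical helper: an n_row x n_col grid of cells, with a share of
# cells equal to ksi_rate flagged as KSI and placed at random.
make_icon_array <- function(ksi_rate, n_row = 10, n_col = 10) {
  n_cells <- n_row * n_col
  n_ksi <- round(ksi_rate * n_cells)
  data.frame(
    row = rep(seq_len(n_row), each = n_col),
    col = rep(seq_len(n_col), times = n_row),
    # Shuffle a vector holding exactly n_ksi TRUE values.
    is_ksi = sample(rep(c(TRUE, FALSE), times = c(n_ksi, n_cells - n_ksi)))
  )
}

set.seed(7)  # arbitrary seed so the arrangement can be reproduced
oxford <- make_icon_array(0.17)
fareham <- make_icon_array(0.41)

# The share of KSI cells matches the rates supplied.
print(c(Oxford = mean(oxford$is_ksi), Fareham = mean(fareham$is_ksi)))
```

Because the cells are shuffled rather than independently drawn, each array contains exactly the intended number of KSI cells; only their positions vary.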

Figure 7.11: Risk theatre for pedestrian-vehicle crashes.

To present the icon array as a risk theatre, we have created a spatial data file (theatre_cells.geojson) containing 1,000 theatre seat positions. To randomly allocate KSIs to seats in proportion to the rate at which those crashes occur, we use the slice_sample() function.

theatre_cells <- st_read(here::here("data", "theatre_cells.geojson"))

ksi_seats <- bind_rows(
  theatre_cells |> slice_sample(n=170) |>
    add_column(la="Oxford\n170 KSI in 1,000 crashes"),
  theatre_cells |> slice_sample(n=410) |>
    add_column(la="Fareham\n410 KSI in 1,000 crashes")
)

The code:

theatre_cells |>
  ggplot() +
  geom_sf() +
  geom_sf(
    data=ksi_seats,
    fill="#000000"
  ) +
  annotate("text", x=23, y=1, label="Stage", alpha=.5) +
  annotate("text", x=23, y=21, label="Orchestra", alpha=.5) +
  annotate("text", x=23, y=31, label="Front mezzanine", alpha=.5) +
  annotate("text", x=23, y=42, label="Rear mezzanine", alpha=.5) +
  facet_wrap(~la)

Plot specification:

Data: theatre_cells contains geometry data for all 1,000 seats; ksi_seats contains the randomly sampled seat locations.
Marks: geom_sf() for drawing seat icons.
Facets: facet_wrap() for faceting on local authority.
Setting: KSI seats are coloured black (fill="#000000"). We also annotate() the blocks of the theatre; the x- and y-placement of these labels was determined via trial-and-error.

7.3.3 Generate bootstrap estimates of parameter uncertainty

The code for generating bootstrap resamples, stored in rate_boots, initially looks formidable. It is a template that is nevertheless quite generalisable, and so once learnt can be extended and applied to suit different use cases.

rate_boots <- ped_veh |>
  mutate(
    is_ksi=accident_severity!="Slight",
    year=lubridate::year(date)
  ) |>
  filter(year==2019,
         local_authority_district %in% c("Bristol, City of",
        "Sheffield", "Bromsgrove", "Cotswold")
  ) |>
  select(local_authority_district, is_ksi) |>
  nest(data=-local_authority_district) |>
  mutate(la_boot=map(data, bootstraps, times=1000, apparent=TRUE)) |>
  select(-data) |>
  unnest(la_boot) |>
  mutate(
    is_ksi=map(splits, ~analysis(.) |>  pull(is_ksi)),
    ksi_rate=map_dbl(is_ksi, ~mean(.x)),
    sample_size=map_dbl(is_ksi, ~length(.x))
  ) |>
  select(-c(splits, is_ksi))

Code description:

Setup: The first mutate() is straightforward – a binary is_ksi variable identifies whether a crash is KSI, and the crash year is extracted from the date variable. The data are then filtered to crashes recorded in 2019 in the four comparator local authorities. To generate bootstrap resamples for each local authority, we nest() on local authority. You will remember that nest() creates a special type of column (a list-column) in which the values of the column are a list of data frames – in this case the crash data for each local authority. So running the code up to and including the nest(), a data frame is returned which contains four rows corresponding to the filtered local authorities and a list-column called data, each element of which is a data frame of varying dimensions (lengths) depending on the number of crashes recorded in each local authority.
Generate bootstrap resamples: In the mutate() that follows, purrr’s map() function is used to iterate over the list of datasets and the bootstraps() function to generate 1,000 bootstrap resamples for each nested dataset. Setting apparent=TRUE adds one further entry, with id "Apparent", that contains the original observed data. The new column, la_boot, is a list-column this time containing a list of bootstrap datasets.
Calculate sample estimates: We unnest() the la_boot column to return a dataset with a row for each bootstrap resample and a list-column named splits which contains the bootstrap data. Again we map() over each element of splits to calculate the ksi_rate for each of the bootstrap datasets. The first call to map() extracts the is_ksi variable; the second is just a convenient way of calculating a rate from this (remembering that is_ksi is a binary variable); the third collects the sample size for each of the bootstraps, which of course is the number of crashes recorded for each local authority.
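It can help to run the same steps on a single, small dataset before applying them to nested data. The sketch below rebuilds the Cotswold data from the counts quoted earlier (14 KSI out of 19 crashes) and uses fewer resamples than the chapter so that it runs quickly. The object names are illustrative; the rsample, dplyr and purrr functions are those used in the template.

```r
library(dplyr)
library(purrr)
library(rsample)

set.seed(7)  # arbitrary seed for repeatability

# One row per crash, built from the Cotswold counts quoted in the text.
cotswold <- tibble(is_ksi = rep(c(TRUE, FALSE), times = c(14, 5)))

cotswold_boots <- bootstraps(cotswold, times = 200, apparent = TRUE) |>
  mutate(
    # analysis() returns the resampled data frame held in each split.
    ksi_rate = map_dbl(splits, ~ mean(analysis(.x)$is_ksi)),
    sample_size = map_int(splits, ~ nrow(analysis(.x)))
  ) |>
  select(-splits)

# 200 resamples plus the extra "Apparent" row for the observed data.
print(nrow(cotswold_boots))
print(filter(cotswold_boots, id == "Apparent"))
```

Every resample has the same size as the original data, and the "Apparent" row reproduces the observed rate exactly. This is the row that the plotting code later filters on.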

7.3.4 Plot parameter estimates with uncertainty information

With ggdist, the code for generating KSI rates with estimates of parameter uncertainty is straightforward and very similar to the error bar plots in the previous chapter.

Plot code:

rate_boots |>
  group_by(local_authority_district) |>
  mutate(std.error=sd(ksi_rate)) |>
  filter(id=="Apparent") |>
  ggplot(
    aes(x=reorder(local_authority_district, ksi_rate), y=ksi_rate)
    ) +
  stat_gradientinterval(
    aes(dist = dist_normal(mu=ksi_rate, sigma=std.error)),
    point_size = 1.5
  ) +
  coord_flip()

Plot specification:

Data: The rate_boots data frame is grouped by local authority and in the mutate() we calculate an estimate of bootstrap standard error, the standard deviation of the sampling distribution. We then filter() to keep only rows where id=="Apparent" – these contain the KSI rate for the observed (not resampled) data.
Encoding: x-position varies according to local authority and y-position according to KSI rate. The estimated KSI rate and bootstrap standard error are also passed to stat_gradientinterval(), the ggdist function for producing gradient plots.
Marks: stat_gradientinterval() for drawing the gradients and point estimates.
Setting: coord_flip() for easy reading of local authority names.
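One thing worth checking with this specification is that dist_normal() describes the uncertainty with a symmetric normal distribution, whereas a KSI rate cannot fall below 0 or above 1. The base R sketch below, an illustration and not part of the template, compares the interval implied by a normal approximation with the percentile interval for Bromsgrove (one KSI out of 27 crashes, as quoted earlier).

```r
set.seed(7)  # arbitrary seed for repeatability

# Bromsgrove counts quoted in the text: 1 KSI out of 27 crashes.
is_ksi <- rep(c(TRUE, FALSE), times = c(1, 26))
boot_rates <- replicate(1000, mean(sample(is_ksi, replace = TRUE)))

# Parameters that would be passed to dist_normal(): the observed rate
# and the bootstrap standard error.
mu <- mean(is_ksi)
sigma <- sd(boot_rates)

# 95% interval implied by the normal approximation.
normal_interval <- qnorm(c(.025, .975), mean = mu, sd = sigma)

# 95% percentile interval taken directly from the resamples.
percentile_interval <- quantile(boot_rates, c(.025, .975), names = FALSE)

print(rbind(normal = normal_interval, percentile = percentile_interval))
```

For rates close to 0 and based on few crashes, the normal approximation extends below zero while the percentile interval cannot. For Bristol and Sheffield the two approaches should give similar answers, so this mainly affects how gradients for the smallest areas are read.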

7.3.5 Ensemble plots and hypothetical outcome plots

To generate bootstrap resamples on local authority and year, necessary for the year-on-year analysis, we can use the same template as that for calculating rate_boots; the only difference is that we select() and nest() on the year as well as the local_authority_district column.

rate_boots_temporal <- ped_veh |>
  ...
  ... |>
  select(local_authority_district, is_ksi, year) |>
  nest(data=-c(local_authority_district, year)) |>
  ...
  ...
  ...

The ensemble plot is again reasonably straightforward:

rate_boots_temporal |>
  ggplot(aes(x=year, y=ksi_rate)) +
  geom_line(data=. %>%  filter(id=="Apparent"), 
    aes(group=id), linewidth=.5) +
  geom_line(
    data=. %>%  filter(id!="Apparent"),
    aes(group=id), alpha=.1, linewidth=.2
    ) +
  facet_wrap(~local_authority_district)

Plot specification:

Data: The rate_boots_temporal data frame. Note that we include two line layers, one with the observed data (data=. %>% filter(id=="Apparent")) and one with the bootstrap data (data=. %>% filter(id!="Apparent")).
Encoding: x-position varies according to year, y-position according to KSI rate.
Marks: geom_line() for drawing lines.
Facets: facet_wrap() for faceting on local authority.
Setting: The bootstrap lines are de-emphasised by giving them very small alpha and line width values.

Finally, the Hypothetical Outcome Plot (HOP) can be created easily using the gganimate package, simply by adding a call to transition_states() at the end of the plot specification:

rate_boots_temporal |>
  filter(id!="Apparent") |>
  ggplot(aes(x=year, y=ksi_rate)) +
  geom_line(aes(group=id), linewidth=.6) +
  facet_wrap(~local_authority_district)+
  transition_states(id, transition_length=0, state_length=1)

7.4 Conclusions

Uncertainty is fundamental to any data analysis. Statisticians and data scientists almost always end up reasoning about uncertainty, developing quantitative estimates of uncertainty and communicating uncertainty so that it can be taken into account when making evidence-based claims and decisions. Through an analysis of injury severity in the STATS19 road crash dataset, this chapter introduced techniques for quantifying and visually representing parameter uncertainty. There has been much activity in the Information Visualization and Data Journalism communities focussed on uncertainty communication – on developing approaches that promote intuition and allow users to experience uncertainty. We have covered some of these and demonstrated how they could be incorporated into our road crash analysis case study.

7.5 Further Reading

An excellent primer on uncertainty visualization:

Padilla, L., Kay, M. and Hullman, J. 2021. “Uncertainty Visualization,” in Wiley StatsRef: Statistics Reference Online, edited by B. Everitt, N. Balakrishnan, T. Colton and J. L. Teugels, Wiley. doi: 10.1002/9781118445112.stat08296.

On visualizing parameter uncertainty:

Correll, M. and Gleicher, M. 2014. “Error Bars Considered Harmful: Exploring Alternate Encodings for Mean and Error,” IEEE Transactions on Visualization and Computer Graphics 20(12): 2142–2151. doi: 10.1109/TVCG.2014.2346298.
Kale, A., Nguyen, F., Kay, M. and Hullman, J. 2019. “Hypothetical Outcome Plots Help Untrained Observers Judge Trends in Ambiguous Data,” IEEE Transactions on Visualization and Computer Graphics, 25(1): 892–902. doi: 10.1109/TVCG.2018.2864909.

On bootstrap resampling with R and tidyverse:

Ismay, C. and Kim, A. 2020. “Statistical Inference via Data Science: A ModernDive into R and the Tidyverse”, New York, NY: CRC Press. doi: 10.1201/9780367409913.
- Chapters 7, 8.

References

BBC Visual and Data Journalism Team. 2019. “BBC Visual and Data Journalism Cookbook for R Graphics.” https://github.com/bbc/rcookbook.

Beecham, R., and R. Lovelace. 2023. “A Framework for Inserting Visually-Supported Inferences into Geographical Analysis Workflow: Application to Road Safety Research.” Geographical Analysis 55: 345–66. https://doi.org/10.1111/gean.12338.

Beecham, R., N. Williams, and L. Comber. 2020. “Regionally-Structured Explanations Behind Area-Level Populism: An Update to Recent Ecological Analyses.” PLOS One 15 (3): e0229974. https://doi.org/10.1371/journal.pone.0229974.

Boukhelifa, N., A. Bezerianos, T. Isenberg, and J. Fekete. 2012. “Evaluating Sketchiness as a Visual Variable for the Depiction of Qualitative Uncertainty.” IEEE Transactions on Visualization and Computer Graphics 18 (12): 2769–78. https://doi.org/10.1109/TVCG.2012.220.

Brunsdon, C., and M. Charlton. 2011. “An Assessment of the Effectiveness of Multiple Hypothesis Testing for Geographical Anomaly Detection.” Environment and Planning B: Planning and Design 38 (2): 216–30. https://doi.org/10.1068/b36093.

Buja, A., D. Cook, H. Hofmann, M. Lawrence, E-K Lee, D. Swayne, and H. Wickham. 2009. “Statistical Inference for Exploratory Data Analysis and Model Diagnostics.” Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 367 (1906): 4361–83. https://doi.org/10.1098/rsta.2009.0120.

Burn-Murdoch, J. 2023. “Making Charts That Make an Impact.” Invited talk, Data Visualization Society’s Outlier Conference. https://www.youtube.com/watch?v=tIbaQUo6H9g&ab_channel=DataVisualizationSociety.

Comber, A., C. Brunsdon, M. Charlton, G. Dong, R. Harris, B. Lu, Y. Lü, et al. 2023. “A Route Map for Successful Applications of Geographically Weighted Regression.” Geographical Analysis 55 (1): 155–78. https://doi.org/10.1111/gean.12316.

Correll, M., and M. Gleicher. 2014. “Error Bars Considered Harmful: Exploring Alternate Encodings for Mean and Error.” IEEE Transactions on Visualization and Computer Graphics 20 (12): 2142–51. https://doi.org/10.1109/TVCG.2014.2346298.

Franconeri, S. L., L. M. Padilla, P. Shah, J. M. Zacks, and J. Hullman. 2021. “The Science of Visual Data Communication: What Works.” Psychological Science in the Public Interest 22 (3): 110–61. https://doi.org/10.1177/15291006211051956.

Gelman, A., J. Hill, and M. Yajima. 2012. “Why We (Usually) Don’t Have to Worry about Multiple Comparisons.” Journal of Research on Educational Effectiveness 5 (2): 189–211. https://doi.org/10.1080/19345747.2011.618213.

Gross, J. 2016. “How to Better Communicate Election Forecasts — in One Simple Chart.” The Washington Post. https://www.washingtonpost.com/news/monkey-cage/wp/2016/11/29/how-to-better-communicate-election-forecasts-in-one-simple-chart/.

Healy, K. 2019. Data Visualization: A Practical Introduction. Princeton, NJ: Princeton University Press. https://socviz.co.

Hullman, J., and A. Gelman. 2021. “Designing for Interactive Exploratory Data Analysis Requires Theories of Graphical Inference.” Harvard Data Science Review 3 (3).

Hullman, J., P. Resnick, and E. Adar. 2015. “Hypothetical Outcome Plots Outperform Error Bars and Violin Plots for Inferences About Reliability of Variable Ordering.” PLOS One 10 (11). https://doi.org/10.1371/journal.pone.0142444.

Ismay, C., and A. Kim. 2020. Statistical Inference via Data Science: A ModernDive into R and the Tidyverse. New York, NY: CRC Press. https://doi.org/10.1201/9780367409913.

Kale, A., F. Nguyen, M. Kay, and J. Hullman. 2019. “Hypothetical Outcome Plots Help Untrained Observers Judge Trends in Ambiguous Data.” IEEE Transactions on Visualization and Computer Graphics 25 (1): 892–902. https://doi.org/10.1109/TVCG.2018.2864909.

Kay, M. 2021. “Uncertainty Visualization as a Moral Imperative.” Invited talk, BostonCHI meeting. https://www.youtube.com/watch?v=mfQ3QVyw4N0&ab_channel=BostonCHI.

Kay, M. 2024. “ggdist: Visualizations of Distributions and Uncertainty in the Grammar of Graphics.” IEEE Transactions on Visualization and Computer Graphics 30 (1): 414–24. https://doi.org/10.1109/TVCG.2023.3327195.

Kinkeldey, C., A. MacEachren, and J. Schiewe. 2014. “How to Assess Visual Communication of Uncertainty? A Systematic Review of Geospatial Uncertainty Visualisation User Studies.” The Cartographic Journal 51 (4): 372–86. https://doi.org/10.1179/1743277414Y.0000000099.

Kosara, R. 2023. “Lesson 4: Presentation, Uncertainty, ISOTYPE.” ObservableHQ Notebook; ObservableHQ. https://observablehq.com/@observablehq/lesson-4-presentation-uncertainty-isotype?collection=@observablehq/advanced-data-vis-course.

Munzner, T. 2014. Visualization Analysis and Design. AK Peters Visualization Series. Boca Raton, FL: CRC Press.

NHC. 2023. “National Hurricane Center and Central Pacific Hurricane Center.” https://www.nhc.noaa.gov/.

Padilla, L., M. Kay, and J. Hullman. 2021. “Uncertainty Visualization.” In Wiley StatsRef: Statistics Reference Online, edited by B. Everitt, N. Balakrishnan, T. Colton, and J. L. Teugels. Wiley. https://doi.org/10.1002/9781118445112.stat08296.

Scherer, C. 2023. “Designing Data Visualizations to Successfully Tell a Story.” Workshop at Posit::conf(2023), Chicago, IL. https://posit-conf-2023.github.io/dataviz-storytelling/.

Silver, N. 2016. “Why FiveThirtyEight Gave Trump a Better Chance Than Almost Anyone Else.” FiveThirtyEight. https://fivethirtyeight.com/features/why-fivethirtyeight-gave-trump-a-better-chance-than-almost-anyone-else.

The Turing Way Community. 2025. “The Turing Way: A Handbook for Reproducible, Ethical and Collaborative Research.” Zenodo. https://doi.org/10.5281/zenodo.15213042.

Van Goethem, A., A. Reimer, B. Speckmann, and J. Wood. 2014. “Stenomaps: Shorthand for Shapes.” IEEE Transactions on Visualization and Computer Graphics 20 (12): 2053–62. https://doi.org/10.1109/TVCG.2014.2346274.

Wickham, H., D. Cook, H. Hofmann, and A. Buja. 2010. “Graphical Inference for Infovis.” IEEE Transactions on Visualization and Computer Graphics 16 (6): 973–79. https://doi.org/10.1109/TVCG.2010.161.

Wolf, L. J., L. Anselin, D. Arribas-Bel, and L. Rivers Mobley. 2021. “On Spatial and Platial Dependence: Examining Shrinkage in Spatially Dependent Multilevel Models.” Annals of the American Association of Geographers 111 (6): 1679–91. https://doi.org/10.1080/24694452.2020.1841602.

Wood, J., P. Isenberg, T. Isenberg, J. Dykes, N. Boukhelifa, and A. Slingsby. 2012. “Sketchy Rendering for Information Visualization.” IEEE Transactions on Visualization and Computer Graphics 18 (12): 2749–58. https://doi.org/10.1109/TVCG.2012.262.

Yang, F., M. Cau, C. Mortenson, H. Fakhari, A. D. Lokmanoglu, J. Hullman, S. Franconeri, N. Diakopoulos, E. C. Nisbet, and M. Kay. 2024. “Swaying the Public? Impacts of Election Forecast Visualizations on Emotion, Trust, and Intention in the 2022 U.S. Midterms.” IEEE Transactions on Visualization and Computer Graphics 30 (1): 23–33. https://doi.org/10.1109/TVCG.2023.3327356.

https://vis4sds.github.io/vis4sds/files/07-template.qmd