Concreteness vs faithfulness in visual summaries
This is Jessica. I recently had a discussion with collaborators that got me thinking about trade-offs we often encounter in summarizing data or predictions. Specifically, how do we weigh the value of deviating from a faithful or accurate representation of how some data was produced in order to make it more interpretable to people? This often comes up as a sort of implicit concern in visualization, when we decide things like whether we should represent probability as frequency to make it more concrete or usable for some inference task. It comes up more explicitly in some other areas like AI/ML interpretability, where people debate the validity of using post-hoc interpretability methods. Thinking it through via a visualization example has made me realize that, at least in visualization research, we still don’t really have a principled foundation for resolving these questions.
My collaborators and I were talking about designing a display for an analysis workflow involving model predictions. We needed to visualize some distributions, so I proposed using discrete representations of distributions, since they have been found to lead to more accurate probability judgments and decisions among non-experts in multiple experiments. By “discrete representations” here I mean things like discretizing a probability density function by taking some predetermined number of draws proportional to the inverse cumulative distribution function and showing them in a static plot (quantile dotplot), or animating draws from the distribution we want to show over time (hypothetical outcome plots), or possibly some hybrid of static and animated. However, one of my collaborators questioned whether it really makes sense to use, for example, a ball swarm style chart if you aren’t using a sampling-based approach to quantify uncertainty.
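To make the quantile dotplot construction concrete, here’s a minimal sketch of the discretization step. The function name and the Normal(100, 15) example distribution are my own illustrative choices, not from any particular visualization library:

```python
import numpy as np
from statistics import NormalDist

def quantile_dotplot_draws(inv_cdf, n=20):
    # Evaluate the inverse CDF (quantile function) at the midpoints
    # of n equal-probability bins, yielding n equally likely "dots"
    # that together discretize the continuous distribution.
    probs = (np.arange(n) + 0.5) / n
    return np.array([inv_cdf(p) for p in probs])

# e.g., a 20-dot discretization of a Normal(100, 15) predictive distribution
dots = quantile_dotplot_draws(NormalDist(mu=100, sigma=15).inv_cdf, n=20)
```

Each dot then represents a 1/n chunk of probability mass, which is what lets readers answer questions like “how many out of 20 outcomes fall below X?” by counting.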
This made me realize how common it is in visualization research to try to separate the visual encoding aspect from the rest of the workflow. We tend to see the question of how to visualize a distribution as mostly independent from how to generate the distribution. So even if we used some analytical method to infer a sampling distribution, the conclusions of visualization research as typically presented would suggest that we should still prefer to visualize it as a set of outcomes sampled from the distribution. We rarely discuss how much the effectiveness of some technique might vary when the underlying uncertainty quantification process is different.
On some level this seems like an obvious blind spot, to separate the visual representation from the underlying process. But I can think of a few reasons why researchers might default to trying to separate encodings from generating processes and not necessarily question doing this. For one, having worked in visualization for years, at least in the case of uncertainty visualization I’ve seen various instances where users of charts seem to be more sensitive to changes to visual cues than they are to changes to descriptions of how some uncertainty quantification was arrived at. This implies that aiming for perfect faithfulness in our descriptions is not necessarily where we want to spend our effort. E.g., change an axis scaling and the effect size judgments you get in response will be different, but modifying the way you describe the uncertainty quantification process alone probably won’t result in much of a change to judgments without some additional change in representation. So the focus naturally goes to trying to “hack” the visual side to get more accurate or better calibrated responses.
I could also see this way of thinking becoming ingrained in part because people who care about interfaces have always had to convince others of the value of what they do through evidence that the representation alone matters. Showing the dependence of good decisions on visualization alone is perceived as sort of a fundamental way to argue that visualization should be taken seriously as a distinct area.
At the same time though, disconnecting visual from process could be criticized for suggesting a certain sloppiness in how we view the function of visualization. Not minding the specific ways that we break the tie between the representation and the process might imply we don’t have a good understanding of the constraints on what we are trying to achieve. Treating the data generating process as a black box is certainly much easier than trying to align the representations to it, so it’s not necessarily surprising that the research community seems to have settled for the former.
Under this view, it becomes research-worthy to point out issues that only really arise because we default to thinking that representation and generation are separate. For example, there’s a well known psych study suggesting we don’t want to visualize continuous data with bar charts because people will think they are seeing discrete groups (and vice versa). It’s kind of weird that we can have these one-off results be taken very seriously, but then not worry so much about mismatch in other contexts, like acknowledging that making some assumptions to compute a confidence interval and then sampling some hypothetical outcomes from that is different from using sampling directly to infer a distribution.
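That last distinction is easy to state in code. Here’s a toy sketch of the two routes to a set of draws over the sample mean (the data, sample sizes, and seed are all illustrative): the first assumes a normal sampling distribution analytically and then manufactures hypothetical outcomes from it; the second actually resamples:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(5.0, 2.0, size=50)  # a hypothetical observed sample

# (a) Analytic route: assume normality, estimate the sampling
# distribution of the mean, then draw hypothetical outcomes from it.
se = data.std(ddof=1) / np.sqrt(len(data))
analytic_draws = rng.normal(data.mean(), se, size=1000)

# (b) Resampling route: bootstrap the mean directly from the data.
boot_draws = np.array([
    rng.choice(data, size=len(data), replace=True).mean()
    for _ in range(1000)
])
```

Either set of draws could feed the same ball swarm or dotplot, and in this toy case they will look nearly identical, but only (b) is mechanistically sampling-based; a reader who takes the dots as evidence of resampling would be wrong about (a).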
I suspect for this particular uncertainty visualization example, the consequences of the visual metaphor not faithfully capturing the underlying distribution generation process are minor relative to the potential benefits of getting people thinking more concretely about the implications of error in the estimate. There’s also a notion of frequency inherent in the conventional construction of confidence intervals, which maybe makes a frequency representation seem less egregiously wrong. Still, there’s the potential for the discrete representation to be read as mechanistic, i.e., as signifying a bootstrap construction process even where it actually doesn’t, which is what my collaborator seemed to be getting at.
But on the other hand, any data visualization is a concretization of something nebulous, i.e., an abstraction encoded in the visual-spatial realm used to represent our knowledge of some real world thing approximated by a measurement process. So one could also point out that it doesn’t really make sense to act as though there are going to be situations where we are free from representational “distortion.”
Anyway, I do think there’s a valid criticism to be made through this example of how research hasn’t really attempted to address these trade-offs directly. Despite all of the time we spend emphasizing the importance of the right representation in interactive visualization, I expect most of us would be hard pressed to explain the value of a more concrete representation over a more accurate one for a certain problem without falling back on intuitions. Should we be able to get precise about this, or even quantify it? I like the idea of trying, but in an applied field like infovis I would expect the majority to judge it to be not worth the effort (if only because theory over intuition is a tough argument to make when funding exists without it).
Like I said above, a similar trade-off seems to come up in areas like AI/ML interpretability and explainability, but I’m not sure if there are attempts yet to theorize it. It could maybe be described as the value of human model alignment, meaning the value of matching the representation of some information to metaphors or priors or levels of resolution that people find easier to mentally compute with, versus generating model alignment, where we constrain the representation to be mechanistically accurate. It would be cool to see examples attempting to quantify this trade-off or otherwise formalize it in a way that could provide design principles.