6–9 Jul 2026
Europe/Warsaw timezone

Sparge plots for comparing univariate distributions with many overlapping data points

8 Jul 2026, 13:00
20m
Lightning Talk (5 minutes) Virtual Presentation Room

Speaker

David Schruth (UW)

Description

Data visualization is fundamental to understanding research results and central to communicating measurement outcomes, yet it is typically neglected (largely for technical reasons) in the traditional statistics literature. There are countless ways to plot data, and even the most basic distribution of points is no exception. A very popular way of plotting univariate data, for exploratory data analysis and visual comparison, is the 'box plot', which marks the median, quartiles, and extremes of a distribution using the central line, box edges, and whiskers of a rectangular overlay. Other similarly informative plots that can compare data from several sources, experiments, or data sets include the 'strip-chart' and the 'forest plot'. A key shortcoming of many such approaches is an inability to represent large datasets, together with the related problem of over-plotting points with near-identical values, while still marking the summary statistics of the distribution. 'Violin', 'sina', and 'raincloud' plots circumvent these problems and alleviate many of the side effects of assessing large clusters of over-plotted points. Here I offer a new member of this family of statistical visualizations, the 'sparge plot', which uses R's support for translucent over-plotting to reflect the underlying density of overlapping points. I have additionally engineered a way to overlay a subtle (automatically gray-scaled) boxplot frame around the core of the data when this central area becomes overly dense.
This novel plotting approach offers a two-fold solution: it transparently shows the density, distribution, and spread of the data, and it reveals potentially overlapping data points and outliers, even for larger sample sizes.
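
The idea described above can be approximated with base R graphics alone. The following is a minimal sketch under stated assumptions, not the implementation shipped in the caroline package: the function name `sparge_sketch` and its arguments `alpha` and `frame_col` are hypothetical, and the frame here uses a fixed gray rather than the automatic gray-scaling mentioned in the abstract.

```r
# Illustrative sketch only; NOT the caroline implementation.
# `sparge_sketch`, `alpha`, and `frame_col` are hypothetical names.
sparge_sketch <- function(values, groups, alpha = 0.1, frame_col = "gray50") {
  by_group <- split(values, factor(groups))

  # Jittered, translucent points: where many points overlap,
  # the alpha values accumulate and the region appears darker.
  stripchart(by_group, method = "jitter", jitter = 0.2,
             vertical = TRUE, pch = 16,
             col = adjustcolor("black", alpha.f = alpha))

  # Unfilled gray boxplot frame drawn on top of the existing plot.
  # Outliers are suppressed because the points are already shown.
  stats <- boxplot(by_group, add = TRUE, axes = FALSE,
                   col = NA, border = frame_col,
                   outline = FALSE, boxwex = 0.5)
  invisible(stats)
}

# Simulated data: two groups with different location and spread.
set.seed(1)
n <- 2000
values <- c(rnorm(n, mean = 0), rnorm(n, mean = 1, sd = 2))
groups <- rep(c("A", "B"), each = n)
res <- sparge_sketch(values, groups)
```

Lower values of `alpha` suit larger samples, since more points must overlap before a region saturates to solid color.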

Additional Material or Paper

https://cloud.r-project.org/web/packages/caroline/index.html

If you used AI tools or services to support the preparation of this submission, please state the name and reason for using each of them.

No AI tools/services were used

Keywords: box-plot, strip-chart, forest-plot, violin-plot, raincloud-plot
Virtual Option: This submission is for pre-recorded virtual presentation only
Material License: APACHE
Video Recording: Video sharing is fine
The author(s) agree(s) to take responsibility and be accountable for the contents of the submission and is/are authorized to present it. Confirm
Interested in serving as reviewer? dschruth@anthropoidea.org
