6–9 Jul 2026
Europe/Warsaw timezone

Exploring and visualizing data subsets using the vtree package

6 Jul 2026, 13:00
3h
Tutorial (3 hours) Data visualization Tutorials

Speaker

Nick Barrowman (CHEO Research Institute)

Description

Suppose you want to know how many companies have more than 100 employees in each region of several countries. You could write this nesting as country >> region >> company >> employees. With just two variables, you can make a two-way table of counts with row or column percentages. But this does not generalize easily to more variables, and attempts to display this kind of information can be hard to interpret. The vtree package provides a general and easily interpreted way of displaying “variable trees”, which give the sizes of nested subsets of an R data frame, along with tools for pruning, displaying additional summary information, customizing, and more. I first released the package in 2018, and since then my experience using vtree, along with that of many others, has shown where it fits in the toolbox of methods for exploratory data analysis and data visualization.
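
As a rough illustration of the nested counts described above, here is a minimal sketch in R. The data frame, its column names, and the derived `size` flag are all invented for this example and are not taken from the tutorial materials. The counts are computed with base R; the final step assumes the vtree package is installed and uses its basic form, a data frame plus a space-separated string of variable names.

```r
# Toy data, invented for illustration only (not from the tutorial).
companies <- data.frame(
  country   = c("A", "A", "A", "A", "B", "B", "B"),
  region    = c("North", "North", "South", "South", "East", "East", "West"),
  employees = c(250, 40, 120, 500, 80, 30, 1000)
)

# Flag companies with more than 100 employees (hypothetical helper column).
companies$size <- ifelse(companies$employees > 100, "over 100", "100 or fewer")

# Base R: count companies in each country/region/size combination.
nested_counts <- as.data.frame(
  table(companies[, c("country", "region", "size")]),
  responseName = "n"
)

# Keep only the combinations that actually occur in the data.
nested_counts <- nested_counts[nested_counts$n > 0, ]
print(nested_counts)

# The same nesting as a variable tree, only if vtree is installed.
# The order of names in the string sets the order of nesting.
if (requireNamespace("vtree", quietly = TRUE)) {
  tree <- vtree::vtree(companies, "country region size")
}
```

With three variables the table is already a long list of rows to scan, which is the difficulty the description refers to. Printing `tree` in an interactive session should display the corresponding variable tree.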

This interactive tutorial will cover the basics of using vtree through a wide variety of examples. More advanced and special-purpose features will also be explained. Participants will have the opportunity to work through the construction of variable trees, including how to choose the order of variables, whether and how to prune, and what kinds of summary information to display. An extended example will demonstrate how to generate CONSORT-style study flow diagrams reproducibly. Other tools (e.g. UpSetR) will also be compared to vtree. The tutorial will be informal, and audience participation will be welcome.

Learning goals (only for tutorials)

After attending this tutorial you will be able to use vtree to:
* Quickly visualize nested data subsets using a variable tree
* Prune a variable tree to show the most relevant subsets
* Add summary information to a variable tree
* Compactly display combinations of variables
* Customize the display of variable trees
* Produce a CONSORT-style study flow diagram
* Compare and contrast vtree with some other tools

Prerequisites (only for tutorials)

Basic knowledge of R.

If you used AI tools or services to support the preparation of this submission, please state the name and reason for using each of them.

No AI tools/services were used.

Target audience (only for tutorials)

Anyone who performs data analysis and wishes to understand and visualize the structure and contents of data sets.

Additional Material or Paper

https://www.jstatsoft.org/article/view/v114i04

https://nbarrowman.github.io/vtree.html

Keywords: exploratory data analysis, data exploration, data visualization, study flow diagrams, reproducibility
Virtual Option This submission is for onsite presentation primarily, but I would also like it to be considered for pre-recorded virtual presentation if I don't get an onsite slot
Video Recording Video sharing is fine
The author(s) agree(s) to take responsibility and be accountable for the contents of the submission and is/are authorized to present it. Confirm
Interested in serving as reviewer? no

Author

Nick Barrowman (CHEO Research Institute)
