useR! 2026

Name: useR! 2026
Start: 2026-07-06T08:00:00+02:00
End: 2026-07-09T19:00:00+02:00

6–9 Jul 2026

Europe/Warsaw timezone

Contribution List

77. Convex Optimization in R Using CVXR

Ms Anqi Fu (Memorial Sloan Kettering), Balasubramanian Narasimhan (Stanford University)

06/07/2026, 08:30

All tracks

Tutorial (3 hours)

Tutorials

Convex optimization is fundamental to modern statistics and machine learning, underpinning methods from least squares and ridge regression to support vector machines (SVMs) and portfolio optimization. While Python users have long enjoyed state-of-the-art convex optimization through CVXPY, R users now have access to the same capabilities through CVXR 1.8.x, a complete rewrite using R's S7...
22. From Collecting Log-data to Analyzing Process Indicators with logLime R Package

Dr Tomasz Żółtak (Institute of Philosophy and Sociology, Polish Academy of Sciences)

06/07/2026, 08:30

Social sciences

Tutorial (3 hours)

Tutorials

This workshop introduces participants to the workflow of processing, analyzing and visualizing log-data describing respondent interactions with the web survey interface, collected from the open and popular LimeSurvey survey platform, using the logLime R package (along with other packages: dplyr, ggplot2, ggdensity and gganimate). While discussing this process, participants will discover different...
72. Geocomputation with R

Prof. Jakub Nowosad (Adam Mickiewicz University), Dr Jannes Muenchow (cynkra GmbH)

06/07/2026, 08:30

All tracks

Tutorial (3 hours)

Tutorials

R has become one of the most widely used languages for geographic data science. Its strength lies in a well-established ecosystem of several hundred spatial packages that support geographic data handling, analysis, and visualization, while integrating seamlessly with R’s wider tools for data processing and statistical analysis. R's flexibility and statistical capabilities make it attractive...
13. Introduction to spatial data science

Krzysztof Dyba (Adam Mickiewicz University in Poznan)

06/07/2026, 08:30

All tracks

Tutorial (3 hours)

Tutorials

Brief biography

Krzysztof Dyba is a senior lecturer at Adam Mickiewicz University in Poznan specializing in spatial data science and remote sensing. His teaching experience includes conducting four international workshops on R applications for satellite data at the OpenGeoHub Summer School, as well as leading an external course on “Advanced Spatial Analysis” at Maria Curie-Sklodowska...
103. Modelling spatial density of geo-located point data

Prof. Katarzyna Kopczewska (University of Warsaw)

06/07/2026, 08:30

Econometrics and financial modeling

Tutorial (3 hours)

Tutorials

This workshop introduces participants to the analytics of spatial geo-located point data. It starts with data processing (reading to sf class, visualisation in ggplot, plot and interactive mapview, CRS re-projecting); then it continues with detection of density clusters (QDC, DBSCAN), degree of agglomeration (using entropy-based ETA and SPAG) and comparison of density patterns by further...
173. Building AI-Powered Data Pipelines with blockr: From Visual Analysis to Custom Extensions

Christoph Sax (cynkra GmbH, University of Basel), David Granjon (cynkra GmbH)

06/07/2026, 13:00

Web applications (Shiny, dashboards)

Tutorial (3 hours)

Tutorials

Data analysis in R requires writing code, which remains a barrier for many domain experts. blockr is an open-source visual programming framework for R that allows users to construct reactive data pipelines by assembling modular blocks through a point-and-click interface. The framework generates reproducible R code automatically.

This hands-on tutorial takes participants from first use to...
168. CANCELLED - Energy Systems Modeling with R

Oleg Lugovoy (Optimal Solution LLC)

06/07/2026, 13:00

Econometrics and financial modeling

Tutorial (3 hours)

Tutorials

Energy systems optimization models, also known as Macro Energy System (MES) models, are key tools for evaluating energy transition and decarbonization strategies for countries, regions, and the world. The workshop introduces a set of tools and open datasets to design and compare energy transition scenarios and compile reports, all without leaving R. Hands-on sessions will focus on preparing datasets...
3. Exploring and visualizing data subsets using the vtree package

Nick Barrowman (Children's Hospital of Eastern Ontario Research Institute, University of Ottawa)

06/07/2026, 13:00

Data visualization

Tutorial (3 hours)

Tutorials

Suppose you want to know how many companies there are with over 100 employees in each region of several countries. We could write this as country >> region >> company >> employees. With just two variables, you can make a two-way table of counts with row or column percentages. But this does not easily generalize to larger numbers of variables, and attempts to display this kind of information...
93. Introduction to Parallel Processing in R using Futureverse - Easier than Ever Before

Henrik Bengtsson (University of California San Francisco (UCSF))

06/07/2026, 13:00

Efficient programming

Tutorial (3 hours)

Tutorials

This tutorial provides an introduction to the Futureverse (https://www.futureverse.org), a cohesive package ecosystem designed to facilitate and simplify parallel and distributed computing in R.

While accessible to beginners and those with some R experience, this workshop also provides valuable insights for more advanced users. We will focus on the new futurize() function, which...
144. LeaRn and teach at the same time with Pair Programming – code storytelling activity

Dr Pawel Orzechowski (The University of Edinburgh), Dr Brittany Blankinship (The University of Edinburgh)

06/07/2026, 13:00

Education

Tutorial (3 hours)

Tutorials

Would your code be better if you had someone by your side to talk through it? Or would your code be clearer if you might have to hand over the keyboard to someone at any moment?

Pair programming is a collaboration technique widely used in the software industry – it involves two people working together on one programming task. One person is the driver, suggesting solutions and typing the...
219. Quarto and AI for reproducible documents

Michał Ramsza (SGH Warsaw School of Economics)

06/07/2026, 13:00

Analysis best practices and workflows

Tutorial (3 hours)

Tutorials

Abstract. This workshop introduces participants to using the Quarto system for creating reproducible documents. Participants will learn how to create Quarto documents, including a complete workflow from data loading and wrangling to analysis and automated document generation. The final document can be in many formats; however, the workshop focuses on DOCX with custom styling.

*Outline of the...
229. Interactive Graphics for Understanding and Interpreting Nonlinear Model Behaviour in High Dimensions, using R

Dianne Cook

07/07/2026, 09:00

Keynote

Abstract:
When crossing a busy street, we understand that keeping our eyes open isn't optional—it's how we stay safe. Yet when building complex models, we often choose to work blind. Some of this is understandable — visualising high-dimensional data is genuinely difficult. But cultural attitudes matter too: there's a lingering belief that "looking at the data" compromises objectivity, and a...
6. Analyzing Sports Data with R

Max Marchi (Cleveland Guardians)

07/07/2026, 10:30

Case studies and applications

Talks (15-20 minutes)

Talks

LeBron choosing between taking a mid-range shot and feeding a teammate for an open three-pointer; Verstappen heading towards the pit lane or driving another few laps with the current tires; Sinner going all-in on a second serve against Alcaraz. Making split-second decisions when the stakes are the highest requires the talent and hard work that only great athletes have. Analysis of sports data...
11. R We There Yet? Surviving the Transition to Package Maturity

Edoardo Mancini (Roche)

07/07/2026, 10:30

Talks (15-20 minutes)

Talks

You’ve built a great R package. People are using it. Feature completeness is in sight. Congratulations - you’ve defied the odds. Now the hard part begins!

Transitioning an open-source R package from active development to long-term maintenance and stability is a complex shift. Throughout this talk we’ll explore methods to tackle this often overlooked, but key challenge of the open-source...
9. When One Hundredth of a Second Matters: Bayesian Counterfactual Modeling in R

Kristian Vepsäläinen (freelance data scientist)

07/07/2026, 10:30

Talks (15-20 minutes)

Talks

At the 1980 Winter Olympics, the men’s 15 km cross-country skiing race was decided by one hundredth of a second. Rather than debating historical causes, we treat this as a modeling problem: could a physically plausible effect—such as a small change in aerodynamic drag—have been large enough to matter?

In this talk, we demonstrate how Bayesian forward simulation in R can be used to reason...
82. Writing Interactive Applications with Base R using getGraphicsEvent()

Wolfgang Viechtbauer (Maastricht University)

07/07/2026, 10:30

Talks (15-20 minutes)

Talks

Interactive graphical applications in R are typically built using external web-based frameworks such as Shiny or by interfacing with other programming languages such as Tcl/Tk or JavaScript. However, these approaches may require server infrastructure or knowledge of additional programming languages.

As an alternative, the base R function grDevices::getGraphicsEvent() enables interactive...
36. Create your own dashboard with EurostatRTool

Antonio Grosso

07/07/2026, 10:50

Talks (15-20 minutes)

Talks

In today's digital landscape, the demand for making statistics more accessible and meaningful to the general public is increasing. This has led to the need for fast and efficient ways to disseminate statistical information, particularly when data visualisations must be produced quickly and updated frequently. In this context, Eurostat launched a project under the ESS Innovation Agenda and...
33. Maintaining a package on CRAN

Lluís Revilla Sancho

07/07/2026, 10:50

Talks (15-20 minutes)

Talks

Publishing a package on CRAN is often half the work of a maintainer; then comes the hardest part: maintaining it there. Many resources focus on getting the package published on CRAN and what it takes a maintainer to do so. They share common problems and how to solve them, but they are not based on data or focused on maintaining the package on CRAN.

Recently, CRAN has opened up some...
164. mlr3forecast: Extending mlr3 to time series forecasting

Maximilian Mücke (LMU Munich)

07/07/2026, 10:50

Talks (15-20 minutes)

Talks

mlr3forecast extends the mlr3 ecosystem to support time series forecasting workflows. It introduces a dedicated forecasting task class and resampling strategies that respect temporal ordering, enabling forecasting models to be benchmarked, tuned, and combined in a systematic way. Learners wrapping established forecasting methods such as ARIMA and ETS can be used alongside any mlr3 learner...
94. Workforce Digital Twins: Individual-Level Stochastic Simulation for Strategic Workforce Planning in R

Dr Nicolas Flores Castillo (BHP, Organisational Development and Analytics)

07/07/2026, 10:50

Talks (15-20 minutes)

Talks

Traditional workforce analytics relies on aggregate metrics that obscure individual-level dynamics and wash out critical correlations between employee characteristics, career trajectories, and compensation outcomes. This talk presents a Workforce Digital Twin methodology using individual-level stochastic simulation to model strategic gender pay equity interventions in a large global...
102. Reusing 'ggplot2' code: how to design better plot helper functions

Cynthia Huang (LMU Munich)

07/07/2026, 11:10

Talks (15-20 minutes)

Talks

Wrapping 'ggplot2' code into plot helper functions is a common way to make multiple versions of a custom plot without copying and pasting the same code over and over again. Helper functions can replace long and complex 'ggplot2' code chunks with just a single function call. However, if that single function is not designed carefully, the initial convenience can often turn into frustration....
172. samplyr: A Tidy Grammar for Survey Sampling Design in R

Dr Ahmadou Dicko

07/07/2026, 11:10

Talks (15-20 minutes)

Talks

In methodology reports and academic textbooks, sampling designs are described in a structured way, with an explicit logic connecting strata, clusters, sampling stages, and selection probabilities. This structure is often lost in the R code that implements it, where the design gets scattered across intermediate objects, successive calls, and technical details. The more complex the plan, the...
179. SerolyzeR - an R package for automated analysis of serological data

Jakub Grzywaczewski (Warsaw University of Technology), Dr Nuno Sepúlveda (Warsaw University of Technology)

07/07/2026, 11:10

Talks (15-20 minutes)

Talks

Pre-processing and quality control of high-dimensional serological data from Multiplex Bead Assay machines pose a significant bottleneck to the responsible application of machine learning to global health challenges. Driven by the data demands of the PvSTATEM project, an international initiative aimed at malaria elimination, we developed SerolyzeR, an open-source R package designed to...
87. UnitMix in Production: Multivariate Gaussian Mixtures for Robust Detection of Scale Errors and Outliers

Cristina Faricelli (ISTAT)

07/07/2026, 11:10

Statistical models and methods

Talks (15-20 minutes)

Talks

UnitMix is an R package designed to detect and correct unit of measurement errors using Gaussian mixture model-based clustering, supporting both methodological research and production workflows at National Statistical Institutes (NSIs).
The core function, assign.cluster, implements a multivariate Gaussian mixture model on log-transformed variables, allowing clusters to be defined...
177. Evaluating Disclosure Risk in Synthetic Data with the R Package riskutility

Oscar Thees (FHWN / SwissAnon)

07/07/2026, 11:30

Talks (15-20 minutes)

Talks

The growing movement toward open data, open science, and open government has increased the demand for sharing detailed microdata while protecting individual privacy. Data anonymization methods, including classical statistical disclosure control techniques and synthetic data generation, enable data sharing, but evaluating the resulting privacy risks and analytical usefulness remains...
203. Interactive Graphics for Analyses, Documents and Dashboards

Simon Urbanek

07/07/2026, 11:30

Talks (15-20 minutes)

Talks

The R ecosystem provides very good infrastructure for making R accessible as web-based documents, dashboards and for performing remote analyses. This opens up the possibilities for using interactive graphics which are important both for exploratory data analysis and presentation. However, most such solutions consist of simply binding existing JavaScript libraries which are often designed for...
63. No code, no limits: blockr, a visual AI driven data pipelines builder for R.

David Granjon (cynkra GmbH)

07/07/2026, 11:30

Talks (15-20 minutes)

Talks

We are delighted to introduce 'blockr', a visual, block-based interface for building, customising, and sharing interactive R data workflows, without any coding experience.

'blockr' enables users to snap together modular blocks for data loading, transformation, visualisation, and export, forming directed acyclic graph (DAG) pipelines with instant visual feedback. It carefully integrates AI...
181. What If We Break Base R? Bug Propagation in CRAN

Szymon Maksymiuk (WUT)

07/07/2026, 11:30

Talks (15-20 minutes)

Talks

With over 23,000 packages, CRAN forms a large ecosystem of reviewed code. Additionally, its strict rules and dependency structure make it great for analyzing bug propagation in an ecosystem.

This talk presents a hypothetical scenario in which I introduce a bug into base R. I analyze the consequences for the entire ecosystem and whether packages themselves would catch those bugs with their...
165. bbk: Accessing Central Bank Data in R

Maximilian Mücke (LMU Munich)

07/07/2026, 11:50

Lightning Talk (5 minutes)

Lightning Talks

bbk provides a unified R interface for accessing data from major central banks, including the Deutsche Bundesbank, European Central Bank, Swiss National Bank, and Bank of England. Central bank data is widely used in economic research and financial modeling, yet each institution exposes its own API with different conventions, making reproducible data access cumbersome. bbk abstracts over these...
51. Causal Inference with marginaleffects

Vincent Arel-Bundock (Université de Montréal)

07/07/2026, 11:50

Lightning Talk (5 minutes)

Lightning Talks

Policy debates, product decisions, and scientific claims all hinge on a simple question: what would happen to Y if we changed X? In this talk, I will present a practical causal inference workflow in R, using the marginaleffects package. This package offers a consistent interface for causal inference, and it is compatible with virtually all model-fitting packages in the R ecosystem. The key...
113. roxyreqs: Adding roxygen2-style documentation to testthat

Moritz Lang (Roche)

07/07/2026, 11:50

Lightning Talk (5 minutes)

Lightning Talks
R has excellent tooling for documenting functions via roxygen2, but no equivalent for test cases. Who wrote a test? Who reviewed it? What requirement does it verify? This information often lives in comments or external documents - disconnected from the code.

The roxyreqs package extends roxygen2 to support @meta tags above test_that() blocks:
```
#' @meta author Alice
#'...
```
57. SweEpiAI: An LLM-Powered R Shiny Application for Natural Language Exploration of Swedish Epidemiological Data APIs

Dr Máté Szilcz (Viti Science)

07/07/2026, 11:50

Lightning Talk (5 minutes)

Lightning Talks

Background: Public health databases often present barriers to non-technical users through complex query interfaces requiring knowledge of variable codes and API structures. The Swedish National Board of Health and Welfare maintains a comprehensive statistical database on disease prevalence and hospital care, yet exploring these data programmatically demands substantial technical...
86. Empowering African Life Scientists Through R: Outcomes of a 3-Day Bioinformatics Outreach Nigeria Training

Seun Olufemi (Bioinformatics Outreach Nigeria)

07/07/2026, 11:55

Lightning Talk (5 minutes)

Lightning Talks

Computational literacy remains a critical gap among life scientists in sub-Saharan Africa, limiting their contribution to global science and competitiveness in data-driven research careers. Bioinformatics Outreach Nigeria (BON) organised a 3-day intensive R training for life scientists, covering base R, data types and structures, data cleaning, tidyverse-based manipulation, and ggplot2...
35. paperboy - A Collection of News Media Scrapers in R

Sina Chen (GESIS Leibniz Institute for the Social Sciences)

07/07/2026, 11:55

Building tools for reproducible research

Lightning Talk (5 minutes)

Lightning Talks

The philosophy of the R package paperboy is that the package is a repository for webscraping scripts for news media sites, with advanced features for quick data retrieval - even for content behind log-ins or anti-scraping measures. Many data scientists and researchers write their own code when they have to retrieve news media content from websites. At the end of research projects, this code is...
62. Fuzz Testing R-Based Research Software for Robustness

Marco Colombo (University of Heidelberg)

07/07/2026, 13:00

Talks (15-20 minutes)

Talks

Fuzz testing is a technique that generates random, unexpected or malformed data and feeds them into a program. By observing how the software behaves under such conditions, this approach may uncover potential weaknesses, vulnerabilities and security flaws in applications.

A dynamically-typed language such as R poses some difficulties to fuzz testing, as by definition no predefined typing...
99. Measuring the Invisible: Investing in R's Core Infrastructure

Laia Domenech-Burin (Sovereign Tech Agency)

07/07/2026, 13:00

Talks (15-20 minutes)

Talks

The R language provides essential infrastructure for statistical computing, research, and data science worldwide. Yet, the labour that sustains this infrastructure remains largely invisible: maintaining legacy upstream code, triaging complex systems-level bugs, and hardening build pipelines. This work underpins the reliability and security of the ecosystem, but remains difficult to measure,...
127. Beyond Code Coverage: Mutation Testing in R with MutatoR

Dr Pierre Donat-Bouillud (Czech Technical University in Prague)

07/07/2026, 13:20

Talks (15-20 minutes)

Talks

Code coverage is the standard metric for testing and is widely used in R packages, for instance with testthat or tinytest. However, coverage can often be misleading: 100% coverage does not mean the absence of bugs, and a line covered by a test does not mean that all behaviours flowing through this line are effectively verified.

This presentation introduces MutatoR, a new package designed to bring...
236. Sponsor Session

Marcin Dubel (Appsilon), Mike Smith (Pfizer R&D UK Ltd), Olajoke Oladipo (cynkra)

07/07/2026, 13:20

Talks (15-20 minutes)

Sponsor Session

Appsilon
How Appsilon useR - Marcin Dubel

cynkra
Data Workflows for All: Reproducible Data Analysis in the Age of AI - Olajoke Oladipo

R Consortium
R is People - Mike Smith
162. 5 years of internal software validation in Roche

Szymon Maksymiuk (Roche), Lorenzo Braschi (Roche)

07/07/2026, 13:40

Talks (15-20 minutes)

Talks

Software validation is a key part of analyses conducted in a regulatory environment, and it is crucial to document the accuracy, reproducibility, and traceability of the software used for clinical analyses. At Roche, we initially outsourced this effort to an external partner. However, this solution was far from ideal due to, among other things, bottlenecks, a fixed release schedule, and...
182. Quality Control Risk Model for R Package Validation

Magnus Mengelbier (Independent / Freelancer)

07/07/2026, 14:00

Talks (15-20 minutes)

Talks

There are many validation approaches for R and its packages that have a foundation in classic software and application validation, specifically how it relates to statistical analysis applications within regulated industries like Life Science. The introduction of risk-based validation approaches in the last decade has provided additional tools, but as we now approach R as a language, packages as...
128. Building the Ultimate R AI Assistant

Pawel Rucki (Roche)

07/07/2026, 14:20

Lightning Talk (5 minutes)

Lightning Talks

While standard LLMs are powerful for general coding, they often fall short when working with specialized or internal R packages. This usually comes down to a knowledge gap - public models simply do not have access to private packages or the most recent documentation. To solve this, we propose a shift away from a single, all-purpose assistant toward a decentralized network of agents, where each...
129. R Documentation as an AI Tool

Pawel Rucki (Roche)

07/07/2026, 14:25

Lightning Talk (5 minutes)

Lightning Talks

When an AI assistant suggests code for a specialized R package, it’s often guessing based on outdated or public data. We built mcp.rhelp to turn local R documentation into a tool for AI coding assistants. This lightweight MCP server allows assistants to navigate R’s documentation - finding where a function lives, reading its exact help file, and inspecting its source code. In this talk, I...
163. Even faster C for ultimate R performance

Laura Bąkała

07/07/2026, 14:30

Lightning Talk (5 minutes)

Lightning Talks

It's no secret that to optimize R code one has to write it in C (or C++). This is, however, just the beginning. When does it make sense to rewrite the code? What are the typical performance sinkholes? What are the key techniques to get even more performance?

There are surprisingly few resources where you could find the answers, and my talk will try to fill that gap. Lately, I've been working...
230. R Under Sirens: Research, Students, and Community in Wartime Ukraine

Dariia Mykhailyshyna

07/07/2026, 15:00

Keynote

Abstract:
The talk examines the use of R for empirical research, university teaching, and community-oriented training related to Ukraine during the Russian full-scale invasion. It focuses on how R is used in applied research projects, in classroom instruction, and in initiatives such as the Workshops for Ukraine series when work is affected by repeated disruption.

Examples will be used...
12. Comprehensive PhD documentation with R and Quarto

Tina Rozsos (Vrije Universiteit Amsterdam)

07/07/2026, 16:10

Lightning Talk (5 minutes)

Lightning Talks

A PhD is a complex, multi-year project that involves managing a diverse range of information: from meeting notes to research ideas, from immediate tasks to long-term plans. Disorganized notes and files lead to inefficiencies, loss of information, and challenges in reproducing past workflows. This lightning talk presents a comprehensive PhD documentation template built with R and Quarto, meant...
5. Evaluating imputation strategies for longitudinal cohort studies

Dr Sinead Moylett (University of Limerick)

07/07/2026, 16:10

Epidemiology

Lightning Talk (5 minutes)

Lightning Talks

Missing data is a structural feature of longitudinal cohort studies and can bias inference when attrition is systematic. We present an R-based evaluation of imputation strategies for ordinal outcomes in the Irish Longitudinal Study on Ageing (TILDA) across five waves. All data processing, simulation, imputation, and evaluation were implemented in R, enabling a fully reproducible workflow for...
14. Why Pay for Survey Platforms? Just Use R!

Mr Clievins Selva (Deutsch)

07/07/2026, 16:10

Lightning Talk (5 minutes)

Lightning Talks

Researchers conducting surveys and assessments often encounter commercial barriers. While convenient, platforms also come with substantial licensing costs, vendor lock-in, and data sovereignty concerns. Meanwhile, open-source alternatives typically require researchers to use separate tools for survey design, data collection, analysis and reporting. This creates inefficiencies, increases the...
56. Computing ROC AUC Efficiently with R

Błażej Kochański (Politechnika Gdańska)

07/07/2026, 16:15

Lightning Talk (5 minutes)

Lightning Talks

The Area Under the Receiver Operating Characteristic Curve (AUC) is a widely used measure for evaluating the performance of binary classification models. In the literature and in practice, it appears under various names and is closely related to other performance measures. We review these formulations and discuss the motivation for efficient AUC computation in empirical analysis. We survey R...
24. Following Polish graduates’ pathways with R

Tomasz Żółtak (Institute of Philosophy and Sociology, Polish Academy of Sciences), Dr Grzegorz Humenny (Educational Research Institute, Warsaw, Poland), Mr Bartłomiej Płatkowski (Educational Research Institute, Warsaw, Poland)

07/07/2026, 16:15

Lightning Talk (5 minutes)

Lightning Talks

The Polish secondary school graduate tracking system has been providing data on the further educational and professional careers of Polish youth annually since 2021. Moreover, the ways of disseminating these results are constantly being developed: from static, general reports to interactive dashboards designed to serve the needs of specific groups of stakeholders. In our presentation we will...
47. Tracking your own time and productivity using R and Clockify

Håvard R. Karlsen (NTNU)

07/07/2026, 16:15

Lightning Talk (5 minutes)

Lightning Talks

Logging your own work hours is an efficient way to work out what you spend your time on. It can be used to help you manage your time better, to ensure you spend an appropriate amount of time on a project, or even to negotiate pay or responsibilities at work. In this lightning talk I show how I keep track of my own hours using R and an external tracking tool (here: Clockify). My...
39. From Administrative Data to Interactive Healthcare Benchmarking: An End-to-End R Workflow

Janez Bijec (University of Ljubljana / Statistical office of Slovenia)

07/07/2026, 16:20

Lightning Talk (5 minutes)

Lightning Talks

Routine administrative data collected by healthcare payers hold significant potential for monitoring care quality, yet translating them into actionable insights requires careful statistical modeling and thoughtful communication. This talk presents a complete R-based pipeline — from raw reimbursement data to an interactive Shiny dashboard — developed to benchmark hospital performance for...
111. metacart 3.0: Classification and regression trees for Exploratory Moderator Analysis in Meta-Analysis

Juan Claramunt Gonzalez (Leiden University)

07/07/2026, 16:20

Lightning Talk (5 minutes)

Lightning Talks

metacart 3.0 integrates regression and classification trees into the framework of meta-analysis to perform exploratory moderator analysis. Meta-regression trees identify, based on the study characteristics and their interactions, subgroups of studies that maximize within-subgroup homogeneity of effect sizes. To avoid overfitting, the resulting tree is pruned using cross-validation. Finally,...
64. Reproducible Clinical Data Review: A Modular R and Quarto Workflow for Transparent Reporting

Mr Winkle Lu

07/07/2026, 16:20

Lightning Talk (5 minutes)

Lightning Talks

During clinical trials, EDC data must be reviewed regularly — from routine medical review reports to formal audit and inspection scenarios. In these contexts, reviewers do not explore data freely. They follow a predictable, structured process and need a document that clearly records what was seen, under what conditions, and what conclusions were drawn. This core distinction shapes the design...
149. Let triangles solve your multinomial problems: Using simplexes to analyze low dimension trade-offs

Liam Mueller (UC San Diego)

07/07/2026, 16:25

Lightning Talk (5 minutes)

Lightning Talks

How should we best model multinomial tradeoffs? When there are only two options that sum to 100 percent, it can be straightforward to employ options like GLM to address research questions with binomial response data, but when there are more than two groups, these multinomial models become too complex to perform and too convoluted for our audience to understand. However, using simplexes to...
Go to contribution page
70. Rxsim: Reducing friction in simulating clinical trials (RCT) using R6 programming.

Dr Matthew Valko (Boehringer Ingelheim Pharma GmbH & Co. KG), Dr Saumil Shah (Boehringer Ingelheim Pharma GmbH & Co. KG)

07/07/2026, 16:25

Lightning Talk (5 minutes)

Lightning Talks

Simulations are a great tool to support decisions on optimal trial designs. Trial designs have evolved from simple 2-arm treatment comparison to multi-arm dose-finding, adaptive designs, and platform trials. R is a popular and important tool for those in the pharmaceutical industry. R6 programming and supporting R functions have allowed us to design a package rxsim that is flexible enough to...
Go to contribution page
98. Shiny Rhinos Docking Ducks on a Shoestring: How We Built the Austrian Health Atlas

Martin Zuba (Austrian National Public Health Institute (GOEG)), Zuzanna Brzozowska (Austrian National Public Health Institute (GOEG))

07/07/2026, 16:25

Lightning Talk (5 minutes)

Lightning Talks

The Austrian Health Atlas is an open-access platform using interactive charts and maps to intuitively illustrate public health data. Developed by the Austrian National Public Health Institute (GÖG), the Gesundheitsatlas shows trends, determinants and socio-economic differences in the health of the Austrian population, offering both international and subnational comparisons.
The platform's...
Go to contribution page
74. Clinical Output Review with R

Rozeta Simonovska

07/07/2026, 16:30

Lightning Talk (5 minutes)

Lightning Talks

Part of the responsibilities of a statistician working on clinical trials is reviewing tables, listings, and figures (TLFs) to ensure accuracy and compliance. However, the review process can be challenging due to the volume of outputs and the dynamic nature of clinical trial data. A common issue arises when outputs are reviewed and checked, but subsequent data updates or programming changes...
Go to contribution page
112. Scientific Accountability in Shiny Apps

January Weiner

07/07/2026, 16:30

Lightning Talk (5 minutes)

Lightning Talks

The principle of scientific accountability is more than reproducibility of data analysis: each result (figure, table or p-value) must be unequivocally and easily traceable to the original data from which it originates. In many bioinformatics workflows this kind of accountability is difficult to maintain, particularly in interactive web applications where exploratory analyses are performed...
Go to contribution page
175. Solving Optimum Allocation Problems in Stratified Sampling Using the R Package stratallo

Wojciech Wójciak (Warsaw University of Technology)

07/07/2026, 16:30

Lightning Talk (5 minutes)

Lightning Talks

Optimum allocation of sample sizes across strata is a fundamental problem in survey sampling. When designing a stratified survey, researchers must decide how to distribute a fixed total sample size among strata in order to achieve specific statistical goals, such as minimizing estimator variance or minimizing survey cost subject to precision constraints. Classical results, such as...
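
The truncated abstract does not say which classical results it refers to. As one textbook illustration of the allocation problem it describes, Neyman allocation distributes a fixed total sample size n in proportion to N_h * S_h, where N_h is the stratum size and S_h the stratum standard deviation. The sketch below is written in Python purely for illustration; the function name and the example numbers are hypothetical, it is not the stratallo API, and it ignores upper bounds (n_h <= N_h) and integer rounding.

```python
def neyman_allocation(total_n, stratum_sizes, stratum_sds):
    """Return real-valued sample sizes n_h = n * N_h*S_h / sum_k(N_k*S_k).

    Hypothetical helper for illustration only: no box constraints,
    no rounding to integers.
    """
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total_weight = sum(weights)
    return [total_n * w / total_weight for w in weights]


# Made-up example: three strata, total sample size 100.
allocation = neyman_allocation(
    total_n=100,
    stratum_sizes=[1000, 2000, 3000],
    stratum_sds=[10.0, 5.0, 2.0],
)
print(allocation)
```

In this example the first two strata have the same product N_h * S_h and therefore receive the same share, while the large but homogeneous third stratum receives less.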
Go to contribution page
143. How many coders does it take to write a book? Teaching Programming Across Disciplines is out now!

Dr Brittany Blankinship (The University of Edinburgh), Dr Pawel Orzechowski (The University of Edinburgh)

07/07/2026, 16:35

Lightning Talk (5 minutes)

Lightning Talks

Coding and data have been hijacked by macho-nonsense culture (and the teaching of coding even more so). We noticed a unique opportunity to write a book comprised of a range of case studies and ideas which flip the table and flip the narrative.

Over the last 5 years we have built a community of coding and data educators who teach outside of traditional computer science settings. Our 300+...
Go to contribution page
180. Marrying form and function: design takeaways from an R package for estimating real-time epidemic trends from multiple streams of surveillance data

Dr Saras Windecker (The Kids Research Institute, Perth, Australia)

07/07/2026, 16:35

Lightning Talk (5 minutes)

Lightning Talks

Real-time growth and prevalence of infectious diseases are vital information for epidemic response. Yet it is rare to directly observe infections, and so we must reconstruct infection trends from delayed and imperfect alternatives, including time-series of case counts and cross-sectional infection prevalence surveys. Joint inference from multiple data can improve estimates of infection trends,...
Go to contribution page
96. R communities – from first crush to long-term relationship

Ms Susanne Steinmann (Clinical Cancer Registry Lower-Saxony, Team Data Analysis)

07/07/2026, 16:35

Lightning Talk (5 minutes)

Lightning Talks

The R language and RStudio are widely used in most of the federal cancer registries in Germany, but only a few have special “R onboarding” programs. Therefore, an R user group, “Forum R”, was initiated in 2021 as part of the expert panel of the clinical cancer registries “Plattform § 65c”. This “Forum R” aims at networking and further education for employees of the cancer registries in Germany in...
Go to contribution page
119. Enabling Data Exploration Through a Metadata Layer: LLM Integration in cohortBuilder

Adam Forys (Roche), Krystian Igras (7N)

07/07/2026, 16:40

Lightning Talk (5 minutes)

Lightning Talks

To filter data, users need to know dataset structures, variable names, and valid value ranges. The cohortBuilder R package offers a common API for multi-step filtering across data frames, databases, and custom backends. The shinyCohortBuilder package adds an interactive Shiny GUI on top of it.

We introduce a metadata layer in cohortBuilder that connects filtering pipelines to large...
Go to contribution page
193. icsp2: Stratified interval censoring survival for clinical trial analysis

Isaac Gravestock (Roche)

07/07/2026, 16:40

Lightning Talk (5 minutes)

Lightning Talks

In clinical trials involving time to disease progression that can only be assessed at clinical visits, the real time to event is interval censored between visits, e.g. tumour assessments in progression-free survival (PFS) outcomes in oncology trials. In the typical analysis of PFS, we systematically impute the event time to be the visit where the progression was detected.

In many cases the...
Go to contribution page
187. R as Complete Teaching Infrastructure: Automating and Enriching University Geography Education

Matthew Haffner (University of Wisconsin - Eau Claire)

07/07/2026, 16:40

Lightning Talk (5 minutes)

Lightning Talks

Teaching university courses in geography, urban planning, and spatial data science demands tools that can convey place, and R is remarkably well-suited to the task. This talk presents a reproducible R-based teaching infrastructure built around several components: interactive spatially-enabled presentations, a course website ecosystem, and a reporting system leveraging the Canvas API. Lectures...
Go to contribution page
225. ApoBcomp: An R Shiny Framework for Apolipoprotein B Estimation and Cardiovascular Risk Modeling

Ms Aleyna Erakcaoğlu (Department of Biostatistics, Erciyes University School of Medicine, Kayseri, Türkiye)

07/07/2026, 16:45

Lightning Talk (5 minutes)

Lightning Talks

Apolipoprotein B (ApoB) is a key biomarker reflecting the number of atherogenic lipoprotein particles and is increasingly recognized as a superior indicator of cardiovascular risk compared to traditional lipid measures. However, direct ApoB measurement is not always routinely available in clinical practice, and existing tools do not provide an integrated framework for its estimation and...
Go to contribution page
155. Scaling training to infinity... and beyond!

Raniere Gaia Costa da Silva

07/07/2026, 16:45

Lightning Talk (5 minutes)

Lightning Talks

This talk will show how JupyterLite can improve a community-hosted training session by reducing the friction for learners to access JupyterLab in class and later at home.

JupyterLite is a distribution of Jupyter for WebAssembly (Wasm) enabling JupyterLab to run completely (including the Jupyter server) in a modern web browser. Support for R on JupyterLite was added in 2025.

For...
Go to contribution page
150. md2qstn: An R Package for Reproducible Survey Design and Lifecycle Management

Yasuto NAKANO (Kwansei Gakuin University)

07/07/2026, 17:00

Poster

Poster

The purpose of this talk is to present md2qstn, a specialized R library developed to bridge the gap between plain-text survey drafting and digital deployment. md2qstn enables the conversion of Markdown-formatted questionnaires into DDI (Data Documentation Initiative)-compliant XML and Qualtrics-compatible QSF (Qualtrics Survey Format) JSON files. Although the prevailing approach in...
Go to contribution page
194. A {ladder} to get on to the (Google) Slides

Isaac Gravestock (Roche)

07/07/2026, 17:00

Poster

Poster

Google Slides is a widely available productivity tool used by many institutions but is not well integrated with existing R ecosystem workflows such as R Markdown, which has made its use incompatible with reproducible research and reporting. ladder is an R package for inserting tables into Slides presentations and supports multiple table formats from R.

In particular it supports flextable...
Go to contribution page
27. An LLM-based Pipeline for Understanding Decision Choices in Data Analysis from Published Literature

H. Sherry Zhang (University of Texas at Austin)

07/07/2026, 17:00

Poster

Poster

Decision choices, such as those made when building regression models, and their rationale are essential for interpreting results and understanding uncertainty in an analysis. However, these decisions are rarely studied because tracing every alternative considered by authors is often impractical, and reworking a completed analysis is generally of limited interest. Consequently, researchers...
Go to contribution page
207. Analyzing Global Music Trends Using Spotify Audio Features: A Data-Driven Analysis in R

shristi y (IIIT UNA)

07/07/2026, 17:00

Poster

Poster

The widespread adoption of digital music streaming platforms has created unprecedented opportunities to analyze large-scale music consumption data. This study investigates global music trends by analyzing Spotify track data using statistical and visualization techniques implemented in R. The objective is to explore how various audio features—including danceability, energy, valence,...
Go to contribution page
19. Asking the Community: Interesting Queries and Everyday Challenges for the R Ecosystem

Florian Sihler (Ulm University)

07/07/2026, 17:00

Poster

Poster

In the past months, we built a tool to analyze all versions (roughly 170,000) of all packages available on CRAN, obtaining around 80 GB of raw data on various semantic aspects such as call graphs of functions, dead code, values of constants, the coverage of provided vignettes, transitive dependencies of packages, and much more. Moreover, the data is linked to the release date and...
Go to contribution page
109. City River Spaces (CRiSp): Automating and Scaling Up the Delineation of Urban River Spaces

Claudiu Forgaci (Delft University of Technology)

07/07/2026, 17:00

Poster

Poster

Spatially designing and planning urban transformations around rivers while capturing the complexities of riverside urban areas remains challenging. An essential part of the challenge is how boundaries are drawn in the analysis of urban areas surrounding rivers. To overcome this challenge, we developed the rcrisp open-source R package to automate the morphological delineation of riverside...
Go to contribution page
107. Cross-Language Collaboration in Spatial Data Science: The SDSL Community

Claudiu Forgaci (Delft University of Technology)

07/07/2026, 17:00

Poster

Poster

The Spatial Data Science across Languages (SDSL) Community brings together developers and users of common and emerging programming languages for spatial data science. It aims to foster understanding and address common issues while discussing language-specific problems. We focus broadly on geospatial and geographic space, with some applications to general image spaces and local reference...
Go to contribution page
59. deltatest: Statistical Hypothesis Testing Using the Delta Method for Online A/B Testing

Daisuke Ichikawa (Kibaroku), Koji Makiyama (HOXO-M Inc.), Shinichi Takayanagi, kazuyuki sano

07/07/2026, 17:00

Poster

Poster

Online A/B tests often randomize at the user level while evaluating ratio metrics at a finer-grained unit, such as page views or sessions. This mismatch induces within-user correlation and can make standard Z-tests anti-conservative, increasing false positives. The deltatest package provides an R interface for delta-method-based hypothesis testing of ratio metrics, following the practical...
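
To make the idea concrete, the following is a minimal sketch of the standard delta-method variance for a ratio metric, written in Python purely for illustration. It is not the deltatest API: the function name and the example data are hypothetical. For per-user totals x_i (e.g. clicks) and y_i (e.g. page views), the ratio is r = mean(x) / mean(y), and the delta method approximates its variance as (s_xx - 2 r s_xy + r^2 s_yy) / (n * mean(y)^2), which accounts for within-user correlation because the user, not the page view, is the unit of analysis.

```python
from statistics import mean


def ratio_with_delta_variance(x, y):
    """Return (ratio, approximate variance) for per-user totals x and y.

    Hypothetical helper for illustration: uses sample (n - 1) variances
    and covariance, with one entry per randomized user.
    """
    n = len(x)
    mean_x, mean_y = mean(x), mean(y)
    ratio = mean_x / mean_y
    s_xx = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    s_yy = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    variance = (s_xx - 2 * ratio * s_xy + ratio ** 2 * s_yy) / (n * mean_y ** 2)
    return ratio, variance


# Made-up data: clicks and page views for four users.
clicks = [1, 0, 3, 2]
views = [4, 2, 5, 9]
ratio, variance = ratio_with_delta_variance(clicks, views)
print(ratio, variance)
```

A Z-statistic comparing two groups would then use the difference of the two ratios divided by the square root of the sum of their delta-method variances.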
Go to contribution page
174. Flexible Aggregation with SUOWA Operators in R. An Implementation Based on the Choquet Integral

Teresa Gonzalez-Arteaga (Universidad de Valladolid)

07/07/2026, 17:00

Poster

Poster

Aggregation functions play a central role in decision making, and among them, weighted means and Ordered Weighted Averaging (OWA) operators are two of the most widely used families. Their relevance is reinforced by the fact that both can be expressed as particular cases of the Choquet integral, which has inspired numerous attempts to develop unified generalizations of these operators.
...
Go to contribution page
148. metasurvey: Reproducible Survey Data Processing with Step Pipelines in R

Mauro Loprete (Universidad de la República, Uruguay)

07/07/2026, 17:00

Poster

Poster

Household survey microdata is a primary input for social science research and public policy evaluation, yet the processing pipelines that turn raw microdata into publishable estimates are rarely documented, shared, or reproduced. Each research team writes ad hoc scripts to recode variables, construct indicators, and compute weighted statistics, duplicating effort and introducing silent...
Go to contribution page
29. Orchestrating the b3verse: Elevating Research Software through a Modular R Package Ecosystem

Ward Langeraert (Research Institute for Nature and Forest)

07/07/2026, 17:00

Analysis best practices and workflows

Poster

Poster

Scaling research software beyond single scripts or standalone packages requires deliberate architectural choices, shared conventions, and robust distribution infrastructure. This poster presents the b3verse, a coordinated ecosystem of twelve interoperable R packages designed to transform large biodiversity occurrence cubes into standardized indicators for research and policy...
Go to contribution page
220. pemr: A Unified R Interface for Managing Personal Exposure Monitor Data in Air Pollution Research

Dr Oscar de Leon (Universidad del Valle de Guatemala)

07/07/2026, 17:00

Poster

Poster

Air pollution exposure research relies on a growing diversity of wearable personal exposure monitors (PEMs), each producing log files with distinct header structures, column naming conventions, and measurement units. The R ecosystem already offers strong infrastructure at adjacent layers for network-level data (openair and AirSensor), on-road vehicle emission systems (pems.utils), and...
Go to contribution page
154. rush: A Database-Centric Architecture for Distributed Computing in R

Mr Marc Becker (Ludwig-Maximilians-Universität München)

07/07/2026, 17:00

Poster

Poster

We present rush, an R package for asynchronous and decentralized optimization. Traditional approaches for parallel computing in R follow a controller-worker model where a central process proposes tasks, dispatches them to workers, and collects results. When proposing new tasks is computationally expensive, the central controller becomes a bottleneck that leaves workers idle, a problem that...
Go to contribution page
106. Statistical Analysis and Predictive Modeling of IPL Match Data Using R

Vihan Singh (Indian Institute of Information Technology Una)

07/07/2026, 17:00

Poster

Poster

The increasing availability of structured sports datasets has created new opportunities for applying statistical analysis and predictive modeling techniques to sports analytics. The Indian Premier League (IPL) provides detailed match and ball-by-ball datasets that allow in-depth statistical exploration of match dynamics and performance patterns. This study applies statistical analysis and...
Go to contribution page
108. The Rbanism Community: Empowering Urbanists to Use Research Software Effectively and with Confidence

Claudiu Forgaci (Delft University of Technology)

07/07/2026, 17:00

Poster

Poster

The Rbanism community aims to empower urbanism researchers, students, educators and practitioners to use open-source software and related open-science practices effectively and with confidence. It raises awareness, stimulates engagement and builds capacity by demonstrating the benefits of reproducibility, automation and scalability. Rbanism was initiated in 2021 by a group of R users in the...
Go to contribution page
115. Training and assessment of spatial prediction models: challenges, conceptual frameworks and implemented strategies in the R package CAST

Hanna Meyer (University of Münster)

07/07/2026, 17:00

Poster

Poster

One key task in environmental science is the continuous mapping of environmental variables across space, and often across both space and time. Machine learning algorithms are frequently employed for this purpose, combining local field observations with comprehensive sets of predictor variables to produce spatial predictions. This enables the prediction of the variable of interest at locations...
Go to contribution page
231. A world still to be mapped: reflections on geocomputation in R

Jakub Nowosad

08/07/2026, 09:00

Keynote

Abstract:

Spatial data are central to understanding environmental change, social processes, and their interactions. Maps are not only visual products of analysis, but tools that shape how problems are framed, how phenomena are perceived, and how decisions are made. R has become a widely used environment for spatial data science because it combines interactive analysis, open development,...
Go to contribution page
34. Modernizing R's web-mapping capabilities

Tim Appelhans (Friedrich-Alexander Universität Erlangen-Nürnberg, Institut für Geographie, Wetterkreuz 15, 91058 Erlangen, Germany)

08/07/2026, 10:30

Talks (15-20 minutes)

Talks

When it comes to web-mapping in R, RStudio’s (Posit’s) 'leaflet' package has established itself as the de-facto standard. However, in addition to the lack of support for 3D data, 'leaflet' struggles with the rendering of large data (i.e. more than 1 million points/vertices) and data transfer from R memory to the browser can be slow. To overcome these bottlenecks, two packages, 'geoarrowWidget'...
Go to contribution page
8. Practical Strategies for R-Based Research in Secure Data Environments

Aleksi Lahtinen (University of Turku)

08/07/2026, 10:30

Analysis best practices and workflows

Talks (15-20 minutes)

Talks

In social science research, datasets are often confidential, requiring analyses to be conducted in secure remote access environments. One such environment is FIONA, which provides researchers with access to sensitive, unit-level Finnish register data alongside standard statistical software, including R. We use FIONA to analyse extensive register data on Finnish teenagers and young adults, with...
Go to contribution page
45. stats::free1way(): Semiparametrically Efficient Population and Permutation Inference in Distribution-free Stratified K-sample Oneway Layouts

Torsten Hothorn (CH/DE)

08/07/2026, 10:30

Talks (15-20 minutes)

Talks

Starting with R 4.6-0, the stats package provides infrastructure for distribution-free model-based inference in possibly stratified K-sample oneway layouts via the novel free1way model function. Treatment effects to be estimated using free1way include odds- and hazard ratios, Lehmann parameters, and a generalised version of Cohen's d for at least ordered and possibly right-censored...
Go to contribution page
25. {teal}: A production-ready modular framework for reproducible data exploration on clinical trials

Dr Lluís Revilla Sancho (Roche, Spain)

08/07/2026, 10:30

Talks (15-20 minutes)

Talks

In data-intensive industries like pharmaceuticals, the ability to move seamlessly from population-level summaries to individual data points is critical for valid insight generation. However, enabling faster, more efficient exploratory analysis and regulatory delivery of clinical trials insights is challenging and difficult to scale. This talk introduces {teal}, recently released on CRAN with...
Go to contribution page
90. An Enhanced Projection Pursuit Tree Classifier with Visual Methods for Assessing Algorithmic Improvements

Natalia da Silva (Universidad de la República)

08/07/2026, 10:50

Talks (15-20 minutes)

Talks

This talk presents enhancements to the projection pursuit tree classifier and visual diagnostic methods for assessing their impact in high dimensions. The original algorithm uses linear combinations of variables in a tree structure where depth is constrained to be less than the number of classes, a limitation that proves too rigid for complex classification problems. Our extensions improve...
Go to contribution page
121. Areal interpolation methods

Edzer Pebesma (University of Münster)

08/07/2026, 10:50

Talks (15-20 minutes)

Talks

Areal units (polygons or pixels) are often used to summarize and distribute spatial data. In many spatial data science studies, datasets with different areal units need to be combined, and to do so first have to be transformed into a common set of areal units. This involves upscaling (going to a coarser resolution) or downscaling (going to a finer spatial resolution), or a combination. This...
Go to contribution page
42. Building stateful web apps with Rserve

Tom Elliott (iNZight Analytics Ltd)

08/07/2026, 10:50

Talks (15-20 minutes)

Talks

Web front-ends offer seamless deployment across desktop and mobile platforms, with an ever growing variety of libraries enabling rich, interactive user experiences. However, integrating these modern web technologies with R's powerful analytic capabilities presents significant challenges.

Rserve is an R library that enables two-way communication between R and other programming...
Go to contribution page
17. Critical Paths in R: Turning Data Scientists into Effective Project Managers

Gary Sutton

08/07/2026, 10:50

Talks (15-20 minutes)

Talks

Many data scientists and other R users have no doubt been assigned to projects that quickly derailed—missed deadlines, scope creep, budget overruns—often because effective project management was assumed to be a "soft skill" anyone without deep technical expertise could handle. In reality, successful project delivery demands rigorous quantitative techniques: structured decomposition,...
Go to contribution page
216. Is here still where it was? Changes in terrestrial reference frames in R spatial packages

Roger Bivand (Norwegian School of Economics)

08/07/2026, 11:10

Talks (15-20 minutes)

Talks

When we represent or analyse spatial data, the position of observations matters. Where objects of interest are, also in relation to each other, and how position is measured, are referenced through coordinate reference systems (CRS), including units of measurement. Local CRS use arbitrary starting points, while standard CRS rely on tabulated values, for example the axis lengths of an ellipsoid...
Go to contribution page
92. Managing Analytic Multiplicity in Epidemiology: Reproducible Multiverse and Vibration of Effects Analyses in R

Alyssa Columbus

08/07/2026, 11:10

Talks (15-20 minutes)

Talks

Empirical data analysis often involves a large number of defensible analytic choices, including model specification, covariate selection, transformations, and approaches to missing data. These decisions can meaningfully influence statistical results, yet they are rarely explored or reported systematically. This talk presents a reproducible R-based workflow for examining analytic multiplicity...
Go to contribution page
52. Plumbing Your Way to React: Using Plumber APIs to Evolve Shiny Systems

Deepansh Khurana (Dimwit Labs)

08/07/2026, 11:10

Talks (15-20 minutes)

Talks

In 2024, I gave the talk "How I Built An API for My Life (and How You Can Too)" about Hrafnagud, a personal life tracker built using a slew of services but primarily with the support of R, namely {plumber} and {shiny}. This was an API with a chunk of different endpoints for finance, travel, and more. The key focus was building the first iteration of such a system, and as the system...
Go to contribution page
105. R Package VIM: Flexible Model Adaptive Imputation with Vimpute

Eileen Vattheuer (Statistics Austria)

08/07/2026, 11:10

Talks (15-20 minutes)

Talks

Handling missing values in heterogeneous datasets remains a common challenge in applied data analysis. We introduce vimpute(), a new function in the R package VIM that provides a unified and model-adaptive framework for multivariate imputation. The function implements an iterative, variable-wise imputation procedure that adapts the modelling strategy to the type and characteristics of each...
Go to contribution page
159. A Connected Set of R packages for Regularized Regression based on Ridge and S-Type Estimators

Dr Filiz Karadag (Ege University), Dr Olgun Aydin (Gdansk University of Technology)

08/07/2026, 11:30

Talks (15-20 minutes)

Talks

This presentation concerns the implementation of regularized regression and advanced parameter estimation procedures. The study demonstrates the practical application of these methods using three R packages developed by the authors: S-type.est, Styperidge.reg, and ridgregextra.

Within this ecosystem, the S-type.est package provides the core estimation procedures for S-type...
Go to contribution page
55. CRDTs for R: Conflict-Free Data Structures for Real-Time Collaboration

Charlie Gao (Posit Software, PBC)

08/07/2026, 11:30

Talks (15-20 minutes)

Talks

Collaborative data analysis presents a fundamental concurrency problem: when multiple users modify the same data simultaneously, how should conflicts be resolved? Traditional approaches rely on locking or central arbitration, but conflict-free replicated data types (CRDTs) offer a principled alternative. A CRDT is a data structure whose concurrent operations are guaranteed to converge to the...
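
As a hedged illustration of the convergence property described above, here is a sketch of one of the simplest CRDTs, a grow-only counter (G-Counter), written in Python purely for illustration. It is not taken from the talk or from any R package; the class and method names are hypothetical. Each replica increments only its own slot, and merging takes the per-replica maximum, so merges are commutative, associative and idempotent.

```python
class GCounter:
    """Grow-only counter CRDT (illustrative sketch, hypothetical names)."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments seen from that replica

    def increment(self, amount=1):
        # A replica only ever updates its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # The element-wise maximum makes merging order-independent and repeatable.
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)


# Two replicas update concurrently, then exchange state in either order.
a = GCounter("a")
b = GCounter("b")
a.increment(2)
b.increment(3)
a.merge(b)
b.merge(a)
print(a.value(), b.value())
```

Neither replica needs a lock or a central arbiter: after exchanging state, both hold the same per-replica counts regardless of the order in which the merges happened.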
Go to contribution page
88. Introduction to edgeTransport: developing sustainable transportation futures

Alex K. Hagen (Potsdam Institute for Climate Impact Research (PIK))

08/07/2026, 11:30

Lightning Talk (5 minutes)

Talks

Scenario analysis lets us explore ‘what-if’ futures for climate change, linking today’s decisions to long-term impacts. Systematic comparison of scenarios for transport – a key sector for emission reductions – can inform decision-making in policy and investments. Decarbonization of passenger and freight transport depends on mode shifts and the adoption of low-carbon technologies like electric...
Go to contribution page
211. Towards Reproducible Research using NHANES Data

Robert Gentleman (Dana-Farber Cancer Institute)

08/07/2026, 11:30

Talks (15-20 minutes)

Talks

The National Health and Nutrition Examination Survey (NHANES) provides extensive public data on demographics, health, and nutrition, collected in two-year cycles since 1999. Although invaluable for epidemiological and health-related research, the complexity of NHANES data makes accessing, managing, and analyzing these datasets challenging. We present a reproducible computational environment...
Go to contribution page
58. automatedRecLin: Record Linkage Based on an Entropy-Maximizing Classifier

Adam Struzik (Adam Mickiewicz University in Poznań)

08/07/2026, 11:50

Lightning Talk (5 minutes)

Lightning Talks

In this paper, we present the automatedRecLin R package (available on CRAN), designed to perform record linkage based on an entropy-maximizing classifier. First, we briefly introduce the maximum entropy classification algorithm for record linkage, originally proposed by Lee et al. (2022, Surv. Methodol.), and describe an extension that allows for the use of continuous comparison functions....
Go to contribution page
20. flowR: A Dataflow Analysis Framework and Program Slicer for R

Florian Sihler (Ulm University)

08/07/2026, 11:50

Lightning Talk (5 minutes)

Lightning Talks

R provides a plethora of packages and features that support the dynamic and interactive exploration of data.
Yet, there is a lack of static analysis tools which support program comprehension, reproducibility, and software engineering practices.
With flowR we provide not just a static analysis framework, but also an easily accessible extension for common IDEs such as [Visual Studio...
Go to contribution page
81. Moving data computation to web browser with 'jsplyr'

Maciej Banas

08/07/2026, 11:50

Lightning Talk (5 minutes)

Lightning Talks

The presentation will introduce jsplyr, a new and experimental R package providing a dplyr interface for lazy-evaluated data manipulation in JavaScript. The package is designed to offload heavy data calculations from the server to the web browser, making it particularly useful in shiny applications.

The talk will begin with the rationale behind creating jsplyr and its current...
Go to contribution page
146. Deploying Shiny Apps on Google Cloud Run

Alfredo Hernandez Sanchez (Vilnius University)

08/07/2026, 11:55

Lightning Talk (5 minutes)

Lightning Talks

This talk presents a practical workflow for taking a Shiny app from local development to a public production deployment using Google Cloud Run. Drawing on the deployment of a real dashboard, I show how to package a Shiny app in a container, deploy it as a managed web service, connect it to a custom domain, and update it through a lightweight GitHub-based workflow.

Rather than treating...
Go to contribution page
116. Streamlining reproducible research with `checklist`: new organisation management and enhanced quality control

Thierry Onkelinx (Research Institute for Nature and Forest (INBO))

08/07/2026, 11:55

Lightning Talk (5 minutes)

Lightning Talks

The checklist R package has undergone significant evolution, with a major focus on flexible organisation management and improved quality control workflows.
The most substantial change is the introduction of the org_list and org_item classes (v0.5.0), which supersede the previous organisation class.
This redesign enables research groups to define and enforce their own institutional...
Go to contribution page
18. An Investigation of the R Package Ecosystem: Insights from the crawlR Project

Florian Sihler (Ulm University)

08/07/2026, 13:00

Talks (15-20 minutes)

Talks

With over 23,000 packages listed on CRAN alone, the R ecosystem provides a plethora of general-purpose and domain-specific packages. While there is a strict quality standard for packages to be accepted on CRAN, there is little information available about the evolution and current state of the ecosystem.
With the crawlR project, we statically analyzed all versions of all packages available on CRAN...
Go to contribution page
21. Growable Vectors and In-Place Row Operations: A data.table Perspective

Benjamin Schwendinger (Fraunhofer Austria Research GmbH)

08/07/2026, 13:00

Talks (15-20 minutes)

Talks

R’s copy-on-modify semantics make repeated growth and row-wise operations appear inherently expensive. Appending rows or incrementally building tabular structures typically triggers reallocation and copying. Recent versions of R now introduce an experimental resizable vector API, allowing vectors to be allocated with a maximum capacity and resized up to that capacity without...
Go to contribution page
41. Sparge plots for comparing univariate distributions with many overlapping data points

David Schruth (UW)

08/07/2026, 13:00

Lightning Talk (5 minutes)

Virtual Presentation Room

The visualization of data embodies a visceral fundament for understanding, and constitutes a cardinal mechanism for modern communication of any measurement outcomes of research—despite typically being neglected (largely for technical reasons) in traditional statistics literature. There are limitless ways of realizing plots of data, and visualizing even the most basic distribution of points...
Go to contribution page
31. Beyond Shiny: Building Native Desktop Applications with R

Anastasiia Kostiv

08/07/2026, 13:05

Lightning Talk (5 minutes)

Virtual Presentation Room

R has long been recognized for its statistical computing power, and Shiny has revolutionized how R users build interactive applications. However, deploying Shiny apps as standalone desktop software remains cumbersome, requiring a browser, a running R session, and often server infrastructure. What if R could power true native desktop applications? ...
Go to contribution page
30. Interactive Bubble Charts in R Made Simple with nivo.bubblechart

Anastasiia Kostiv

08/07/2026, 13:10

Lightning Talk (5 minutes)

Virtual Presentation Room

How do you bring modern, interactive data visualizations into R without writing JavaScript? The nivo.bubblechart package lets R users create responsive, animated circle packing charts using a familiar data frame workflow. Built on top of the Nivo visualization library (React + D3), it provides clean defaults, customizable colors and interactivity, and seamless Shiny integration with click and...
Go to contribution page
196. Predicting IPL Match Outcomes Using Machine Learning and Ball-by-Ball Cricket Data Analysis in R

Dr Prince Sharma (Indian Institute of Information Technology Una)

08/07/2026, 13:15

Lightning Talk (5 minutes)

Virtual Presentation Room

The rapid growth of sports analytics has enabled researchers to use data-driven techniques to understand performance patterns and predict outcomes in competitive sports. Cricket, particularly the Indian Premier League (IPL), generates a large amount of structured match data that can be analyzed to study team strategies, player performance, and match dynamics. This project focuses on analyzing...
Go to contribution page
37. cloud::servers[workload == R][order(cost_efficiency)][1, eval(task)]

Gergely Daroczi

08/07/2026, 13:20

Talks (15-20 minutes)

Talks

Selecting the right cloud instance type for model training or other compute-intensive R workloads is often guesswork:
1. Unclear resource requirements, e.g. "How much RAM do I need to train this hierarchical model?" or "Can my script scale to multiple CPU cores or even GPUs?"
2. Pricing and hardware specs exist, but are fragmented across vendors and hard to compare, especially when real...
Go to contribution page
118. Make your Shiny dashboard screen reader friendly

Abigail Stamm (Minnesota Department of Health), Eric Kvale (Minnesota Department of Health)

08/07/2026, 13:20

Talks (15-20 minutes)

Virtual Presentation Room

Online dashboards use data visualizations to quickly convey information. Interactive elements like charts and data filters often display additional information not accessible by screen readers. Accessibility features improve the overall presentation and user experience by augmenting, amplifying, and enhancing the content so that users can interact with and understand the same content in...
Go to contribution page
65. The (aggregated) history of every CRAN package ever

Mark Padgham (rOpenSci)

08/07/2026, 13:20

Talks (15-20 minutes)

Talks

CRAN represents and curates a complex software ecosystem of over 25,000 packages. This ecosystem constantly evolves as packages are submitted, updated, and archived. We analysed the development of CRAN over its entire lifetime, both in terms of package inter-dependencies and the internal structures of every version of every package. These analyses used the...
Go to contribution page
157. Explosive debugging with the 'boomer' package

Antoine Fabri (cynkra)

08/07/2026, 13:40

Talks (15-20 minutes)

Talks

Debugging in R often relies on manually inspecting intermediate results, usually by adding print statements to track the evolution of variables. The {boomer} package simplifies this process by automatically displaying these results in a readable and flexible way, without even modifying the code of the analyzed functions.

It gets its name from the fact that it “explodes” a call into its...
Go to contribution page
142. Please correct to safely retain your package on CRAN

Ivan Krylov (Lomonosov Moscow State University)

08/07/2026, 13:40

Talks (15-20 minutes)

Virtual Presentation Room

Generally, CRAN packages are expected to keep passing their tests, and their updates should not break other packages. Debugging the occasional failure of this process can lead the developer down a very deep rabbit hole: our dependency stacks are very deep, and none of the layers are completely bug-free.

We're going to see how problems from real CRAN packages could be investigated and solved...
Go to contribution page
221. R Analysis and the Math of Automation at Scale

Magnus Mengelbier (Independent / Freelancer)

08/07/2026, 13:40

Lightning Talk (5 minutes)

Talks

The R language can be found on laptops, servers, clusters and high performance compute environments as well as embedded within databases, services, agents and business domain solutions. As the number of analyses, the compute intensity of analysis methods, and the size of data steadily increase, using R for analysis at scale is becoming a math problem that seems to have no one simple...
Go to contribution page
170. Futurize - Tearing Down Parallelization Barriers in R with Transpilers

Henrik Bengtsson (University of California San Francisco (UCSF))

08/07/2026, 14:00

Talks (15-20 minutes)

Talks

Ever felt like parallelizing your R code requires a complete rewrite? Transitioning from sequential code to parallel execution has traditionally meant dealing with fragmented, obscure, package-specific APIs that distract us from our main goals. Which packages and functions should I use, and what platforms should I support? There are a lot of upfront decisions to make, with many...
Go to contribution page
131. Synthetic by Design: Two R Packages for Privacy-Safe Public Health Data Generation

Abigail Stamm (Minnesota Department of Health), Eric Kvale (Minnesota Department of Health)

08/07/2026, 14:00

Lightning Talk (5 minutes)

Virtual Presentation Room

Access to realistic public health data for training, pipeline validation, and methods development is constrained by privacy regulations that restrict use of real patient records. We present two complementary open-source R packages that address this problem at different points along the synthetic data design spectrum.

toysurveydata (Stamm, MDH) generates simple, customizable fake survey...
Go to contribution page
134. One Rating, Multiple Services. Solving the 100-Service Dilemma: Automated Attribution for Multi-Service Public Satisfaction Surveys

Abdul Aziz Nurussadad (Badan Informasi Geospasial)

08/07/2026, 14:05

Lightning Talk (5 minutes)

Virtual Presentation Room

Scaling public satisfaction surveys presents a significant challenge for government hubs managing hundreds of distinct services. National regulations of Indonesia (Minister of Administrative and Bureaucratic Reform Regulation 14/2017) mandate measuring nine specific quality components—including staff behavior, requirements, and facilities—for every service provided. However, requiring citizens...
Go to contribution page
101. Movie Recommendation Model

Dr Prince Sharma (Indian Institute of Information Technology Una)

08/07/2026, 14:10

Lightning Talk (5 minutes)

Virtual Presentation Room

This project presents the development of a movie recommendation system implemented using the R programming language. The primary objective of the project is to design a data-driven model capable of suggesting relevant movies to users based on patterns identified in historical rating data. R was chosen for this project due to its strong capabilities in statistical computing, data analysis, and...
Go to contribution page
206. What Wins the EPL Trophy? Player Quality, Tactics, or Money? A Multi-Season Regression Analysis Using R

Ayush Pundir (IIIT UNA)

08/07/2026, 14:15

Lightning Talk (5 minutes)

Virtual Presentation Room

What determines success in the English Premier League: superior player quality, tactical approach, or transfer investment? This study addresses that question empirically using R and the worldfootballR package.

We construct a panel dataset of 100 team-season observations across five EPL seasons (2019/20 to 2023/24) by combining FBref and Transfermarkt data via worldfootballR....
Go to contribution page
32. FoReco and FoRecoML: Unified Forecast Reconciliation in R

Daniele Girolimetto (Department of Statistical Sciences, University of Padova)

08/07/2026, 14:20

Lightning Talk (5 minutes)

Lightning Talks

Forecast reconciliation has become key to improving the accuracy and coherence of forecasts for linearly constrained multiple time series, such as hierarchical and grouped series. Yet, comprehensive software that jointly covers cross-sectional, temporal, and cross-temporal reconciliation has so far been lacking. The R packages FoReco and FoRecoML address this gap by offering a comprehensive...
Go to contribution page
43. To Save or Not to Save: Parallel Bayesian Testing for R Packages

Shuai Wu (MSD)

08/07/2026, 14:20

Lightning Talk (5 minutes)

Lightning Talks

The development of R packages for Bayesian analysis is often slowed by the computationally intensive nature of MCMC sampling, which turns iterative testing into a major bottleneck. A recurring challenge in this domain is the trade-off between saving large fitted model objects to disk versus regenerating them on each test run, a question coming up repeatedly during local development, continuous...
Go to contribution page
204. Using R to Estimate Animal Population Densities

Melissa Bather (The University of Auckland)

08/07/2026, 14:20

Lightning Talk (5 minutes)

Virtual Presentation Room

Author information:

Melissa Bather is a statistician from New Zealand, currently living in Vancouver, BC, Canada. She has a Master of Science in Statistics with First Class Honours from the University of Auckland—the birthplace of R! She is currently researching new methods to introduce multi-species models into the field of spatially explicit capture-recapture for the University of...
Go to contribution page
167. celecx: Active Learning for Computer Experiments in R

Martin Binder (Department of Statistics, LMU Munich; Munich Center for Machine Learning (MCML))

08/07/2026, 14:25

Lightning Talk (5 minutes)

Lightning Talks

To explore the behaviour of expensive black-box functions, such as machine learning model evaluations or physical simulations, it is often useful to fit a surrogate regression model to a sequence of evaluated points. Choosing these points adaptively, rather than relying on a pre-specified design, is advantageous because it places greater emphasis on regions of the configuration space where the...
Go to contribution page
68. Imputation methods and their benchmarking for fuzzy datasets with R

Maciej Romaniuk (Systems Research Institute, PAS)

08/07/2026, 14:25

Lightning Talk (5 minutes)

Lightning Talks

Imputation methods are widely used to replace missing values in datasets, thereby improving the overall quality of samples and enabling further statistical procedures. Various measures and tools were proposed to compare the effectiveness and results of the imputation algorithms. However, they only aim at “crisp” (i.e., real-valued) datasets. Meanwhile, fuzzy sets are widely used to model...
Go to contribution page
235. In memoriam: Tomáš Kalibera

Simon Urbanek

08/07/2026, 15:00

232. Release management and governance structure of the R project

Peter Dalgaard

08/07/2026, 15:10

Keynote

Abstract:
R grew out of the PC, Internet, and Open Source revolution in the 1990s. I will give an account of the early history of R, and then outline the development principles of R Core, with special focus on release management. However, the R Core Team is not alone in the governance of the R project, other major actors being the CRAN team, the R Foundation and the R Consortium. I discuss...
Go to contribution page
185. A Package for Performing Multiple Interrupted Time Series with Control

Victor Yu (Hertfordshire County Council, UK)

08/07/2026, 16:20

Poster

Poster

This package allows the user to perform interrupted time series (ITS) analysis with a control across successive interventions (up to 3). The code is based on a prior analysis done in our county, where we compared the effect of two successive behavioural interventions designed to improve the uptake of a COVID-19 booster intervention programme amongst immunosuppressed patients at several primary care...
Go to contribution page
125. A Type System for the R Language

Dr Filip Křikava (Czech Technical University in Prague)

08/07/2026, 16:20

Poster

Poster

Dynamic programming languages are increasingly adopting explicit type annotations. Not only do they serve as documentation, but they also enable static type checking to eliminate entire classes of bugs and help tools provide a better development experience. In this talk, we will present our advancements in bringing types to R, including a type system with a static type checker with type...
Go to contribution page
208. Estimating Poland’s Natural Rate of Interest in R: A Bayesian VECM Workflow for Cointegration, Model Selection, and Monetary Policy Stance Analysis

Patryk Kołbyko (Szkoła Doktorska Nauk Społecznych UMCS. Uniwersytet Marii Curie-Skłodowskiej w Lublinie)

08/07/2026, 16:20

Poster

Poster

This study presents an end-to-end R-based workflow for estimating Poland’s natural rate of interest within a Bayesian vector error-correction setting. The empirical objective is to recover an equilibrium real interest rate and the associated monetary policy stance gap, whereas the methodological contribution lies in demonstrating how advanced macroeconometric inference can be structured,...
Go to contribution page
28. FinDash Pro: An Integrated R Shiny Dashboard for Real-Time Financial Analysis, Machine Learning Forecasting, and LLM-Augmented Decision Support

Ozancan Ozdemir (University of Groningen)

08/07/2026, 16:20

Poster

Poster

The increasing complexity of financial markets demands analytical tools that combine real-time data access, rigorous statistical modelling, and intuitive visual communication within a single, reproducible framework. This study presents FinDash Pro, a production-grade interactive dashboard developed entirely in R using the Shiny ecosystem, designed to bridge the gap between...
Go to contribution page
224. From Linear to Machine Learning: An Interactive Shiny Framework for Diagnostic Test Combination in R

Serra İlayda Yerlitaş Taştan (Department of Biostatistics, Erciyes University, Faculty of Medicine, 38030, Kayseri, Türkiye)

08/07/2026, 16:20

Poster

Poster

Accurate diagnosis often requires the integration of multiple biomarkers rather than relying on a single test. However, existing tools for combining diagnostic tests are limited in methodological diversity and usability, especially for clinicians without programming expertise. To address this gap, we present dtComb-Shiny, a user-friendly web-based interface built on the dtComb R package. The...
Go to contribution page
117. From Static to Smart: Exception-Driven Medical Data Review with Teal and AI

Ms Daphne Grasselly (Roche), Magdalena Krochmal (Roche)

08/07/2026, 16:20

Poster

Poster

Medical Data Review (MDR) in clinical trials requires study teams to examine patient-level data across dozens of CRF domains — adverse events, labs, vitals, ECGs, and more. Traditionally, this relies mainly on static listings generated per study, requiring extensive setup and line-by-line inspection. We present an R framework, built on teal, that replaces this workflow with interactive,...
Go to contribution page
169. genMPGMM: A synthetic data generator with controlled feature partitions and cluster similarity for evaluating pattern discovery methods

Karolina Widzisz (Department of Computer Graphics, Vision and Digital Systems, Silesian University of Technology, Gliwice, Poland)

08/07/2026, 16:20

Poster

Poster

We present a synthetic data generator for simulation studies in clustering and partition comparison. The generator creates datasets with controlled cluster structures and predefined similarity levels between alternative partitions, enabling systematic analysis of clustering algorithms' stability.

The framework uses a Gaussian mixture distribution and generates data through a three-stage...
Go to contribution page
145. Git-based workflow for the validation of an R package

Laure Cougnaud (Open Analytics NV)

08/07/2026, 16:20

Poster

Poster

The use of R packages in a regulated environment, such as in pharmaceutical companies, may require formal validation of those packages.

The Validation Hub introduces best practices and insights from pharmaceutical industries for the validation of R packages for use within the biopharmaceutical regulatory setting.

We will contribute to this effort by presenting a git-based workflow to...
Go to contribution page
122. Interactive and Instant: GenAI-Driven Creation of Clinical Trial Applications

Marcin Dubel (Appsilon)

08/07/2026, 16:20

Poster

Poster

Building exploratory analysis dashboards for clinical trials requires considerable expertise, extensive time, and deep familiarity with specialized frameworks. In this talk, we share our GenAI solution to significantly streamline this process. We will present a tool, powered by Claude Code, that enables biostatisticians and clinical researchers to effortlessly create and immediately preview...
Go to contribution page
114. Now we'R coding! Bringing Agent Assistants Into Live R Sessions

Adam Forys (Roche), Magdalena Krochmal (Roche)

08/07/2026, 16:20

Poster

Poster

AI code assistants such as Claude Code, opencode, and Aider can read, write, and run code. However, they work separately from the user's R session. They cannot look at live objects, call R functions, or update a running Shiny application. We present a way to connect these assistants directly to R and Shiny using the Model Context Protocol (MCP).

The main idea is to use CLI-based AI agents...
Go to contribution page
66. Patient Timeline Visualization: Presenting Individual and Population-Level Clinical Trial Journeys with R and D3.js

Winkle Lu

08/07/2026, 16:20

Poster

Poster

Clinical trial data analysis typically focuses on specific analysis datasets, but the complete journey of individual patients — from screening, enrollment, and first dose, through visit records and adverse events, to last dose and survival status — represents critical time-based data points that reviewers prioritize. This fragmentation of information forces reviewers to switch between multiple...
Go to contribution page
89. persephone3: Object oriented wrapper around the seasonal adjustment packages in the rjdverse

Angelika Meraner (Statistics Austria)

08/07/2026, 16:20

Poster

Poster

persephone3 is the updated R framework developed at Statistics Austria to enable efficient processing of large sets of time series in the production of seasonally adjusted estimates. It modernizes the original persephone package by moving from the RJDemetra backend to the new rjd3 ecosystem, ensuring long-term maintainability and compatibility with current JDemetra+...
Go to contribution page
166. peRsian: A collection of Color Palettes from Persian Carpets

Dr Jan Simson (LMU Munich)

08/07/2026, 16:20

Poster

Poster

We present peRsian, an R package containing color palettes based on handcrafted Persian carpets for use in data visualization. peRsian is a tribute to centuries of Persian carpet-making, a craft that’s been alive for over two thousand years. It’s dedicated to the incredible artisans who’ve kept this tradition alive, especially the women who spent countless hours knotting and weaving every...
Go to contribution page
7. rgamer: A package to help students learn game theory using R

Yuki Yanai (Kochi University of Technology)

08/07/2026, 16:20

Poster

Poster

We have developed rgamer, an R package for learning and applying game theory. The goal of rgamer is to support both teaching and learning by enabling students to explore game-theoretic concepts and instructors to demonstrate them effectively in R. The package not only solves standard models such as two-person normal-form games, but also provides visualizations that highlight key structural...
Go to contribution page
209. Statistical Analysis and Visualization of Indian Development

Prakhar Srivastava (Indian Institute of Information Technology Una)

08/07/2026, 16:20

Poster

Poster

This project offers a data-driven look at India’s socio-economic growth using the R programming language. The goal is to examine how key indicators like population, literacy rate, GDP growth, and other development measures have changed over time in various regions of India. By using publicly available datasets, the project employs statistical analysis and visualization techniques in R to turn...
Go to contribution page
130. The dpGMM package for stable initialization in Gaussian Mixture Modeling

Joanna Zyla (Department of Data Science and Engineering, Silesian University of Technology, Gliwice, Poland)

08/07/2026, 16:20

Poster

Poster

Gaussian Mixture Modeling (GMM) is one of the unsupervised techniques used in many fields of data analysis, such as bioinformatics, pattern recognition, and network traffic analysis. Yet, existing R implementations often lack support for binned data (commonly observed in image analysis) and suffer from initialization instability or massive memory usage. To address these limitations, the novel R...
Go to contribution page
49. TheseusPlot: Visualizing Decomposition of Differences in Rate Metrics

Daisuke Ichikawa (Kibaroku), Koji Makiyama (HOXO-M Inc.), Shinichi Takayanagi, Kazuyuki Sano

08/07/2026, 16:20

Poster

Poster

TheseusPlot is an R package for explaining why a rate metric (e.g., conversion rate, retention rate, or on-time rate) differs between two groups, such as time periods, cohorts, or A/B variants. The package decomposes an overall difference into contributions from individual subgroups using a procedure inspired by the Ship of Theseus: starting from Group A, it replaces subgroup data with the...
Go to contribution page
83. Visualisation of results from large simulation studies: introducing the multi-performance plot and its Shiny app

Dr Wang Pok Lo (University of Oxford)

08/07/2026, 16:20

Poster

Poster

Simulation studies allow comparisons of performance between statistical methods to be made. Tables are traditionally used to report study results, which are usually performance measures such as bias, empirical standard error, average model standard error and coverage. In large simulation studies, these tables of results may become too large for patterns to be readily identified. This occurs...
Go to contribution page
233. The Work Behind the Work: Sustaining R Through Community

Kari L. Jordan

09/07/2026, 09:00

Keynote

Abstract
Every R script, package, and analysis rests on a foundation of community labor that often goes unseen. While R is widely celebrated for its technical power, its longevity depends just as much on the people who teach, maintain, translate, mentor, and organize around it.
In this keynote, I will draw on my leadership at The Carpentries to share stories and insights about building and...
Go to contribution page
23. Good Programming Practices, Design and Agile in the Era of AI-Generated Code

Maciej Nasiński (UCB & University of Warsaw)

09/07/2026, 10:30

Talks (15-20 minutes)

Talks

By 2026, the era of vibe coding has made rapid prototyping effortless; however, it has also highlighted a significant gap between demos and production-quality systems. While AI agents can simulate agile processes, they often optimise for speed at the expense of architectural integrity, security, and long-term technical debt. Anyone can generate code, but not everyone can manage a project. The...
Go to contribution page
48. Moving towards R package 'Matrix' version 2.0-0

Mikael Jagan

09/07/2026, 10:30

Talks (15-20 minutes)

Talks

We present the history, structure, and design philosophy of the "recommended" R package Matrix, which extends R with classes and methods for structured matrices having sparse or dense storage. We review recent and ongoing development in Matrix, covering matrices with integer or complex data, matrix factorizations, and improved documentation. Finally, as the number of reverse dependencies of...
Go to contribution page
234. Seeing Groups, Not Just Gradients: Supervised Class Differentiation to Better Interpret Beta Diversity

Dr Nicholas Spyrison (Unaffiliated (employed by IFF))

09/07/2026, 10:30

Talks (15-20 minutes)

Virtual Presentation Room

Ecological community data are inherently multivariate, and beta diversity is typically explored through unsupervised ordination (e.g., PCoA, NMDS). While these methods excel at revealing gradients of variance, they ignore a priori hypotheses about group structure. This talk introduces constrained ordination, specifically Canonical Analysis of Principal Coordinates (CAP) and Linear Discriminant...
Go to contribution page
104. Travel Paths: Bridging the usability gap in open source animal movement algorithms

Mx Katrina Brock (Max Planck Institute of Animal Behavior)

09/07/2026, 10:30

Talks (15-20 minutes)

Talks

The behavioral ecology literature offers a rich library of approaches for finding patterns in animal movement data. While many of the biologists developing these algorithms publish their code, it is rarely optimized for reuse. Even well-designed packages with similar workflows have different interfaces that potential users need to learn one by one. By wrapping these algorithms in a standardized...
Go to contribution page
171. Advanced Machine Learning, Visualization, and Agentic AI using rtemis

Efstathios Gennatas (UCSF)

09/07/2026, 10:50

Talks (15-20 minutes)

Talks

Basic research and clinical medicine are increasingly capitalizing on data-driven approaches to derive insights into disease pathophysiology and discover new therapeutic targets. While advanced algorithms are readily available, their application requires a combination of domain, quantitative, and technical expertise, leaving them out of reach for many domain expert researchers and clinicians....
Go to contribution page
120. Juggling with S7

Ella Kaye (University of Birmingham)

09/07/2026, 10:50

Talks (15-20 minutes)

Talks

The throw sequence "423" is a valid pattern for juggling with three balls, but "432" will result in collisions and dropped balls. How can you tell? All juggling patterns can be described in a notation called siteswap, and siteswap sequences can be mathematically validated and visualised.
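
The validity check mentioned above can be sketched in a few lines. This is a minimal illustration of the standard rule for single-digit ("vanilla") siteswap, not the jugglr implementation; the function names `is_valid_siteswap` and `ball_count` are hypothetical, and Python is used here only for illustration. A pattern of period n is valid when the throw made on beat i lands on beat (i + throw) mod n and no two throws land on the same beat; the number of balls is then the average throw value.

```python
def parse_throws(pattern: str) -> list[int]:
    """Turn a string such as "423" into a list of throw heights [4, 2, 3]."""
    if not pattern or not pattern.isdigit():
        raise ValueError("pattern must be a non-empty string of digits 0-9")
    return [int(ch) for ch in pattern]


def is_valid_siteswap(pattern: str) -> bool:
    """Check that no two throws land on the same beat (modulo the period)."""
    throws = parse_throws(pattern)
    period = len(throws)
    # Beat on which each throw lands, reduced modulo the period.
    landings = {(beat + height) % period for beat, height in enumerate(throws)}
    # Valid only if every beat receives exactly one landing ball.
    return len(landings) == period


def ball_count(pattern: str) -> int:
    """Number of balls in a valid pattern: the average throw height."""
    if not is_valid_siteswap(pattern):
        raise ValueError(f"{pattern!r} is not a valid siteswap")
    throws = parse_throws(pattern)
    return sum(throws) // len(throws)


print(is_valid_siteswap("423"))  # landings are beats 1, 0, 2: all distinct
print(is_valid_siteswap("432"))  # throws 4 and 3 both land on beat 1
print(ball_count("423"))
```

For "423" the landing beats are 1, 0 and 2, so every beat receives one ball and the average throw (9 / 3) gives three balls. For "432" the first two throws both land on beat 1, which is the collision the abstract describes.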

In this talk, I introduce jugglr, an R package for working with siteswap sequences. It validates...
Go to contribution page
60. Mapping Climate Stories with {sf}

Renata Hirota

09/07/2026, 10:50

Talks (15-20 minutes)

Virtual Presentation Room

Climate data has coordinates, but also powerful stories hiding in plain sight. In this talk, I’ll show the reasons why I fell in love with the R package {sf} while covering climate change in the Amazon as a freelance journalist. Every Last Drop is a project about oil exploration in the Amazon that showcases that spatial analysis doesn't...
Go to contribution page
152. The TDAverse: Modular, interoperable, and extensible topological data analysis in R

Aymeric Stamm (Department of Mathematics Jean Leray, UMR CNRS 6629, Nantes University)

09/07/2026, 10:50

Talks (15-20 minutes)

Talks

Topological data analysis (TDA) is an emerging area of statistical research grounded in topology, intersecting with exploratory analysis, statistical inference, and machine learning. It is therefore important for R users to have access to comprehensive and reliable TDA tools.

Published R packages for TDA fall into three categories: First, {TDA} and {rgudhi} interface with comprehensive...
Go to contribution page
147. Metamorphoses: Transforming CVXR to S7 with an AI Agent

Balasubramanian Narasimhan (Stanford University)

09/07/2026, 11:10

Talks (15-20 minutes)

Talks

CVXR is the R implementation of CVXPY, a widely-used disciplined convex optimization framework. Maintained by two developers, the S4-based CVXR 1.0 had fallen significantly behind CVXPY in features. We report on a complete rewrite using S7, now on CRAN, that targets the current version of CVXPY. The new version is 4-5x faster than the old CVXR and the...
Go to contribution page
210. Playful Teaching of Simulation Models: From Monolithic Shiny Apps to Quarto Dashboards and webR

Thomas Petzoldt (TUD - Dresden University of Technology)

09/07/2026, 11:10

Talks (15-20 minutes)

Talks

Quantitative modeling is essential in the life and environmental sciences, yet students often face significant barriers due to "math anxiety" and programming complexity. While differential equations are often perceived as dry or difficult, interactive simulations offer a playful entry point that fosters intuitive understanding. However, traditional "downloadable models" often suffer from...
Go to contribution page
192. The tractoverse: A unified collection of packages for microstructure-augmented connectivity analysis in R

Aymeric Stamm (Department of Mathematics Jean Leray, UMR CNRS 6629, Nantes University)

09/07/2026, 11:10

Talks (15-20 minutes)

Talks

Diffusion magnetic resonance imaging (MRI) is a non-invasive imaging technique that allows us to probe the microstructure in the brain at a mesoscopic scale by making the MR signal sensitive to the diffusion of water molecules in the brain, which is restricted or hindered by cellular structures such as axons or glial cells. Diffusion MRI suffers from a poor spatial resolution, which yields the...
Go to contribution page
79. Visual Explanations of XAI Explainers, to Gain Insight into Predictions from High-Dimensional Models

Janith Wanniarachchi (Monash University)

09/07/2026, 11:10

Talks (15-20 minutes)

Virtual Presentation Room

Understanding the behaviour of complex machine learning models has become a challenge in the modern day. Explainable AI (XAI) methods were introduced to provide insights into model predictions; however, explaining these explanations can be difficult without proper visualisation methods. In addition, settling the disagreements between these explainers can be difficult based purely on numerical...
Go to contribution page
78. Gamifying data visualisation: Teaching ggplot2 through competitive code golf

Michael Lydeamore (Department of Econometrics and Business Statistics, Monash University, Victoria, Australia)

09/07/2026, 11:30

Talks (15-20 minutes)

Virtual Presentation Room

Code golf—writing the shortest possible code to solve a problem—has emerged as an engaging method for teaching programming fundamentals. Its competitive, game-like structure fosters student motivation and encourages self-directed learning.

Inspired by the success of CSSBattle, which attracts thousands of daily users with CSS challenges, I present ggplot battles: a new browser-based platform...
Go to contribution page
110. On Balancing Numeric and Categorical Variables for Clustering

Gero Szepannek (Stralsund University of Applied Sciences)

09/07/2026, 11:30

Talks (15-20 minutes)

Talks

The package clustMixType [3] is one of the most popular packages for clustering mixed-type data. Nonetheless, an open issue, not only for clustering mixed-type data but also for clustering in general, is the appropriate weighting of the variables. In Huang’s original paper [1] as well as in the clustMixType package, only heuristics are given for this purpose. In the presentation it will be...
151. Semantic vectors for safer statistics

Mitchell O'Hara-Wild (Monash University)

09/07/2026, 11:30

Talks (15-20 minutes)

Talks

Statistical analysis on temporal, spatial, graph, and probabilistic data is error-prone when the data types lack intrinsic structure. Outputs from models typically return these composite data types separately, requiring the user to assemble and apply the results correctly. This reduces the accessibility of statistics and results in error-prone analysis. Representing these data types using...
2. Teaching Reproducibility by Design: An End-to-End R Workflow Using Quarto, Open Data, and Package Development

Christian Martinez (CUNY)

09/07/2026, 11:30

Talks (15-20 minutes)

Talks

Reproducibility in R is often taught as a final requirement rather than as a workflow that evolves over time. In many research methods courses, students write one-off scripts against artificial datasets, submit them, and never return to their code. This talk presents an alternative: an end-to-end, R-native ecosystem designed to move students from code users to reproducible researchers—and...
195. Exploring Global Stock Factors with R

Dr Prince Sharma (Indian Institute of Information Technology Una), Rituraj Singh (Indian Institute of Information Technology Una)

09/07/2026, 11:50

Lightning Talk (5 minutes)

Virtual Presentation Room

Author Information
Name: Rituraj Singh
Affiliation: Indian Institute of Information Technology Una, Himachal Pradesh, India
Programme: B.Tech Computer Science Engineering (Data Science)
Email: 24519@iiitu.ac.in

Title
Do Some Stocks Always Beat Others? Exploring Global Stock Factors with R

Primary Topic
Finance and Economics Applications of R / Data Analysis and...
178. GO-a-GO: Gene Ontology enrichment analysis of gene pairs

Aleksander Jankowski (University of Warsaw)

09/07/2026, 11:50

Lightning Talk (5 minutes)

Lightning Talks

The identification of overrepresented Gene Ontology (GO) terms in a set of genes is a standard approach to obtain functional associations, e.g. to characterize the set of differentially expressed genes between treatment and control samples. Here, we present the R package GO-a-GO that annotates Gene Ontology terms that are enriched in a given set of gene pairs. This provides the opportunity to...
139. Making R a First-Class Environment for LLMs

Tomasz Kalinowski (Posit PBC)

09/07/2026, 11:50

Lightning Talk (5 minutes)

Lightning Talks

R is an interactive environment. Working effectively with R means being able to interact with a live session to inspect objects, view and iterate on plots, access help, and step through running code in the debugger. Making LLMs effective in R therefore means more than giving them a way to execute code: it means exposing R’s interactive affordances in a form the model can use.

This talk...
46. Teaching R with R. How to get started doing lectures with Quarto and reveal.js.

Dr Håvard R. Karlsen (NTNU)

09/07/2026, 11:50

Lightning Talk (5 minutes)

Lightning Talks

Traditional presentation software such as PowerPoint or Keynote is commonly used for teaching, but it is not ideal for displaying and running code, as this involves a lot of copying and pasting. Presenting from Quarto documents makes it much easier to incorporate code and output in the presentation. But it can be daunting, as it involves learning a new framework for creating and presenting slides....
84. Ensemble Models for missing multi-omics data

Jagoda Głowacka-Walas

09/07/2026, 11:55

Lightning Talk (5 minutes)

Virtual Presentation Room

Missing data is a common challenge in data analysis, particularly in multi-omics studies, where heterogeneous data sources and technical limitations often result in incomplete measurements. In these situations, predictive models may fail when some required variables are missing.

This presentation demonstrates ensemble learning strategies designed to improve prediction despite incomplete...
184. Evaluate Untrusted R Code Locally Isolated in a webR Sandbox?

Henrik Bengtsson (Futureverse.org, R Foundation, R Consortium, Bioconductor, R anno 2000, University of California San Francisco (UCSF))

09/07/2026, 11:55

Lightning Talk (5 minutes)

Lightning Talks

The ability to execute arbitrary R code securely is becoming increasingly critical, e.g., for use cases ranging from AI agents executing LLM-generated code to peer-to-peer (P2P) compute clusters. Sandboxing techniques such as virtual machines and Linux containers are commonly used to isolate the host machine from untrusted code. Because these technologies can be complicated to set up, and...
97. Teaching R for Applied Economics: From Zero to Causal Inference

Marta Bernardi (TU Dresden), Sarah Listabarth (TU Dresden)

09/07/2026, 11:55

Lightning Talk (5 minutes)

Lightning Talks

Our first session doesn't start with R. It starts with asking students to unzip a folder. Every year, some of them can't.
This is the reality of teaching empirical economics today. Students arrive having grown up on tablets and smartphones, fluent in apps but lost in a file system. Getting them to difference-in-differences feels, at first, impossibly far away.
And yet, that's exactly what we...
