6–9 Jul 2026
Europe/Warsaw timezone

Teaching Reproducibility by Design: An End-to-End R Workflow Using Quarto, Open Data, and Package Development

9 Jul 2026, 10:30
20m
Talks (15-20 minutes)

Speaker

Christian Martinez (CUNY)

Description

Reproducibility in R is often taught as a final requirement rather than as a workflow that evolves over time. In many research methods courses, students write one-off scripts against artificial datasets, submit them, and never return to their code. This talk presents an alternative: an end-to-end, R-native ecosystem designed to move students from code users to reproducible researchers and, ultimately, to contributors.

I describe a multi-semester pipeline built around R Markdown, Bookdown, and Quarto in which students repeatedly revisit and refactor their own work. Homework assignments are written as R Markdown files and later consolidated into individual Bookdown portfolios, requiring students to debug, standardize, and improve earlier code. In subsequent iterations, both student portfolios and final research projects are migrated to Quarto, positioning modern publishing as a continuation of reproducible practice rather than a separate skill, and reinforcing the idea that reproducible research is iterative rather than disposable.

This workflow culminates in three interconnected open-access books: an instructor-authored reproducible research textbook, a cohort-level Quarto book containing student research chapters built on NYC Open Data, and individual student portfolio books that students can continue to update beyond the course.

To ground this work in real-world analysis, students conduct original research using NYC Open Data. Because API complexity and data access friction proved to be a major barrier, I developed nycOpenData, a CRAN package that enables any R user to access datasets from the NYC Open Data portal without writing custom API calls. This reduces infrastructure overhead and allows students to focus on research questions rather than data acquisition.
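For context, the kind of custom API call that students would otherwise have to write can be sketched as follows. This is a minimal illustration, not the nycOpenData implementation: the helper names `build_soda_url()` and `fetch_nyc_dataset()` are hypothetical, and the sketch assumes the portal's public Socrata (SODA) JSON endpoint with its `$limit` and `$where` query parameters. The dataset ID `erm2-nwe9` (311 Service Requests) is used only as an example.

```r
# Hypothetical helpers for illustration; these are NOT functions from nycOpenData.
# Assumes the Socrata (SODA) endpoint pattern used by the NYC Open Data portal.

# Build a query URL for one dataset, with an optional row filter.
build_soda_url <- function(dataset_id, limit = 100, where = NULL) {
  base <- sprintf("https://data.cityofnewyork.us/resource/%s.json", dataset_id)
  # as.integer() avoids scientific notation such as "1e+05" in the URL
  params <- c(`$limit` = as.character(as.integer(limit)))
  if (!is.null(where)) {
    params <- c(params, `$where` = where)
  }
  # Percent-encode parameter values so spaces and quotes survive the request
  values <- vapply(params, utils::URLencode, character(1), reserved = TRUE)
  paste0(base, "?", paste0(names(params), "=", values, collapse = "&"))
}

# Download the matching rows and parse the JSON response into a data frame.
# Requires the jsonlite package and network access.
fetch_nyc_dataset <- function(dataset_id, limit = 100, where = NULL) {
  jsonlite::fromJSON(build_soda_url(dataset_id, limit, where))
}

# Example usage (not run here because it needs network access):
# complaints <- fetch_nyc_dataset("erm2-nwe9", limit = 5,
#                                 where = "borough = 'BROOKLYN'")
```

Even this small sketch requires knowing dataset IDs, query syntax, and URL encoding, which is the kind of access friction a dedicated package can hide from students.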

As a full-circle outcome, advanced students now contribute their own functions back to the same package they used for their analyses, gaining experience with package development, GitHub workflows, and collaborative software practices.

The talk concludes with design lessons, technical tradeoffs, and a reproducible template that R users can adapt for teaching, onboarding, or collaborative research across disciplines.

Additional Material or Paper

The following links provide examples of the projects described in this abstract:

• nycOpenData R package (CRAN + documentation): https://martinezc1.github.io/nycOpenData/
• NYC Open Data Student Gallery (cohort-level Quarto book): https://martinezc1-nyc-open-data-student-gallery.share.connect.posit.cloud/
• Reproducible Research Using R (open-access textbook): https://martinezc1-reproducible-research-using-r.share.connect.posit.cloud/
• Example student portfolio (Bookdown): https://bookdown.org/jdratfield38/RClassAnalyticsPortfolio/

If you used AI tools or services to support the preparation of this submission, please state the name and reason for using each of them.

When creating my learning materials, I would often draft them first and then ask ChatGPT to act as if it were one of my students.

Keywords: Please list up to 5 keywords to help us find the right session for your contribution. Reproducible research, Quarto, Open Data, R package development, Teaching R
Virtual Option: This submission is primarily for onsite presentation, but I would also like it to be considered for pre-recorded virtual presentation if I don't get an onsite slot.
Video Recording: Video sharing is fine.
The author(s) agree(s) to take responsibility and be accountable for the contents of the submission and is/are authorized to present it: Confirm
Interested in serving as reviewer? c.martinez0@outlook.com
