useR! 2026

Name: useR! 2026
Start: 2026-07-06T08:00:00+02:00
End: 2026-07-09T19:00:00+02:00
Location: No location set

6–9 Jul 2026

Europe/Warsaw timezone

paperboy - A Collection of News Media Scrapers in R

7 Jul 2026, 11:55

Aula VI (SGH Warsaw School of Economics)

Lightning Talk (5 minutes)

Building tools for reproducible research

Lightning Talks

Speaker: Sina Chen (GESIS Leibniz Institute for the Social Sciences)

The R package paperboy is designed as a repository of webscraping scripts for news media sites, with advanced features for quick data retrieval, even for content behind log-ins or anti-scraping measures. Many data scientists and researchers write their own code when they need to retrieve news media content from websites. At the end of a research project, this code often gathers digital dust on researchers' hard drives instead of being made public for others to use. The package offers authors of webscraping scripts a clear path to publish their code and earn co-authorship on the package, while promising users news media data from many websites in a consistent format. With 179 sites covered as of today and a default scraper that often works well enough, paperboy can already support a wide range of research projects.

Additional Material or Paper

A preprint can be found here: https://osf.io/preprints/socarxiv/hu6qw_v1.
A tool demo was given at ICA 2025: https://github.com/JBGruber/ica25_tool-demos.

If you used AI tools or services to support the preparation of this submission, please state the name and reason for using each of them.

No AI tools/services were used.

Keywords	data mining, open data, news media data, webscraping
Virtual Option	This submission is for onsite presentation only
Video Recording	Video sharing is fine
The author(s) agree(s) to take responsibility and be accountable for the contents of the submission and is/are authorized to present it.	Confirm
Interested in serving as reviewer?	sina.chen@gesis.org

Author: Dr Johannes Gruber (GESIS Leibniz Institute for the Social Sciences)

Co-author: Sina Chen (GESIS Leibniz Institute for the Social Sciences)

Presentation materials: useR_talk_paperboy.pdf, useR_talk_paperboy.pptx
