Speaker
Description
With over 23,000 packages listed on CRAN alone, the R ecosystem provides a plethora of general-purpose and domain-specific packages. While packages must meet a strict quality standard to be accepted on CRAN, little information is available about the evolution and current state of the ecosystem.
With the crawlR project, we statically analyzed all versions of all packages available on CRAN to gain insights into the evolution of the ecosystem and various aspects of package development and maintenance such as the use of semantic versioning, the prevalence of dead code, uncalled functions, and more.
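To illustrate what a semantic-versioning check between two releases could look like, here is a minimal sketch. It is not the crawlR implementation: the function names `parse_version`, `classify_bump`, and `violates_semver` are hypothetical, and the rule that removing an exported function requires a major bump is an assumption chosen for the example.

```python
import re


def parse_version(version: str) -> tuple[int, ...]:
    """Split an R-style version string such as '1.2-3' into integers."""
    parts = [int(p) for p in re.split(r"[.-]", version)]
    # Pad to three components so that '1.2' compares equal to '1.2.0'.
    return tuple(parts + [0] * (3 - len(parts)))


def classify_bump(old: str, new: str) -> str:
    """Classify the change from `old` to `new` as a version bump kind."""
    o, n = parse_version(old), parse_version(new)
    if n == o:
        return "none"
    if n < o:
        return "downgrade"
    if n[0] > o[0]:
        return "major"
    if n[1] > o[1]:
        return "minor"
    return "patch"


def violates_semver(old: str, new: str, removed_exports: set[str]) -> bool:
    """Assumed rule: removing an exported function needs a major bump."""
    return bool(removed_exports) and classify_bump(old, new) != "major"


if __name__ == "__main__":
    # Hypothetical release pair: a minor bump that drops an exported function.
    print(classify_bump("1.2.3", "1.3.0"))
    print(violates_semver("1.2.3", "1.3.0", {"old_api"}))
```

The set of removed exports would come from comparing the statically extracted functions of two consecutive versions; how that comparison is done in practice is outside this sketch.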
To achieve this, we first analyze each package version individually using flowR, a static program analysis framework for R, to obtain a package-wide dataflow graph alongside call and control-flow information. We then use this information to extract a wide range of aspects of each package, including all of its functions and objects, such as transitive package dependencies, call graphs, usage of language features, and unreachable code. In total, this yields roughly 80 GB of data for around 170,000 package versions.
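Two of the extracted aspects, uncalled functions and transitive package dependencies, can both be phrased as reachability queries over a graph. The sketch below shows the idea on plain Python dictionaries. It does not use the flowR API or the crawlR data format: the graph encoding, the function names, and the assumption that exported functions are the only entry points are all hypothetical simplifications.

```python
from collections import deque


def reachable(graph: dict[str, set[str]], roots: set[str]) -> set[str]:
    """Return all nodes reachable from `roots` via breadth-first search."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for successor in graph.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def uncalled_functions(call_graph: dict[str, set[str]],
                       exported: set[str]) -> set[str]:
    """Functions defined in the package but unreachable from any export."""
    return set(call_graph) - reachable(call_graph, exported)


def transitive_dependencies(dep_graph: dict[str, set[str]],
                            package: str) -> set[str]:
    """All direct and indirect dependencies of `package`."""
    return reachable(dep_graph, {package}) - {package}


if __name__ == "__main__":
    # Hypothetical call graph: caller -> set of callees.
    calls = {
        "fit": {"validate", "solve"},
        "solve": {"step"},
        "step": set(),
        "validate": set(),
        "old_helper": {"step"},
        "debug_dump": set(),
    }
    print(sorted(uncalled_functions(calls, exported={"fit"})))

    # Hypothetical dependency graph: package -> direct dependencies.
    deps = {"pkgA": {"pkgB", "pkgC"}, "pkgB": {"pkgD"}}
    print(sorted(transitive_dependencies(deps, "pkgA")))
```

A real analysis of R code also has to account for dynamic calls, S3/S4 dispatch, and functions passed as values, which a purely syntactic call graph like this one would miss.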
In this talk, we present our methodology alongside the architecture of crawlR, explore the results, and provide a summary of our findings and insights obtained so far.
We also highlight open challenges for the ecosystem that we identified through our analysis and discuss interesting queries that we want to explore in the future with the help of the community.
If you used AI tools or services to support the preparation of this submission, please state the name and reason for using each of them.
No AI tools/services were used.
Additional Material or Paper
We will share material alongside the presentation; it has not been presented before.
| Keywords: Please list up to 5 keywords to help us find the right session for your contribution. | program analysis, ecosystem evolution, CRAN |
|---|---|
| Virtual Option | This submission is for onsite presentation primarily, but I would also like it to be considered for pre-recorded virtual presentation if I don't get an onsite slot |
| Material License | CC-BY-SA 4.0 |
| Video Recording | Video sharing is fine |
| The author(s) agree(s) to take responsibility and be accountable for the contents of the submission and is/are authorized to present it. | Confirm |
| Interested in serving as reviewer? | florian.sihler@uni-ulm.de |