Assessing the reliability of an R package

In the most recent rOpenSci community call, Juliane Manitz presented on standards for statistical software in the pharmaceutical industry. One of her slides in particular was about how the pharmaceutical industry assesses the reliability of a package. This inspired me to write a little bit about how I assess the reliability of an R package that’s new to me, especially if I have the choice of two or more packages that do similar things.

I think the most common reasons people in my field (ecology) choose a particular R package or function to solve a problem are:

  • It’s what they were exposed to in a course or training
  • It’s widely used in their field (e.g. cited often)
  • It’s written or maintained by a “big-shot” in the field

After gaining experience in R package development, I’ve begun to place less importance on these factors and more importance on other factors that I’ll discuss briefly below.

Active Development

It’s really important to me that a package I’m using is being actively developed. There are a number of reasons to choose a package with active development over one that is not updated often. One obvious reason is that if you encounter bugs, you can be more confident that they’ll get fixed if you report them. Even mature, well-established packages need active maintenance to ensure they remain functional as their dependencies get updated. I also like choosing packages with active development because I may have an opportunity to help improve the package through my feedback and suggestions.

So, how to assess if a package is being actively developed?

  1. Check the CRAN release date. How recently was the latest stable version published to CRAN?
  2. Check bug reports. How many issues are open and how many have been closed? How many really old bug reports are there, if any? Has the package author at least responded to bug reports made over a month ago?
  3. Check GitHub development activity. Have there been somewhat recent commits or pull requests? Is there a NEWS file documenting changes in the development version?
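
The first check can be done from within R for a package you already have installed. Below is a minimal sketch, assuming the package was installed from CRAN so that its DESCRIPTION file carries a "Date/Publication" field. The helper names days_since_release() and release_age() are made up for illustration and are not part of any package.

```r
# Days elapsed since a CRAN release, given the "Date/Publication" value
# from a DESCRIPTION file (e.g. "2021-01-15 10:00:02 UTC").
# Hypothetical helper, written only for this example.
days_since_release <- function(published, today = Sys.Date()) {
  # Keep only the YYYY-MM-DD part and drop the time stamp
  release_date <- as.Date(substr(published, 1, 10))
  as.numeric(today - release_date)
}

# Look up the field for an installed package. Returns NA when the field
# is missing (e.g. base packages or packages installed from GitHub).
release_age <- function(pkg, today = Sys.Date()) {
  published <- utils::packageDescription(pkg, fields = "Date/Publication")
  if (is.na(published)) return(NA_real_)
  days_since_release(published, today)
}

# Worked example with a fixed "today" so the result is reproducible
days_since_release("2021-01-15 10:00:02 UTC", today = as.Date("2021-03-01"))
#> [1] 45

# Usage on an installed package (assumes tidyr was installed from CRAN):
# release_age("tidyr")
```

What counts as "recent" is a judgment call, and a small, stable package can go a long time between releases. A number like this is best read alongside the issue tracker and commit history rather than on its own.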

Development Lifecycle

Where an R package or function is in its lifecycle can help me make a decision about whether to use it over an alternative. You might have noticed lifecycle badges in tidyverse package help files (for example, the “superseded” badge on the help file for gather() from the tidyr package) or on README files on GitHub (e.g. the “maturing” badge on the ipmr package I’ve been learning lately).

[Image: two badges reading 'lifecycle: superseded' and 'lifecycle: maturing']
Examples of lifecycle badges

I generally want to avoid learning anything that has been superseded (no longer being developed) and I definitely don’t want to learn anything that’s been deprecated. Instead I focus my efforts on learning functions or packages that are in the maturing or stable stages. If a package is still in the experimental stage, I may choose to learn it, but probably only if I’m interested in contributing to the package development and it’s really the best option for what I’m trying to do.

These specific lifecycle stages are not always used in all packages, and sometimes the stage must be sussed out through some exploration. For example, if you check the website for the raster package, you’ll see that it has been superseded by the terra package although the lifecycle badges are not used by these packages.

Testing

Unit tests are code written to check that functions behave correctly under a range of possible inputs from users. Tests can help catch bugs, check that users get informative error messages, and check for correctness. Most ecologists probably don’t know to look for tests when choosing an R package. I think there is an assumption that if it’s on CRAN, and has been cited before, then it’s legit. That simply is not true. For example, the spi package was on CRAN (it’s now been archived) and had been cited in papers, but didn’t calculate anything even close to what it claimed to calculate. If the spi package had unit tests, perhaps this error would have been caught before it went to CRAN.

The easiest way to check if a package has tests is to find the GitHub repository (often linked to from the CRAN page). If there is a “tests” folder, and it has some code in it, then that’s a start. You can also look for a badge in the README that describes the test coverage (how much of the package code is tested), such as a Codecov badge. If test coverage is > 75% you’re in really good shape.

Example of a Codecov badge
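
If you’ve never seen a unit test, here is a rough sketch of what one looks like. The function z_score() is a hypothetical example invented for this post, not code from any real package. Most R packages write their tests with the testthat package; this sketch sticks to base R so that it runs without installing anything.

```r
# Hypothetical function under test: standardize a numeric vector
z_score <- function(x) {
  if (!is.numeric(x)) {
    stop("`x` must be numeric.", call. = FALSE)
  }
  (x - mean(x)) / sd(x)
}

# Test 1: correctness on an input whose answer can be worked out by hand.
# c(1, 2, 3) has mean 2 and standard deviation 1.
stopifnot(isTRUE(all.equal(z_score(c(1, 2, 3)), c(-1, 0, 1))))

# Test 2: bad input produces the informative error message
msg <- tryCatch(z_score("a"), error = function(e) conditionMessage(e))
stopifnot(identical(msg, "`x` must be numeric."))
```

Both checks pass silently. If a later change to z_score() broke either behavior, the corresponding stopifnot() would throw an error, which is exactly the kind of signal a test suite is there to give.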

As Noam Ross pointed out in the collaborative notes for the call, having a big user base can catch bugs without formal tests, but “tests are the ‘ratchet’, though—they make sure you don’t go backwards, introducing old bugs again when you fix new ones”.

Postdoctoral Researcher

I’m a postdoctoral researcher in Emilio Bruna’s lab at University of Florida working on the effects of drought and habitat fragmentation on a tropical plant. I’m interested in the mechanisms of plant responses to stress and their consequences for natural and agricultural ecosystems.
