Checks

Learning Goal

Add tests and checks to ensure functions work and error when expected.

Checks in functions ensure that inputs and arguments are as expected, handle errors, and provide informative error messages to users. In this workshop we focus on writing checks with the function stopifnot() because it is simple and succinct. Since R 4.0.0, stopifnot() also lets you supply a more informative error message, which helps us communicate clearly with our users (and ourselves).

stopifnot() takes a logical statement that you expect to be true. If the statement is not true, stopifnot() throws an error.
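As a minimal sketch of this behaviour (the object `x` and the string returned by the error handler are only for illustration), a check that holds passes silently, while a check that fails throws an error that we catch here with tryCatch():

```r
x <- 10

# The statement is TRUE, so stopifnot() passes silently
stopifnot(is.numeric(x))

# The statement is FALSE, so stopifnot() throws an error;
# tryCatch() catches it so the script can continue
result <- tryCatch(
  stopifnot(is.character(x)),
  error = function(e) "error thrown"
)
result
```

In a function you would normally let the error stop execution rather than catch it.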

To build logical statements, you may find the following list of functions useful:

  • class checks
    • is.integer, is.numeric, is.data.frame, is.character
  • nulls, NAs, NaNs
    • is.null, is.na, is.nan
  • dimensions
    • length, nrow, ncol, dim
  • equality, comparisons
    • == (equals)
    • >, <, >=, <= (greater than, less than, greater than or equal to, less than or equal to)
    • identical
  • negation
    • !
  • directories, files
    • file.exists
    • dir
    • basename, dirname
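
A few of the functions listed above can be sketched as follows. The vector `fruits` is a hypothetical example object, not part of the workshop data:

```r
fruits <- c("apple", "pear", NA)

is_chr    <- is.character(fruits)      # class check
not_null  <- !is.null(fruits)          # negation of a NULL check
has_na    <- any(is.na(fruits))        # is.na() returns one value per element
len_three <- length(fruits) == 3       # dimension check with ==
same      <- identical(fruits, c("apple", "pear", NA))
dir_found <- file.exists(tempdir())    # the session's temporary directory

c(is_chr, not_null, has_na, len_three, same, dir_found)
```

Note that is.na() is vectorised, so it is wrapped in any() here to get a single TRUE or FALSE.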

Logical statements

Exercise: function checks

Objective
  • Practice writing logical statements.

Instruction: write a series of logical statements for the following objects, then consult the solution to check your work. Focus only on the logical statements for now; we’ll use stopifnot() in the following exercises.

Number

  • check if x is a numeric, integer, double
  • check if x is of length 1
  • check if x is greater than 0
x <- 10
Show solution
is.numeric(x)
[1] TRUE
is.integer(x)
[1] FALSE
is.double(x)
[1] TRUE
length(x) == 1
[1] TRUE
x > 0
[1] TRUE

data.frame

  • check if DF is a data.frame, list, matrix
  • check if DF has two columns and three rows
  • check if DF’s numbers column is an integer, double, numeric
DF <- data.frame(colors = c('red', 'green', 'blue'), numbers = c(42.1, 2L, 10))
Show solution
is.data.frame(DF)
[1] TRUE
is.list(DF)
[1] TRUE
is.matrix(DF)
[1] FALSE
ncol(DF) == 2
[1] TRUE
nrow(DF) == 3
[1] TRUE
is.integer(DF$numbers)
[1] FALSE
is.double(DF$numbers)
[1] TRUE
is.numeric(DF$numbers)
[1] TRUE

Filters

Exercise: filter_islands()

First, write a function to filter a data.frame, providing the column name for the data to be filtered and the values to filter on. Use the example data that we prepared with our prepare_csv function.

Follow the steps from our approach to developing functions:

  1. Set up the function
    • Make a function script in the R/ directory named filter_islands.R
    • Write the function’s skeleton (name, arguments, curly braces)
  2. Set up the test script
    • Make a corresponding test script in the tests/ directory named test_filter_islands.R.
    • Load any required packages (library(package))
    • Source the function (source('R/function.R'))
    • Load example data and/or arguments for the function

Add a new section ‘Development’ in the test script (tests/test_filter_islands.R) to develop the body of your function. Since we have already built a function for preparing CSV files, use it in your test script, along with your choice of base R/data.table or dplyr functions. Add the code to your function filter_islands() and test!

path_counts <- file.path('raw-data', 'adelie-adult-chick-counts.csv')
prep_counts <- prepare_csv(path_counts)
Hints: base R/data.table

This function has one step: taking an input data.frame and filtering across a single column.

The function’s arguments should include the data.frame, as well as the column name for the filtering and the filter value.

Select the data.frame’s column that you want to filter on by passing the function’s filter_col argument with the [[ syntax, e.g. DF[[filter_col]]
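
To illustrate the [[ syntax on its own, here is a minimal sketch with hypothetical toy data (`toy` is not the workshop's penguin counts). It shows only the filtering step, not the complete function:

```r
# Hypothetical toy data for illustration
toy <- data.frame(island = c("A", "B", "A"), count = c(5, 8, 2))

filter_col <- "island"   # quoted column name
filter_value <- "A"

# toy[[filter_col]] extracts the column as a vector, so the comparison
# returns one TRUE/FALSE per row
keep <- toy[[filter_col]] == filter_value

# Use the logical vector to keep the matching rows
filtered <- toy[keep, ]
filtered
```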
Hints: dplyr

This function has one step: taking an input data.frame and filtering across a single column.

The function’s arguments should include the data.frame, as well as the column name for the filtering and the filter value.

Pass the data.frame to dplyr::filter’s .data argument, then use the {{ }} “embrace operator” to wrap the filter_col and filter_value arguments (unquoted column names).

See the example with the “embrace operator”, and more details here: https://dplyr.tidyverse.org/articles/in-packages.html
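
A minimal sketch of the embrace operator, assuming dplyr is installed. The helper `filter_rows()` and the data `toy` are hypothetical names used only for illustration. Here only filter_col is embraced, since filter_value is an ordinary value rather than a column name:

```r
# Hypothetical helper, for illustration only
filter_rows <- function(DF, filter_col, filter_value) {
  # {{ filter_col }} forwards the unquoted column name to dplyr::filter()
  dplyr::filter(DF, {{ filter_col }} == filter_value)
}

# Hypothetical toy data
toy <- data.frame(island = c("A", "B", "A"), count = c(5, 8, 2))

# Guarded so the sketch only runs when dplyr is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  kept <- filter_rows(toy, island, "A")   # island is passed unquoted
  print(kept)
}
```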


Writing Checks

Exercise: writing checks for filter_islands()

NoteObjective

Write and test a complete function.

Instruction: add the following checks to your filter_islands() function and test them in your test script (tests/test_filter_islands.R).

  • check if the column provided to the filter_col argument exists in the data.frame (quoted column name arguments)
  • check that the values provided to the filter_value argument are present in the filter_col column
  • check that the output is a data.frame
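
For quoted column names, the logical statements behind these checks could be built with %in%. This is a sketch on hypothetical toy data (`toy`), shown outside of any function and without stopifnot():

```r
# Hypothetical toy data for illustration
toy <- data.frame(island = c("A", "B", "A"), count = c(5, 8, 2))

filter_col <- "island"
filter_value <- "A"

# Does the column exist in the data.frame?
col_exists <- filter_col %in% colnames(toy)

# Is the value present in that column?
value_present <- filter_value %in% toy[[filter_col]]

# A misspelled column name returns FALSE (column names are case sensitive)
typo_exists <- "Island" %in% colnames(toy)

# Is the filtered output a data.frame?
out_is_df <- is.data.frame(toy[toy[[filter_col]] == filter_value, ])

c(col_exists, value_present, typo_exists, out_is_df)
```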
Hints: unquoted column names with dplyr::filter

If you opted to use unquoted column names, it might be trickier to check if column names exist in the data.frame. However, filter() already covers this check.

If you want to add this check on top of filter()’s internal check, try something like this:

as_name(enquo(filter_col)) %in% colnames(DT)

These are {rlang} functions, part of the “tidy eval tools” and examples of metaprogramming. The Advanced R book has a detailed section on this concept.


Bonus

Writing checks for prepare_csv()

Instruction: add the following checks to your prepare_csv() function and test them in your test script (tests/test_prepare_csv.R).

  • check if the path points to a file that exists
  • before returning the object, check that it is a data.frame
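
As a sketch of the logical statements involved, using a temporary file rather than the workshop data (the object names are hypothetical):

```r
# tempfile() only generates a path; no file is created yet
tmp <- tempfile(fileext = ".csv")
exists_before <- file.exists(tmp)

# Write a small CSV so the path now points to a real file
write.csv(data.frame(a = 1:2), tmp, row.names = FALSE)
exists_after <- file.exists(tmp)

# read.csv() returns a data.frame
out <- read.csv(tmp)
out_is_df <- is.data.frame(out)

# Clean up the temporary file
unlink(tmp)

c(exists_before, exists_after, out_is_df)
```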

More informative errors with stopifnot()

Instruction: write more informative errors for your stopifnot() checks using the following syntax: stopifnot("error message" = logical_statement). Think about your user (either someone else or future you): what would help them understand and resolve this error?

For example,

stopifnot("x is not a numeric" = is.numeric(x))
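
A slightly fuller sketch, assuming R 4.0.0 or later. The function `check_positive()` and its messages are hypothetical. Several named checks are combined in one stopifnot() call, and the name of the first failing check becomes the error message:

```r
# Hypothetical function, for illustration only
check_positive <- function(x) {
  stopifnot(
    "x must be numeric" = is.numeric(x),
    "x must have length 1" = length(x) == 1,
    "x must be greater than 0" = x > 0
  )
  x
}

# Valid input passes all checks and is returned
ok <- check_positive(10)

# Invalid input stops at the first failing check;
# tryCatch() captures the error message for inspection
msg <- tryCatch(
  check_positive("ten"),
  error = function(e) conditionMessage(e)
)
msg
```

Ordering the checks from most basic (class) to most specific (value) means the user sees the most fundamental problem first.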