x <- 10
Checks
Checks in functions are used to ensure inputs and arguments are as expected, to handle errors and to provide informative error messages to users. In this workshop we are going to focus on writing checks with the function stopifnot
because it is simple and succinct. It was recently improved in R 4.0.0 when the R development team added the option to provide a more informative error which will help us clearly communicate with our users (and ourselves).
stopifnot()
expects a logical statement that you are expecting to be true, and, if not, stopifnot()
will throw an error.
To build logical statements, you may find following list of functions useful:
- class checks
-
is.integer
,is.numeric
,is.data.frame
,is.character
-
- nulls, NAs, NaNs
-
is.null
,is.na
,is.nan
-
- dimensions
-
length
,nrow
,ncol
,dim
-
- equality, comparisons
-
==
(equals) -
>
,<
,>=
,<=
(less than, greater than, less than or equal to, greater than or equal to) identical
-
- negation
!
- directories, files
file.exists
dir
-
basename
,dirname
Exercises: function checks
Logical statements
Instruction: write a series of logical statements for the following objects then consult the solution to check your work. Focus only on the logical statement for now, we’ll use stopifnot
in the following exercises.
Number
- check if x is a numeric, integer, double
- check if x is of length 1
- check if x is greater than 0
Show solution
is.numeric(x)
[1] TRUE
is.integer(x)
[1] FALSE
is.double(x)
[1] TRUE
length(x) == 1
[1] TRUE
x > 0
[1] TRUE
data.frame
- check if DF is a data.frame, list, matrix
- check if DF has two columns and three rows
- check if DF’s column numbers is an integer, double, numeric
DF <- data.frame(colors = c('red', 'green', 'blue'), numbers = c(42.1, 2L, 10))
Show solution
is.data.frame(DF)
[1] TRUE
is.list(DF)
[1] TRUE
is.matrix(DF)
[1] FALSE
ncol(DF) == 2
[1] TRUE
nrow(DF) == 3
[1] TRUE
is.integer(DF$numbers)
[1] FALSE
is.double(DF$numbers)
[1] TRUE
is.numeric(DF$numbers)
[1] TRUE
plot_xy()
First, write a function to plot a data.frame, providing columns names for the data to be plotted on the x and y axes. Use the example data that we prepared with our prepare_csv
function.
Follow the steps from our approach to developing functions:
- Setup the function
- Make a function script in the
R/
directory namedplot_xy.R
- Write the function’s skeleton (name, arguments, curly braces)
- Make a function script in the
- Setup the test script
- Make a corresponding test script in the
tests/
directory namedtest_plot_xy.R
. - Load any required packages (
library(package)
) - Source the function (
source('R/function.R)
) - Load example data and/or arguments for the function
- Make a corresponding test script in the
Add a new section ‘Development’ in the test script (tests/test_plot_xy.R
) to develop the body of your function. Since we have already built a function for preparing CSV files, use it in your test script, and your choice of base R plotting or {ggplot2}. Add the code to your function plot_xy()
and test!
path_counts <- file.path('raw-data', 'adelie-adult-chick-counts.csv')
prep_counts <- prepare_csv(path_counts)
Hints: base R
This function has one step: taking an input data.frame and generating a plot.
The function’s arguments should include the data.frame, as well as the column names for the X and Y axes.
Pass the data.frame’s to the plot function’s x and y arguments with the[[
syntax, eg. DF[[x_column]]
Hints: ggplot2
This function has one step: taking an input data.frame and generating a plot.
The function’s arguments should include the data.frame, as well as the column names for the X and Y axes.
There are two main options:
Pass the data.frame’s to ggplot’s data argument then use the
.data[[col]]
syntax to wrap the x_col and y_col arguments (quoted column names) withinaes
.Pass the data.frame’s to ggplot’s data argument then use the
{{ }}
“embrace operator” to wrap the x_col and y_col arguments (unquoted* column names) withinaes
.
See the ?aes
help page and the example with the “embrace operator”, and more details here: https://ggplot2.tidyverse.org/articles/ggplot2-in-packages.html#using-aes-and-vars-in-a-package-function
Writing checks for plot_xy()
Instruction: add the following checks to your plot_xy()
function and test them in your test script (tests/test_plot_xy.R
).
- check if the columns provided to the
x_col
andy_col
arguments exist in the data.frame (quoted column name arguments) - check if the plot generated is of class
ggplot
before returning (ggplot2
)
Hints: unquoted column names with ggplot2
If you opted to use unquoted column names, it might be trickier to check if column names exist in the data.frame. However - this check is already covered by ggplot()
anyways.
In case you wanted to add this, on top of the internal ggplot2
check, try something like this:
as_name(enquo(x_col)) %in% colnames(DT)
These are {rlang} functions, part of the “tidy eval tools” and examples of metaprogramming. The Advanced R book has a detailed section on this concept.
Bonus
Writing checks for prepare_csv()
Instruction: add the following checks to your prepare_csv()
function and test them in your test script (tests/test_prepare_csv.R
).
- check if the path points to a file that exists
- before returning the object, check that it is a data.frame
More informative errors with stopifnot()
Instruction: Write more informative errors for your stopifnot()
checks using the following syntax: stopifnot("error message" = logical_statement)
. Think about your user (either someone else or future you) - what would help them understand and resolve this error?
For example,
stopifnot("x is not a numeric" = is.numeric(x))
Add a color_by
argument to plot_xy()
Instruction: using the same approach you used for x_col
and y_col
, add a color_col
argument to plot_xy()
.