class: center, middle, inverse, title-slide # Troubleshooting ## and problem solving in R ### Alec L. Robitaille ### Wildlife Evolutionary Ecology Lab
Memorial University ### 2022-04-18 --- .left-column[ <br><br> ### **Discussion** ### Prevention ### Simplifying ### and other strategies ### Resources ] --- .left-column[ <br><br> ### **Discussion** ### Prevention ### Simplifying ### and other strategies ### Resources ] #### What are your strategies for ...troubleshooting errors? ...problem solving? ...learning a new package? ...starting a new analysis? --- .left-column[ <br><br> ### Discussion ### **Prevention** ### Simplifying ### and other strategies ### Resources ] #### First, some preventative measures Adjust RStudio settings [[1]](https://csgillespie.github.io/efficientR/set-up.html#rstudio-options) Use an RStudio project, relative file paths Keep R, R packages, and OS updated --- .left-column[ <br><br> ### Discussion ### **Prevention** ### Simplifying ### and other strategies ### Resources ] #### Extras Track package versions with `renv` Reproducible workflows with `targets` Use Git ??? renv tracks package versions for each project, so global updates to packages won't break eg. older projects Git helps with the feedback cycle of coding -> tracked changes -> commit -> coding Also, Git lets you create branches to explore alternatives, without breaking your current working versions A workflow defined in targets ensures steps are only rerun when they are outdated and helps branching by eg. individuals or study sites --- .left-column[ <br><br> ### Discussion ### Prevention ### **Simplifying** ### and other strategies ### Resources ] #### Overview Simplify the complexity of the task ...to identify source of the error ...to clarify steps of the analysis ...to learn new functions or packages ??? Tasks: troubleshooting, learning a new package, starting a new analysis... We'll first look at making minimal data, then using that data to learn a new package, then finally taking a complex script + data and reducing it to a minimum example --- .left-column[ <br><br> ### Discussion ### Prevention ### **Simplifying** ### and other strategies ### Resources ] #### Data Goal: minimal and representative data Option 1 - available example data Option 2 - build your own ??? Representative: retains variables, types and grouping levels of original data --- .left-column[ <br><br> ### Discussion ### Prevention ### **Simplifying** ### and other strategies ### Resources ] #### Available data .panelset[ .panel[.panel-name[Penguins] ```r library(palmerpenguins) data(penguins) nrow(penguins) ``` ``` ## [1] 344 ``` ```r colnames(penguins) ``` ``` ## [1] "species" "island" "bill_length_mm" ## [4] "bill_depth_mm" "flipper_length_mm" "body_mass_g" ## [7] "sex" "year" ``` ] .panel[.panel-name[Cars] ```r data(mtcars) nrow(mtcars) ``` ``` ## [1] 32 ``` ```r colnames(mtcars) ``` ``` ## [1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear" ## [11] "carb" ``` ] .panel[.panel-name[Etc] ```r # Run data() to see a list of available data data() # for spatial data, see the spData package data(package = 'spData') ``` ] ] --- .left-column[ <br><br> ### Discussion ### Prevention ### **Simplifying** ### and other strategies ### Resources ] #### Exercise: build your own data 1) Take your real data 2) List -columns -data types -units -limits -factors or levels 3) Generate a representative and minimal fake data .footnote[ [`R/simplify-data.R`](https://github.com/robitalec/workshops/tree/master/2022-04-18-troubleshooting/R/simplify-data.R) ] --- .left-column[ <br><br> ### Discussion ### Prevention ### **Simplifying** ### and other strategies ### Resources ] #### Learning a new package Use a simple dataset to focus on learning functions without complexity of real data or specific analyses --- .left-column[ <br><br> ### Discussion ### Prevention ### **Simplifying** ### and other strategies ### Resources ] #### Exercise: learn a new package 1) Use a minimal dataset 2) Open package website or documentation 3) Fill the [template](https://github.com/robitalec/workshops/tree/master/2022-04-18-troubleshooting/R/learn-package-template.R) (see the [example](https://github.com/robitalec/workshops/tree/master/2022-04-18-troubleshooting/R/learn-package-chk.R)) 4) Learn a few functions! .footnote[ Pick any package or one from this list: -[`ggplot2`](https://ggplot2.tidyverse.org/) -[`sf`](https://r-spatial.github.io/sf/) -[`chk`](https://poissonconsulting.github.io/chk/) -[`anytime`](https://eddelbuettel.github.io/anytime/) -[`patchwork`](https://patchwork.data-imaginist.com/) ] --- .left-column[ <br><br> ### Discussion ### Prevention ### **Simplifying** ### and other strategies ### Resources ] #### Troubleshooting To troubleshoot an error, simplify the data and analyses Isolate the erroring function and use a minimal dataset If the error is not reproduced, check differences in data and/or use of functions If error is reproduced, check GitHub Issues, SO, try on another computer, or contact package authors. ??? A MWE that reproduces an error is ~~ proof of the error --- .left-column[ <br><br> ### Discussion ### Prevention ### **Simplifying** ### and other strategies ### Resources ] #### Exercise: troubleshooting Given a complex script (e.g. [1](https://github.com/wildlifeevoeco/SocCaribou/blob/master/scripts/2-HomeRangeAnalysis.R), [2](https://github.com/robitalec/ScaleInMultilayerNetworks/blob/master/scripts/03-temporal-layers.R), [3](https://github.com/wildlifeevoeco/study-area-figs/blob/master/R/10-riding-mountain-prep.R), [4](https://github.com/robitalec/spatsoc-application-paper/blob/master/scripts/figure-2-group-pts.R)), build a reprex 1) Start a new script 2) Only load essential packages 3) Use a minimal dataset 4) Isolate the erroring function 5) Reproduce the error or find your mistake .footnote[ [`reprex` vignette](https://reprex.tidyverse.org/articles/reprex-dos-and-donts.html) [SO reprex how to](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/16532098) ] ??? Ideally, bring your own script Say you hit an error halfway through some complex analysis... `reprex_selection` --- .left-column[ <br><br> ### Discussion ### Prevention ### Simplifying ### **and other strategies** ### Resources ] Explore the traceback [[1]](https://adv-r.hadley.nz/debugging.html#traceback) Read (and reread) documentation Restart R, clear the environment Update R, R packages, OS Check your .Rprofile, .Renviron [[2]](https://csgillespie.github.io/efficientR/set-up.html#an-overview-of-rs-startup-files=), [[3]](https://usethis.r-lib.org/reference/edit.html) GitHub Issues [(e.g. [4])](https://github.com/rstudio/bookdown/issues/1263) Stack Overflow [(e.g. [5])](https://stackoverflow.com/questions/42149427/group-membership-by-pairwise-criteria-matching) Ask for help! --- .left-column[ <br><br> ### Discussion ### Prevention ### Simplifying ### and other strategies ### **Resources** ] -[`reprex`: introduction](https://reprex.tidyverse.org/articles/reprex-dos-and-donts.html) -[SO: Making a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/16532098) -[`reprex`: datapasta](https://reprex.tidyverse.org/articles/datapasta-reprex.html) -[Advanced R: Debugging](https://adv-r.hadley.nz/debugging.html0) -[Description of startup files](https://csgillespie.github.io/efficientR/set-up.html#an-overview-of-rs-startup-files=) -[Editing startup files](https://usethis.r-lib.org/reference/edit.html) -[`targets` manual](https://books.ropensci.org/targets/) -[`renv` documentation](https://rstudio.github.io/renv/) -[Efficient R: Version control](https://csgillespie.github.io/efficientR/collaboration.html#version-control) -[Happy Git with R](https://happygitwithr.com/)