A workshop by Ruan van Mazijk on common data cleaning and wrangling tasks in R, focussing on the tidyverse
package ecosystem, supported by the UCT Biological Sciences Postgraduate Committee and iCWild.
The material taught in this workshop is based on Ruan’s own experience with the R packages tidyr and dplyr, and on general tidy-data-science principles he learnt from R for Data Science (see further reading below).
The materials I have prepared for this workshop are all open source (CC-BY-4.0 Ruan van Mazijk). See the license here.
tidyr
dplyr
mutate(), summarise() & friends
Data-sets used are available and described here.
R for Data Science, by Garrett Grolemund & Hadley Wickham (available online, open source!)
RStudio cheatsheets (also available in RStudio directly via the menus Help > Cheatsheets) (many of these have been translated into languages other than English!)
Also, check out the tidyverse team’s “official” curated list of learning resources here.
Please bring your own laptop if you can. I find it easiest to learn new tools on my own computer, and there is a slim possibility that the lab computers will not allow you to install the packages we need. If you’d prefer to use the Sci Lab computers, that risk is small, so it should be fine!
If you are bringing your own laptop, please install the packages we need beforehand. The more people who have tidyverse installed and ready before we start, the better. Installing all of the packages we are going to use, and more, takes just one line in R:
install.packages("tidyverse")
That should get you up and running on the latest version. Give it some time to run if need be. If it installed correctly, running this line:
"tidyverse" %in% installed.packages()
should come back as TRUE.
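If you would like a slightly more detailed check, the sketch below reports which of the workshop packages can actually be loaded on your machine. It is only an illustration: `is_installed()`, `workshop_pkgs`, `status` and `missing_pkgs` are hypothetical names made up for this example, not part of the tidyverse. It uses base R's `requireNamespace()`, which tries to load a package's namespace without attaching it.

```r
# Hypothetical helper (not part of the tidyverse):
# returns TRUE if the package's namespace can be loaded, FALSE otherwise.
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# The packages this workshop focuses on
workshop_pkgs <- c("tidyverse", "tidyr", "dplyr")

# Named logical vector: one TRUE/FALSE per package
status <- vapply(workshop_pkgs, is_installed, logical(1))
print(status)

# Name any packages that still need installing (none if all are ready)
missing_pkgs <- names(status)[!status]
if (length(missing_pkgs) > 0) {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
}
```

If anything is listed as missing, re-running `install.packages("tidyverse")` should normally pull it in, since tidyr and dplyr are installed as part of the tidyverse.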
This workshop is not affiliated with or supported by the tidyverse development team or RStudio and only aims to teach and foster the use of these tools.