Data wrangling & manipulation in R - workshop site

A workshop by Ruan van Mazijk on common data cleaning, wrangling and manipulation tasks in R, focussing on the tidyverse package ecosystem, supported by the UCT Biological Sciences Postgraduate Committee and iCWild.

The material taught in this workshop is based on Ruan’s own experience with the R packages tidyr and dplyr, and general tidy-data-science principles he learnt from R for Data Science (see further reading below).

The materials I have prepared for this workshop are all open source (CC-BY-4.0 Ruan van Mazijk). See the license here.

Workshop outline & materials

Data-sets used are available and described here.

Further reading

R for Data Science, by Garrett Grolemund & Hadley Wickham (available online, open source!)

RStudio cheatsheets (also available in RStudio directly via the menu Help > Cheatsheets; many of these have been translated into languages other than English!)

Also, check out the tidyverse team’s “official” curated list of learning resources here.

Details

Please bring your own laptop if you can. I find it easiest to learn new tools on my own computer, and there is a slim possibility that the lab computers might not allow you to install the packages we need. That risk is small, though, so if you’d prefer to use the Sci Lab computers, that should be fine!

Package prep.

If you are bringing your own laptop, please install the packages we need beforehand. The more people who have tidyverse installed and ready before we start, the better. Installing all of the packages we are going to use, and more, takes just one line in R:

install.packages("tidyverse")

That should get you up and running on the latest version. Give it some time to run if need be. If it installed correctly, running this line:

"tidyverse" %in% rownames(installed.packages())

should come back as TRUE.
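If you'd like a slightly more detailed check, here is a minimal sketch that reports, package by package, whether each one can be loaded. The vector `needed` is just an illustration (tidyr and dplyr are the packages mentioned above; adjust it as you like), and `requireNamespace()` is base R, so this runs even if nothing has been installed yet.

```r
# Packages to check. This particular list is an assumption for illustration;
# edit it to match whatever you want to verify.
needed <- c("tidyverse", "tidyr", "dplyr")

# requireNamespace() returns TRUE if a package can be loaded and FALSE if not,
# without attaching it to your session.
is_ready <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

# A named logical vector: one TRUE/FALSE per package
print(is_ready)

# List anything that still needs installing
still_missing <- needed[!is_ready]
if (length(still_missing) > 0) {
  message("Still to install: ", paste(still_missing, collapse = ", "))
} else {
  message("All set!")
}
```

If anything is listed as still to install, re-running the `install.packages()` line above should sort it out.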

Disclaimer

This workshop is not affiliated with or supported by the tidyverse development team or RStudio and only aims to teach and foster the use of these tools.