Running a workshop on data wrangling in R with the tidyverse


Recently (at time of writing: the past three days), I have had the privilege of running a three-afternoon workshop on data wrangling and manipulation in R, using the tidyverse packages tidyr and dplyr.

The workshop was dreamt up by the University of Cape Town Biological Sciences Postgraduate Committee, of which I am a member. It was further supported by iCWild, an institute within our department.

I had an absolute blast running it and making all the materials and slides for it (all of which are freely available and open source), as data wrangling with tidyverse packages is something I am very passionate about showing people. I also really enjoy chatting to people about programming in R, especially fellow ecologists and evolutionary biologists. Learning about other people’s analytical or data wrangling problems is genuinely interesting.

The workshop was a great experience for me, both in teaching and in refreshing and consolidating my own understanding of these packages! We got great feedback from those who attended, and hopefully the department will support more workshops of this sort in future.

One thing I did learn, though, is that I should have spent much more time preparing “untidy” data to use as examples and for exercises. This was harder than I anticipated, so it didn’t get as much of my time as it probably needed. During the workshop, I got my points across with a lot of inventive playing with datasets (live!) in front of the class, but this was not ideal compared to more rigorously structured exercises. At least I’ve learnt this now, for any future programming workshops I (hopefully) run!
