This course introduces the basic ideas of data science and implements them in the R programming language. We will use the Tidyverse, a collection of R packages that facilitate data import, manipulation, encoding, exploration and visualisation.
Day 1: Introduction to the Tidyverse
Introduction to R and RStudio. Workflow. Tidy data. The Tidyverse ecosystem. Data import.
Tibbles. dplyr basics. Pipes.
Day 2: Data Manipulation
dplyr verbs. Numerical summaries. SQL and dplyr.
Day 3: Categorical Variables
Factors. The forcats package. Modifying factor order. Modifying factor levels.
Day 4: Relational Data
Mutating joins. Filtering joins. Set operations.
Day 5: Data Visualisation I
Introduction to ggplot2. Creating a ggplot. Aesthetic mappings. Geometric objects.
Day 6: Data Visualisation II
More geometric objects. Themes.
Day 7: Exploratory Data Analysis I
Visualising distributions. Typical vs unusual values. Missing values.
Day 8: Exploratory Data Analysis II
Covariation. A categorical and a continuous variable. Two categorical variables. Two continuous variables.
Your training will be led by:
Pre-requisites
Basic knowledge of R is helpful but not necessary.
Audience
This course is free and available to all those working in the Midlands Public Health and Social Care sector, e.g. NHS, Public Health, Local Authority, ICBs.
Duration
Eight half days (3.5 hours each day)
Location
Online – delivered via Zoom with a combination of delivery styles.
Dates:
Mondays from 9:30am to 1:00pm, with short breaks
Registration: Please use the online link below
For more information about this course, please contact:
Training & Development Operational Lead, Rachel Caswell