In this course we will introduce the basic ideas and algorithms of supervised learning and implement them in the R programming language. After a brief theoretical overview of the learning setting, the main focus will be on the practical analysis and modelling of healthcare data.
Day 1: Introduction
What is machine learning? Types of machine learning. Classification and regression. Training and test sets. Model evaluation. Over-fitting. Overview of machine learning algorithms. No free lunch theorem.
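To give a flavour of the Day 1 topics, the following is a minimal sketch in base R of a training/test split and of how over-fitting can show up in model evaluation. The data are simulated (not real healthcare records), and the names `dat`, `rmse`, `simple` and `complex`, as well as the 70/30 split and the polynomial degree, are assumptions chosen for illustration rather than course material.

```r
# Simulated data for illustration only (not real healthcare records)
set.seed(42)
n <- 200
x <- runif(n, 0, 10)
y <- 2 + 0.5 * x + rnorm(n, sd = 1)
dat <- data.frame(x = x, y = y)

# Hold out 30% of the rows as a test set; the rest is the training set
test_idx <- sample(n, size = 0.3 * n)
train <- dat[-test_idx, ]
test  <- dat[test_idx, ]

# Root mean squared error of a fitted model on a given data set
rmse <- function(model, data) {
  sqrt(mean((data$y - predict(model, newdata = data))^2))
}

# A simple model and a deliberately over-flexible one, both fitted on train only
simple  <- lm(y ~ x, data = train)
complex <- lm(y ~ poly(x, 15), data = train)

results <- data.frame(
  model      = c("linear", "degree-15 polynomial"),
  train_rmse = c(rmse(simple, train), rmse(complex, train)),
  test_rmse  = c(rmse(simple, test),  rmse(complex, test))
)
print(results)
```

The more flexible model always fits the training set at least as closely as the simple one. A test error that is noticeably worse than the training error is the usual warning sign of over-fitting.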
Day 2: Data Preparation and Feature Engineering
Data analysis and pre-processing, exploratory data analysis, handling missing data. Feature engineering techniques including but not limited to: transformations, feature extraction, reduction and selection.
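As an illustration of the Day 2 topics, here is a small base R sketch of counting missing values, filling them in, and applying a transformation. The `patients` data frame and its columns are hypothetical and invented for this example, and median imputation is only one simple option among many, not necessarily the approach taught in the course.

```r
# Hypothetical patient-level data, invented for illustration
patients <- data.frame(
  age            = c(34, 51, NA, 68, 45, NA),
  length_of_stay = c(1, 3, 2, 14, NA, 5)
)

# Exploratory step: how many values are missing in each column?
missing_counts <- colSums(is.na(patients))
print(missing_counts)

# Replace missing values in a numeric vector with the median of the observed values
impute_median <- function(v) {
  v[is.na(v)] <- median(v, na.rm = TRUE)
  v
}
clean <- as.data.frame(lapply(patients, impute_median))

# Transformation: log(1 + x) compresses a right-skewed variable
clean$log_los <- log1p(clean$length_of_stay)
print(clean)
```

Whether imputation is appropriate at all depends on why the values are missing, so a step like this would normally follow the exploratory analysis rather than replace it.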
Day 3: Classification
Day 4: Regression
Simple and multiple linear regression. Linear and polynomial models. Parameter estimates. Residual analysis. Metrics for model evaluation. Plots. Using decision trees and random forests for regression.
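The regression topics above can be sketched in a few lines of base R. The example below fits a linear and a quadratic model to simulated data, reads off the parameter estimates, and compares two common evaluation metrics. The data, the helper names `rmse` and `r2`, and the choice of a degree-2 polynomial are assumptions made for illustration.

```r
# Simulated data with a curved relationship (illustration only)
set.seed(1)
x <- seq(0, 10, length.out = 100)
y <- 1 + 2 * x - 0.3 * x^2 + rnorm(100, sd = 1)
dat <- data.frame(x, y)

# Straight-line fit and quadratic (degree-2 polynomial) fit
fit_lin  <- lm(y ~ x, data = dat)
fit_poly <- lm(y ~ poly(x, 2, raw = TRUE), data = dat)

# Parameter estimates: intercept, linear term, quadratic term
print(coef(fit_poly))

# Residuals of the quadratic fit, for residual analysis
res <- residuals(fit_poly)

# Two common evaluation metrics, computed on the fitted data
rmse <- function(fit) sqrt(mean(residuals(fit)^2))
r2   <- function(fit) summary(fit)$r.squared

print(c(linear = rmse(fit_lin), quadratic = rmse(fit_poly)))
print(c(linear = r2(fit_lin),   quadratic = r2(fit_poly)))
```

A call such as `plot(fitted(fit_poly), res)` would then give a residual plot; a visible curve or funnel shape in it suggests the model is missing structure in the data.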
Your training will be led by: Filippo Cavallari, Data Science Lecturer, Data Science Campus, Office for National Statistics | Swyddfa Ystadegau Gwladol
Audience and pre-requisites
To take this course you will need to be comfortable using base R and the tidyverse. In particular, you will need to be familiar with the “pipe” operator and dplyr verbs, and have experience with ggplot2.
You will need to work in an Integrated Care System (ICS) in the Midlands.
Duration
Four-day course
Location
Online
For more information about this course, please contact:
Training & Development Operational Lead, Rachel Caswell