Machine Learning


This course introduces the basic ideas and algorithms of supervised learning and implements them in the R programming language. It begins with a brief theoretical overview of the so-called learning setting; the main focus is then on the practical analysis and modelling of healthcare data.

Learning outcomes

  • To understand the concepts of machine learning for healthcare, and to compare and test a range of techniques.
  • To classify features of data sources, and to analyse and interpret the outputs of machine learning techniques in the context of practical healthcare solutions.

Detailed Programme

Day 1: Introduction
What is machine learning? Types of machine learning. Classification and regression. Training and test sets. Model evaluation. Over-fitting. Overview of machine learning algorithms. No free lunch theorem.

Day 2: Data Preparation and Feature Engineering
Data analysis and pre-processing, exploratory data analysis, handling missing data. Feature engineering techniques including but not limited to: transformations, feature extraction, reduction and selection.

Day 3: Classification

  • Logistic Regression: why logistic regression; logistic function; simple logistic regression; multiple logistic regression (tentative); ROC curve; feature interpretation; predictions using logistic regression.
  • Decision Trees: classification using decision trees; understanding and visualising decision trees; advantages and disadvantages of decision trees; predictions.
  • Random Forests: from decision trees to random forests; training and tuning random forests; predictions.

Day 4: Regression
Linear and multiple regression. Linear and polynomial regression. Parameter estimates. Residual analysis. Metrics for model evaluation. Plots. Using decision trees and random forests for regression.

Your training will be led by: Filippo Cavallari, Data Science Lecturer, Data Science Campus, Office for National Statistics | Swyddfa Ystadegau Gwladol

Audience and pre-requisites
To take this course you will need to be comfortable using base R and the tidyverse. In particular, you will need to be familiar with the “pipe” operator and dplyr verbs, and to have experience with ggplot2.

You will need to work in an Integrated Care System (ICS) in the Midlands.

Number of places per ICS
Two places are available per ICS.

Duration and start date
Four-day course. Dates: 17/12/21, 21/01/22, 04/02/22 and 18/02/22, 09:00-16:00.


Online application: Applications for this course are now closed.

For more information about this course, please contact:

Training & Development Operational Lead, Rachel Caswell