
Machine Learning Methods and Data Analytics in Finance and Insurance



Instructor

Pavel Shevchenko is a Professor in the Department of Actuarial Studies and Business Analytics, Director of the Risk Analytics Lab (since 2016) and Co-Director of the Centre for Risk Analytics (since 2017) at Macquarie University. Before joining Macquarie University in 2016, he worked as a research scientist at CSIRO, Australia's government science agency (1999-2016), holding the position of Senior Principal Research Scientist from 2012 to 2016. Since 1999, Prof Shevchenko has worked in risk analytics, leading research and commercial projects on modelling of operational and credit risks; longevity, mortality and retirement products; option pricing; insurance; modelling of commodities and foreign exchange; and the development of relevant numerical methods and software.

Course language

English, but you may ask questions in Russian


Course Description

This course aims to equip participants with the computing and statistical tools that risk modellers and quantitative analysts need for quantitative modelling in modern financial institutions and insurance companies. It focuses on machine learning and data analytics methods applied to finance and insurance. Topics include parametric regression (GLMs and neural networks), tree-based methods (regression trees, boosting, bagging, random forests), classification, and clustering. The course aims to develop a core mathematical and statistical understanding of the methods and of their application to problems in the field. The methods will be applied using the R language.

Date: 24-27 January 2022

Time: 11.00 – 12.30 (Moscow)


Schedule

  • Day 1 (2 lectures) Parametric regression methods (GLM, NN, lasso & ridge)

  • Day 2 (2 lectures) Non-parametric regression methods (regression tree, bagging, boosting, random forests)

  • Day 3 (2 lectures) Classification methods (logistic regression, discriminant analysis, decision tree, KNN, SVM)

  • Day 4 (2 lectures) Unsupervised learning (clustering)

Two 45-minute lectures will be given each day. The first is a slide presentation focused on the mathematical and statistical description of the models and the assumptions underlying the methods; the second demonstrates these methods in R.

Additional requirements:
Participants should have the most recent versions of R and RStudio installed in order to follow the numerical examples presented during the lectures.

Readings and course materials:
  • "An Introduction to Statistical Learning with Applications in R" by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. It is available as a free download.

  • "The Elements of Statistical Learning" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman. It is available as a free download.