Fraud detection with unsupervised ML

Workshop / Overview

In many fraud detection settings, and in outlier detection more generally, labelled data is not available. We therefore have to rely on unsupervised methods to identify points that are atypical.

The workshop opens with a short introduction to the main outlier detection methods (from the classic Local Outlier Factor, LOF, to modern algorithms such as Isolation Forest and autoencoders) and to appropriate metrics for highly imbalanced datasets.
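
To give a flavour of two of the methods named above, here is a minimal sketch using scikit-learn. It is not the workshop material: the synthetic dataset, the parameter values and the variable names are assumptions made for illustration, and the labels are used only to evaluate the scores, never to fit the models.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import average_precision_score
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)

# Synthetic data (assumption): 980 "normal" points around the origin
# and 20 outliers spread uniformly over a much wider box.
X_normal = rng.normal(loc=0.0, scale=1.0, size=(980, 2))
X_outlier = rng.uniform(low=-8.0, high=8.0, size=(20, 2))
X = np.vstack([X_normal, X_outlier])
y = np.r_[np.zeros(980, dtype=int), np.ones(20, dtype=int)]  # evaluation only

# Isolation Forest: score_samples is higher for normal points,
# so negate it to get "higher = more anomalous".
iso = IsolationForest(n_estimators=200, random_state=0).fit(X)
iso_scores = -iso.score_samples(X)

# LOF: negative_outlier_factor_ is close to -1 for inliers and
# more negative for outliers, so negate it as well.
lof = LocalOutlierFactor(n_neighbors=20).fit(X)
lof_scores = -lof.negative_outlier_factor_

# Average precision summarises the precision-recall curve; a random
# ranking would score roughly the outlier rate (here 0.02).
ap_iso = average_precision_score(y, iso_scores)
ap_lof = average_precision_score(y, lof_scores)
print(f"Isolation Forest AP: {ap_iso:.3f}")
print(f"LOF AP:              {ap_lof:.3f}")
```

Both models are fitted without labels and produce a continuous anomaly score per point, which is what makes them comparable on a single ranking metric.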

Participants will then be given unlabelled datasets to make predictions on. Scores will be compared on a leaderboard, with the emphasis on comparing techniques.

Workshop / Outcome

After the workshop, participants will:

Know the main algorithms for unsupervised outlier detection, and their pros and cons
Understand which scoring metrics can be used for highly imbalanced classification, and how these relate to business costs
Have gained practical experience doing outlier analysis in Python
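
As an illustration of how a ranking metric can be tied to business costs, the sketch below computes precision among the k highest-scoring cases and a simple total cost for reviewing those cases. The helper names `precision_at_k` and `total_cost`, the toy scores and the cost figures are all hypothetical and not taken from the workshop.

```python
import numpy as np
from sklearn.metrics import average_precision_score


def precision_at_k(y_true, scores, k):
    """Fraction of true frauds among the k highest-scoring cases."""
    top_k = np.argsort(scores)[::-1][:k]
    return float(np.mean(np.asarray(y_true)[top_k]))


def total_cost(y_true, scores, k, cost_review, cost_missed_fraud):
    """Cost of reviewing the top-k cases plus the cost of frauds not reviewed."""
    y_true = np.asarray(y_true)
    top_k = np.argsort(scores)[::-1][:k]
    missed = y_true.sum() - y_true[top_k].sum()
    return float(k * cost_review + missed * cost_missed_fraud)


# Toy example (assumption): 10 cases, 2 of them fraudulent.
y_true = np.array([0, 0, 1, 0, 0, 0, 1, 0, 0, 0])
scores = np.array([0.1, 0.2, 0.9, 0.3, 0.8, 0.1, 0.4, 0.2, 0.1, 0.3])

ap = average_precision_score(y_true, scores)
p_at_2 = precision_at_k(y_true, scores, k=2)

# Hypothetical costs: 10 per manual review, 500 per undetected fraud.
cost_k2 = total_cost(y_true, scores, k=2, cost_review=10, cost_missed_fraud=500)
cost_k3 = total_cost(y_true, scores, k=3, cost_review=10, cost_missed_fraud=500)
print(ap, p_at_2, cost_k2, cost_k3)
```

With these assumed costs, reviewing one extra case is far cheaper than missing a fraud, so the review budget k matters as much as the ranking quality itself.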

Workshop / Difficulty

Intermediate level

Workshop / Prerequisites

Intermediate Python skills
Basic understanding of Machine Learning concepts
Laptop with internet access (teams of two may be formed)
A Google account for Colab, or alternatively Docker with the image downloaded, or the correct Python packages installed (see the instructions on the GitHub page)

Track / Co-organizers

Ernst Oldenhof

Senior Data Scientist, Julius Baer

AMLD EPFL 2020 Fraud detection with unsupervised ML