Workshop / Overview

⚠️ A valid COVID certificate must be presented on site to enter the event. ⚠️

In many fraud detection (or, more generally, outlier detection) settings, labelled data is not available. We therefore have to resort to unsupervised methods to identify points that are atypical in some way.
The workshop opens with a short introduction to the main outlier detection methods (from the classic Local Outlier Factor (LOF) to more recent algorithms such as Isolation Forest and autoencoders) and to appropriate metrics for highly imbalanced datasets.
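
To give a flavour of what unsupervised outlier scoring looks like, here is a minimal sketch in plain Python. It is not LOF or Isolation Forest, and it is not taken from the workshop material: it uses a simpler distance-based score (mean distance to the k nearest neighbours). The data, the function name `knn_outlier_scores` and the choice `k=5` are all assumptions made for illustration.

```python
import math
import random


def knn_outlier_scores(points, k=5):
    """Score each point by its mean Euclidean distance to its k nearest neighbours.

    Larger scores mean the point lies further from its neighbours.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


random.seed(0)
# Synthetic data (assumption): 200 "typical" points around the origin ...
inliers = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]
# ... plus three hand-placed points far away from the bulk
outliers = [(12.0, 12.0), (-12.0, 11.0), (13.0, -12.0)]
points = inliers + outliers

scores = knn_outlier_scores(points, k=5)

# Indices of the three highest-scoring points; no labels were used to find them
top3 = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)[:3]
print(sorted(top3))
```

The three hand-placed points sit at indices 200 to 202, so these are the indices expected at the top of the ranking. LOF refines this idea by comparing each point's local density with that of its neighbours instead of using raw distances.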

Participants will then receive unlabelled datasets and make predictions on them. Scores will be compared on a leaderboard, with the emphasis on comparing techniques.

Workshop / Outcome

After the workshop, participants will:

  • Know the main algorithms for unsupervised outlier detection, and their pros and cons
  • Understand which scoring metrics can be used for highly imbalanced classification, and how these relate to business costs
  • Have gained practical experience with outlier analysis in Python
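
As an illustration of the second outcome, the following sketch shows why accuracy can mislead on highly imbalanced data and how a cost view changes the picture. Everything in it is assumed for the example: the toy labels and scores, the helper `flag_top_k`, and the two cost figures are made up and do not come from the workshop.

```python
def flag_top_k(scores, k):
    """Return the indices of the k highest-scoring cases."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


# Toy data, entirely made up: 1000 cases, of which 10 (1%) are fraudulent
labels = [1] * 10 + [0] * 990
# Hypothetical detector output: 8 frauds and 12 normal cases receive high scores
scores = [0.9] * 8 + [0.1] * 2 + [0.8] * 12 + [0.2] * 978

# A baseline that flags nothing is right on every normal case and catches no fraud
baseline_accuracy = labels.count(0) / len(labels)

# Flag the 20 highest-scoring cases for manual review
flagged = flag_top_k(scores, k=20)
true_pos = sum(labels[i] for i in flagged)
precision = true_pos / len(flagged)  # share of flagged cases that are fraud
recall = true_pos / sum(labels)      # share of frauds that were flagged

# Assumed business costs (illustrative only)
COST_MISSED_FRAUD = 500.0  # loss per fraud that goes undetected
COST_REVIEW = 5.0          # cost of manually reviewing one flagged case

cost_detector = (sum(labels) - true_pos) * COST_MISSED_FRAUD + len(flagged) * COST_REVIEW
cost_baseline = sum(labels) * COST_MISSED_FRAUD

print(f"baseline accuracy: {baseline_accuracy:.2f}")
print(f"precision@20: {precision:.2f}, recall@20: {recall:.2f}")
print(f"cost with detector: {cost_detector:.0f}, cost when flagging nothing: {cost_baseline:.0f}")
```

Under these assumed costs, the do-nothing baseline has the better accuracy but the higher cost, which is why precision, recall and cost-based measures tend to be more informative than accuracy when positives are rare.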

Workshop / Difficulty

Intermediate level

Workshop / Prerequisites

  • Intermediate Python skills
  • Basic understanding of Machine Learning concepts
  • Laptop with internet access (teams of two may be formed)
  • A Google account for Google Colab
  • Alternatively, Docker with the image already downloaded

Track / Co-organizers

Ernst Oldenhof

Senior Data Scientist, Julius Baer

Alessandro Scarpato

Data Scientist, Julius Baer

Giulio Ghirardo

Data Engineer, Doodle

Steffen Terhaar

Data Science Consultant, D|ONE

AMLD EPFL 2021 / Workshops

Towards ethical AI – practical tools for responsible data scientists

With Johan Rochel & Lea Strohm

10:00-11:30, November 10, Online

How to make your NLP system multilingual

With Adam Bittlingmayer & Nerses Nersesyan

10:00-12:00, March 02, Online

Deep Learning-Driven Text Summarization & Explainability with Reuters News Data

With Nadja Herger, Nina Hristozova & Andreea Iuga

15:00-17:30, March 02, Online
