Federated Learning: collaborative machine learning on sensitive decentralized data

Workshop / Overview

⚠️ A valid COVID certificate must be presented on site to enter the event. ⚠️

Datasets of interest in many application domains (e.g. healthcare, finance,
manufacturing) contain sensitive or private information and cannot easily be shared.
Additionally, such data frequently belongs to multiple distinct parties, and
combining it in one location would create a lucrative target for attackers.
Therefore, it is desirable to make use of such data without a need
to disclose it or store it in a central location.

Unfortunately, traditional methods to train predictive models expect data to
be fully accessible and centralized on a single server.
Research therefore often has to rely on small or artificial datasets that
can safely be centralized. As a result, findings frequently do not generalize well
to real-world data, and progress is hampered.

Federated Learning (FL) is a recently introduced paradigm that addresses this limitation by training models on
decentralized datasets without requiring centralized data access.
This approach allows multiple distinct parties to collaboratively train predictive
models without a need to directly share sensitive data. Instead of combining
datasets, FL trains a model in multiple iterations on data subsets stored
in different locations.
In every iteration, every party owning a data subset downloads a copy of the current model weights.
Each party then computes an updated model on its own data subset in a local training step.
The per-party model updates are then aggregated (a step that can be centralized, as it does not require
data access), resulting in a single overarching FL update step.
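
The iteration described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the workshop's implementation: it assumes a linear regression model, plain gradient descent as the local training step, and size-weighted averaging of the per-party weights as the aggregation rule (in the style of federated averaging). All function names (`local_update`, `aggregate`, `federated_round`, `make_parties`) and the synthetic data are hypothetical.

```python
import numpy as np


def local_update(weights, X, y, lr=0.1, epochs=5):
    """Local training step: a party refines a copy of the global weights on its own data."""
    w = weights.copy()
    for _ in range(epochs):
        # Gradient of the mean squared error for a linear model.
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w


def aggregate(local_weights, sizes):
    """Aggregation step: average per-party weights, weighted by local dataset size.

    Only model weights and dataset sizes are needed here, never the raw data.
    """
    sizes = np.asarray(sizes, dtype=float)
    return np.average(np.stack(local_weights), axis=0, weights=sizes)


def federated_round(global_weights, parties):
    """One FL iteration: distribute weights, train locally, aggregate the results."""
    local = [local_update(global_weights, X, y) for X, y in parties]
    return aggregate(local, [len(y) for _, y in parties])


def make_parties(true_w, sizes, seed=0):
    """Create synthetic, disjoint data subsets (assumed here purely for illustration)."""
    rng = np.random.default_rng(seed)
    parties = []
    for n in sizes:
        X = rng.normal(size=(n, len(true_w)))
        y = X @ true_w + 0.01 * rng.normal(size=n)
        parties.append((X, y))
    return parties


true_w = np.array([2.0, -1.0])
parties = make_parties(true_w, sizes=[50, 80, 120])

w = np.zeros(2)  # initial global model
for _ in range(30):
    w = federated_round(w, parties)
print(w)
```

In this sketch the raw `(X, y)` pairs are only ever touched inside `local_update`; the aggregation sees weights and dataset sizes alone. Real systems typically add protections on top of this, such as the privacy and encryption techniques mentioned in the workshop outline.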

To open the workshop, we will introduce the basic concepts underlying
FL and discuss a few key related topics (e.g. Differential Privacy,
Model Encryption). Our focus, however, will be on gaining hands-on experience.
We will implement a simple Federated Learning system using TensorFlow (tensorflow/federated) and PyTorch (PySyft).
We will give a quick introduction to all required libraries and tools at the start.

Workshop / Outcome

Teach your models how to use decentralized, sensitive data without a need to have any direct access to it.

Hands-on is key: learn how to train models using Federated Learning in TensorFlow or PyTorch.

Workshop / Difficulty

Intermediate level

Workshop / Prerequisites

For technical preparations, please follow the instructions in the "Preparation" section on the following page.

Participation with a laptop is recommended as a large part of this workshop will be hands-on.

Track / Co-organizers

Moritz Freidank

Machine Learning Engineer, Novartis

Aman Apte

Associate, Novartis

Praveen Uggappakodi

Service Delivery Manager, Novartis

AMLD EPFL 2021 / Workshops

Towards ethical AI – practical tools for responsible data scientists

With Johan Rochel & Lea Strohm

10:00-11:30, November 10, Online

How to make your NLP system multilingual

With Adam Bittlingmayer & Nerses Nersesyan

10:00-12:00, March 02, Online

Deep Learning-Driven Text Summarization & Explainability with Reuters News Data

With Nadja Herger, Nina Hristozova & Andreea Iuga

15:00-17:30, March 02, Online
