Track / Overview

Machine learning (ML) continues to make inroads in many industry applications. That said, in at least four focus areas (healthcare, finance, insurance, and critical infrastructure), data privacy requirements and data sparsity limit the useful estimation and deployment of ML-enabled algorithms. Recent advances in data synthesis show promise for addressing these challenges. If realistic data can be synthesized in a manner that still sufficiently reflects the underlying granular properties of the ground-truth data, ML can be extended in substantial and scalable ways. If successful, this can materially improve decision-support systems in industries that still struggle to deploy industrial-strength ML-enabled models.

Early approaches (Sklar, 1959) use univariate marginal distributions together with a copula. Others have extended this probabilistic approach to remedy a range of issues with Sklar’s approach (Kamthe et al., 2021). Another interesting (and popular) thread focuses on generative adversarial networks (GANs) (Goodfellow et al., 2020) and extensions, such as the conditional generative adversarial net (CGAN) (Fu et al., 2019).

In this track, we review some of the more promising synthetic data generation approaches and describe applications that show how data privacy and data sparsity can be suitably addressed.

Track / Schedule

Introduction

With Jeffrey R. Bohn

Towards Closing Synthetic-to-Real Gap with Domain Adaptation for Unsupervised Fault Diagnosis

With Olga Fink

Model-Privacy: Enforce a Purpose on Shared Data

With Gerald Friedland

Differentially Private Data Generative Models and Safety-Critical Scenario Generation for Autonomous Driving

With Bo Li

Break

Completing Power Grid Network Graph Data for Property Resilience Modeling

With Jeffrey R. Bohn

Wikipedia Reader Navigation: When Synthetic Data Is Enough

With Akhil Arora

Synthetic Data Generation for Natural Language Understanding with Probabilistic Context Free Grammars

With Georgios Balikas

Panel Discussion

With Jeffrey R. Bohn, Bo Li & Gerald Friedland

Wrap-up

With Jeffrey R. Bohn

Track / Speakers

Georgios Balikas

Data Scientist, Salesforce

Jeffrey R. Bohn

Chief Strategy Officer, One Concern

Olga Fink

Professor, EPFL

Akhil Arora

Doctoral Researcher, EPFL

Bo Li

Assistant Professor, University of Illinois at Urbana-Champaign

Gerald Friedland

Adjunct Assistant Professor, EECS, University of California, Berkeley; CTO, Brainome, Inc.

Track / Co-organizers

Jeffrey R. Bohn

Chief Strategy Officer, One Concern

AMLD EPFL 2022 / Tracks & talks

AMLD Keynote Session – Monday morning

Marcel Salathé, Lenka Zdeborová, Carmela Troncoso, Chiara Enderle, Patrick Barbey, Thomas Wolf, Gunther Jansen, Laure Willemin, Simon Hefti, Arthur Gassner

10:00-12:00, March 28, Auditorium A

AI & Physics

Francesca Mignacco, Gert-Jan Both, Michael Unser, Thomas Asikis, Dalila Salamani, Pietro Rotondo, Tom Beucler, Giulio Biroli

12:30-18:00, March 28, 5BC

AI & Pharma

Asif Jan, Jonas Richiardi, Patrick Schwab, Naghmeh Ghazaleh, Alexander Büsser, Carlos Ciller, Caibin Sheng, Silvia Zaoli, Félix Balazard, Giulia Capestro, Marianna Rapsomaniki, Martijn van Attekum

13:30-17:30, March 28, 1BC
