Workshop / Overview

Leave the comfort of the simple rectangular dataset behind. During this workshop we will come up with a generic yet powerful approach for dealing with multi-time-series datasets.

We will get hands-on experience with real data provided by VMware's data centers. Yes, it is real data, so we won’t get results that easily without taking care of it and massaging it a little bit. We will characterize the workload of multiple virtual machines based on a diverse set of performance measurements, like those generated by your OS's performance manager.

We will demonstrate how you can derive meaningful features that capture diverse aspects of our data. We will leverage them in several expressive data embedding models and come up with a meaningful interpretation. The hardcore theory will be left behind; however, by the end of the workshop you will have developed an intuition for the algorithms we use and how they can make our data tell us a story.

Workshop / Outcome

As a participant, you will be able to approach a real enterprise data science project. You will learn to:

  • Organize multi-time-series data for further analysis
  • Apply different time series imputation methods
  • Properly derive robust descriptive statistics from noisy time series
  • Extract more advanced features using time-series-specific methods such as the autocorrelation function (ACF) and spectral density
  • Get a high-level overview of unsupervised algorithms such as t-SNE, UMAP, and DBSCAN, and use them to model the data described above
  • Profile regions of the embedded data and interpret the results
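
To give a flavor of the first few steps in this list, here is a minimal sketch, not the workshop's actual material. It assumes a dictionary of per-VM series with gaps marked as None; the sample values, the VM names, and the helper names (interpolate_gaps, robust_stats, autocorrelation) are all hypothetical. It is written in plain Python so it runs without dependencies; libraries such as pandas provide equivalent interpolation routines.

```python
from statistics import median


def interpolate_gaps(values):
    """Fill None gaps linearly; edge gaps take the nearest observed value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(values)
    # Leading and trailing gaps: carry the nearest observation outward.
    for i in range(known[0]):
        filled[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):
        filled[i] = values[known[-1]]
    # Interior gaps: straight line between the two surrounding observations.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


def robust_stats(values):
    """Median and median absolute deviation (MAD), both outlier-resistant."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return med, mad


def autocorrelation(values, lag):
    """Sample autocorrelation at the given lag (biased estimator)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    if var == 0:
        return 0.0  # a constant series carries no correlation structure
    cov = sum((values[i] - mean) * (values[i + lag] - mean)
              for i in range(n - lag))
    return cov / var


# Hypothetical CPU-usage readings for two VMs; None marks a missing sample.
series_by_vm = {
    "vm-a": [10.0, None, 30.0, 40.0, None, None, 70.0, 500.0],
    "vm-b": [None, 5.0, 5.0, 6.0, 5.0, None, 5.0, 5.0],
}

# One feature row per VM: impute first, then summarize.
features = {}
for vm, raw in series_by_vm.items():
    clean = interpolate_gaps(raw)
    med, mad = robust_stats(clean)
    features[vm] = {
        "median": med,
        "mad": mad,
        "acf_lag1": autocorrelation(clean, 1),
    }
```

The median and MAD are used here instead of the mean and standard deviation because a single spike (such as the 500.0 reading) barely moves them. The resulting per-VM feature rows are the kind of input an embedding or clustering algorithm would consume.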

Workshop / Difficulty

Beginner level

Workshop / Prerequisites

  • Basic level in ML and statistics
  • Basic knowledge of Python
  • R practitioners are also welcome
  • Own laptop with a modern browser
  • Google account (for using Colab)
  • Repository:

Track / Co-organizers

Zhivko Kolev

Senior Data Scientist, VMware

Dimira Petrova

Data Analyst, VMware

Dragomir Nikolov

Director R&D, VMware

AMLD EPFL 2020 / Workshops

A Conceptual Introduction to Reinforcement Learning

With Kevin Smeyers, Katrien Van Meulder & Bram Vandendriessche

09:00-12:30, January 25, 1ABC

Applied Machine Learning with R

With Dirk Wulff, Markus Steiner & Michael Schulte-Mecklenbeck

09:00-17:00, January 25, Foyer 6

Augmenting the Web browsing experience using machine learning

With Oleksandr Paraska, Vasily Kuznetsov, Tudor Avram & Levan Tsinadze

09:00-12:30, January 25, 3A
