Workshop / Overview
Find the workshop material on GitHub: https://github.com/versatile-data-kit-amld/workshop/blob/main/README.md
Enterprises nowadays invest heavily in data analytics and data science. One well-recognized problem is the inability to move a project quickly from proof-of-concept to production without involving multiple teams in the organization. Versatile Data Kit is an open-source project that brings some of the best DevOps practices into the data world.
Participants will get hands-on experience with a modern data engineering tool that lets data practitioners bring in new data, transform raw data into business-meaningful KPIs, perform feature engineering, and create data science models. The tool also makes it possible to move predictive models from proof-of-concept to production in a self-service manner.
During the workshop, participants will perform feature engineering, build their own predictive models, and move them to production.
We will introduce the main challenges faced when moving a predictive model into production, as well as the best DevOps practices in the data world, which are easily accessible to any data user. We will start with the process of data acquisition. Next, we will perform feature engineering and run a predictive model. Once the model is tested, we will make sure it is deployed and scheduled, and that we can monitor our data science product.
Workshop / Outcome
Attendees will discover how to move their data science projects, along with the underlying data processing and feature engineering, to production. They will be introduced to some of the well-established DataOps practices.
Attendees will gain hands-on experience applying these best practices using a new open-source tool called Versatile Data Kit.
Workshop / Difficulty
Workshop / Prerequisites
Each attendee is expected to take an active part in feature engineering and in developing a predictive model. Attendees can form small teams if they wish to work on the same laptop. A recent laptop is recommended, but no GPU is required.
Attendees should have some knowledge of Python and/or SQL and a basic understanding of the data preparation process.
Workshop repo: https://github.com/versatile-data-kit-amld/workshop/blob/main/README.md
Track / Co-organizers
AMLD EPFL 2022 / Workshops
MLOps on AWS: a Hands-On Tutorial
With Gabriele Mazzola, Emanuele Fabbiani, Marco Paruscio, Matteo Moroni, Marta Peroni & Gabriele Orlandi, 09:00-13:00, March 26, 2ABC
Designing Effective Visualisations to Communicate Data Stories
With Jacqueline Stählin, Charlotte Cabane, Diana Mitache & Sebastian Baumhauer, 10:00-16:00, March 26, 4ABC