Advances in artificial intelligence (AI) have touched nearly every industry and scientific discipline. Machine learning (ML) models are now routinely used throughout academia, but often by practitioners with little to no practical training in ML. Moreover, most AI/ML resources focus on industry applications, where the purpose and goals of ML models often differ from those in academia.
We therefore propose a full-day workshop focused on AI/ML in science. The workshop will cover topics critical to the application of ML models in scientific contexts, including integrating domain knowledge into machine learning models (e.g., encoding physical constraints), ensuring reproducibility, and following best development practices.
The workshop will begin with a theoretical introduction to AI/ML in science, followed by a hands-on session that introduces participants to best practices and to encoding physics into ML models.
In the theoretical session, we will motivate the problem by showing concrete examples from astrophysics where physical constraints are not only desirable but necessary, and we will draw connections to other scientific and engineering fields. We will then present methods that are designed to be physics-preserving, such as equivariant neural networks. We will close the theoretical session by connecting ML development to traditional computational mathematics frameworks, as laid out in Weinan E’s review paper [2].
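To make the idea of a physics-preserving model concrete, the toy sketch below checks rotation equivariance numerically: rotating the inputs and then applying the map should give the same result as applying the map and then rotating the output. It is purely illustrative and is not taken from the workshop material; the names `rotate` and `equivariant_layer` and the parameters `w` and `b` are hypothetical, and the layer is a hand-written function rather than a trained network.

```python
import math
import random


def rotate(v, theta):
    """Rotate a 2D vector counter-clockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def equivariant_layer(points, w=0.5, b=0.1):
    """Map a set of 2D vectors to a single 2D vector (hypothetical toy layer).

    Each vector is scaled by a function of its own norm, which is a
    rotation-invariant quantity, and the scaled vectors are summed.
    Rotating every input therefore rotates the output by the same angle.
    """
    out_x, out_y = 0.0, 0.0
    for x, y in points:
        scale = math.tanh(w * math.hypot(x, y) + b)
        out_x += scale * x
        out_y += scale * y
    return (out_x, out_y)


# Random test inputs with a fixed seed so the check is repeatable.
rng = random.Random(0)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(5)]
theta = 0.7

rotated_then_mapped = equivariant_layer([rotate(p, theta) for p in points])
mapped_then_rotated = rotate(equivariant_layer(points), theta)

# The two results should agree up to floating-point rounding.
max_error = max(
    abs(a - b) for a, b in zip(rotated_then_mapped, mapped_then_rotated)
)
print(f"max equivariance error: {max_error:.2e}")
```

The constraint here is built into the functional form rather than learned from data, which is the general design idea behind equivariant architectures.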
In the practical session, before exploring physically constrained methods, we will spend an hour on practices and tools that aid the development of ML models in a scientific context (e.g., AutoML frameworks), with particular attention to reproducibility, which is of critical importance to the scientific method. Attendees will then be able to experiment with the presented methods on a series of novel datasets from different domains, such as astrophysics [1] and engineering.
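As a rough illustration of the kind of reproducibility practice meant here, the sketch below fixes the random seed through an explicit configuration and derives a short fingerprint from that configuration, so that a result can be traced back to the exact settings that produced it. This is an assumption-laden toy example, not the workshop's tooling; `run_experiment`, `config_fingerprint`, and the configuration keys are hypothetical names.

```python
import hashlib
import json
import random


def run_experiment(config):
    """Hypothetical stand-in for a training run driven entirely by `config`."""
    # A local generator avoids hidden dependence on global random state.
    rng = random.Random(config["seed"])
    data = [rng.gauss(0.0, 1.0) for _ in range(config["n_samples"])]
    return {"mean": sum(data) / len(data)}


def config_fingerprint(config):
    """Short, order-independent hash identifying a configuration."""
    blob = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()[:12]


config = {"seed": 42, "n_samples": 1000}

first = run_experiment(config)
second = run_experiment(config)

# Same configuration, same result: the run is repeatable.
print("repeatable:", first == second)
print("config fingerprint:", config_fingerprint(config))
```

Storing the fingerprint alongside each result is one simple way to tie reported numbers to the settings used; real projects would typically also record code and dependency versions.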
[1] Timpe, Miles et al. (2020), "Simulations of planetary-scale collisions between rotating, differentiated bodies", v2, Dryad, Dataset, https://doi.org/10.5061/dryad.j6q573n94
[2] Weinan E (2020), "Machine learning and computational mathematics"
Participants will learn about the current state of encoding physical constraints in ML models, as well as good practices for ML development in a scientific context, with a focus on reproducibility.
Intermediate level
Participants should be comfortable with Python. They should bring a laptop and be able to run Google Colab notebooks. Slides and datasets will be provided in advance.