Modern data science pipelines involve a complex array of operations, with many sources of stochastic behaviour, some controlled, some uncontrolled. In pharma, an exemplar scenario is that of data-driven biomarker selection. This talk will discuss statistical methods to quantify the stability and hence *reproducibility* of results coming from such pipelines.

Gavin Brown

University of Manchester

On the Stability and Reproducibility of Data Science Pipelines

With Gavin Brown

Published March 12, 2020

