Modern data science pipelines involve a complex array of operations with many sources of stochastic behaviour, some controlled and some uncontrolled. In pharma, an exemplar scenario is data-driven biomarker selection. This talk will discuss statistical methods to quantify the stability, and hence the *reproducibility*, of results produced by such pipelines.
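As a rough illustration of what quantifying stability can mean for biomarker selection, the sketch below reruns a toy selector on bootstrap resamples and reports the mean pairwise Jaccard similarity of the selected feature sets. This is not necessarily the method presented in the talk: the selector (`select_top_k`), the simulated data (`simulate`) and all parameter values are hypothetical, and mean pairwise Jaccard is just one simple, chance-uncorrected choice of stability measure.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; taken to be 1.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def selection_stability(selections):
    """Mean pairwise Jaccard similarity across repeated selection runs."""
    pairs = list(combinations(selections, 2))
    if not pairs:
        raise ValueError("need at least two selection runs")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def select_top_k(rows, labels, k):
    """Hypothetical selector: keep the k features with the largest
    absolute difference between the two class means."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        if not pos or not neg:
            # A resample can, in principle, miss a class entirely.
            scores.append(0.0)
        else:
            scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def bootstrap_selections(rows, labels, k, n_boot, seed=0):
    """Run the selector on n_boot bootstrap resamples of the data."""
    rng = random.Random(seed)
    n = len(rows)
    runs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        runs.append(
            select_top_k([rows[i] for i in idx], [labels[i] for i in idx], k)
        )
    return runs


def simulate(n_samples, n_features, n_informative, effect, seed=0):
    """Assumed toy data: Gaussian noise, with the first n_informative
    features shifted by `effect` in class 1."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n_samples):
        y = i % 2
        rows.append([
            rng.gauss(effect * y if j < n_informative else 0.0, 1.0)
            for j in range(n_features)
        ])
        labels.append(y)
    return rows, labels


if __name__ == "__main__":
    for effect in (0.2, 2.0):
        rows, labels = simulate(60, 50, 3, effect, seed=1)
        runs = bootstrap_selections(rows, labels, k=3, n_boot=30, seed=2)
        print(f"effect={effect}: stability={selection_stability(runs):.2f}")
```

A value near 1 means the same features are selected on almost every resample, while a value near 0 means the selected set changes from run to run. Because this measure is not corrected for chance agreement, values are only comparable between runs that select the same number of features from the same pool.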
Download the slides for this talk (PDF, 6369.69 MB).