Clinical trials are designed to answer a single scientific question, such as the efficacy or safety of a new therapeutic. Beyond this initial purpose, there is great interest in how to re-analyse clinical trial data to support new scientific discoveries, such as the identification of prognostic measures of disease or predictive variables of treatment efficacy. For example, identifying patients with an increased treatment effect may help inform future treatment policies. This task is often referred to as subgroup identification and has been framed the "hardest problem there is". Several solutions to subgroup identification have been proposed from statistical models to machine learning approaches. In this presentation, we present on a company-internal subgroup challenge aimed at developing approaches to this problem for future clinical trials. The challenge took place on the data42 platform, an environment that gathers a large amount of clinical and RWE data and allows the use of several data analysis tools, fostering collaboration and reproducibility. To mimic a realistic setting, participants had access to 4 Phase III clinical trials to derive a subgroup and predict its treatment effect on a future study that was not accessible to challenge participants. We present the challenge aims, structure and results, and we summarise what we can learn from its outcomes to inform scientific research.
Download the slides for this talk.Download ( PDF, 939.77 MB)