Semi-structured data like a combination of images and tabular patient data are common in health care to predict a patient’s outcome and estimate effect estimates of risk factors. Deep learning models have proven outstanding prediction performances on unstructured data but often lack interpretability in favor of prediction performance. Classical statistical models, on the other hand, provide interpretable effect estimates such as odds ratios but only apply to structured tabular data. Here, we present a novel class of models that join deep learning with classical statistical approaches and enable the integration of semi-structured data while achieving state-of-the-art results and yielding interpretable effect estimates. We demonstrate on data from stroke patients how to use this class of interpretable models for predicting the outcomes, identifying risk factors, and assessing the relative importance of the different data modalities for a high performance.