Towards Exploiting Automatic Speech Recognition Error For Generating Synthetic Clinical Speech Data

Talk / Overview

Applying natural language processing (NLP) techniques to clinical speech is a low-cost and effective approach to many healthcare challenges, for example detecting levels of cognition from speech tasks for the diagnosis of Alzheimer’s Disease (AD). One of the primary factors limiting the success of artificial intelligence (AI) solutions in health is the amount of relevant data available; clinical data is expensive to collect and rarely sufficient for large-scale machine learning. With the increasing demand for AI in health systems, generating synthetic clinical data that maintains the nuance of underlying patient pathology is the next pressing task. Previous work has shown that automated evaluation of clinical speech tasks via automatic speech recognition (ASR) is comparable to manually annotated results in diagnostic scenarios, even though ASR systems produce errors during the transcription process, namely deletions. In this work, we propose to generate additional synthetic clinical data by simulating ASR errors on the transcript at the rate at which the errors naturally occur. Using an age- and education-balanced dataset of 50 cognitively impaired and 50 healthy Dutch speakers, 1000 additional data points are synthetically generated for each subject. Clinically relevant features are extracted from the manually annotated, original ASR, and generated files and statistically compared to check for significant differences between the three classes of data as well as in their diagnostic value.
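The idea of simulating ASR errors on a transcript can be illustrated with a minimal sketch. It is not the implementation used in this work: it models deletions only, drops each word independently with a fixed probability, and uses a hypothetical deletion rate and an English placeholder transcript. The function names `simulate_asr_deletions` and `generate_synthetic_transcripts` are illustrative assumptions.

```python
import random


def simulate_asr_deletions(tokens, deletion_rate, rng):
    """Return a copy of `tokens` in which each token is dropped
    independently with probability `deletion_rate`."""
    return [tok for tok in tokens if rng.random() >= deletion_rate]


def generate_synthetic_transcripts(transcript, deletion_rate, n_samples, seed=0):
    """Generate `n_samples` noisy variants of a whitespace-tokenized transcript."""
    rng = random.Random(seed)  # seeded for reproducibility
    tokens = transcript.split()
    return [
        " ".join(simulate_asr_deletions(tokens, deletion_rate, rng))
        for _ in range(n_samples)
    ]


# Hypothetical inputs: placeholder transcript and an assumed deletion rate.
# In practice the rate would be estimated by aligning ASR output with the
# manual transcript.
transcript = "the boy is reaching for the cookie jar while the sink overflows"
samples = generate_synthetic_transcripts(transcript, deletion_rate=0.1, n_samples=1000)
```

Each generated variant is a subsequence of the original transcript, so the same feature extraction can be run on the manual, ASR, and synthetic versions for comparison.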

Talk / Speakers

Hali Lindsay

M.Sc., German Research Center for Artificial Intelligence

Talk / Slides

Download the slides for this talk (PDF, 1063.1 MB).

Talk / Highlights