Spoken Language Understanding on the Edge

Talk / Overview

Spoken Language Understanding (SLU) is the task of extracting meaning from a spoken utterance. Over the last few years, thanks in part to the steady improvements that deep learning has brought to Automatic Speech Recognition (ASR), voice interfaces implementing SLU have evolved from spotting a limited set of predetermined keywords to understanding arbitrary formulations of a given intention, and they are becoming ubiquitous in connected devices. Most current solutions, however, offload their processing to the cloud, where computationally demanding engines can be deployed. The size of these models, along with the computational resources needed to run them in real time, makes them unfit for deployment on small devices. Running SLU on the edge nevertheless offers several advantages. First, on-device processing removes the need to send speech or other personal data to third-party servers, thereby guaranteeing a high level of privacy. Additional benefits include reduced latency, offline capabilities, and the possibility of user-based personalization. This talk will outline the design of an embedded, private-by-design SLU system that offers all the advantages of edge computing, with performance on par with commercial cloud-based solutions in closed-domain settings.

Talk / Speakers

Francesco Caltagirone

Senior Manager, Sonos

Talk / Highlights

20:03