Talk / Overview

Machine translation is one of the most successful text processing applications. Current state-of-the-art systems leverage large amounts of translated text to learn how to translate, but is it possible to translate between two languages without any bilingual data? In this presentation we will show that this is indeed the case. We will first map the word embedding spaces of two languages to each other, with and without seed bilingual dictionaries. This makes it possible to produce accurate bilingual dictionaries from monolingual corpora alone, with the same quality as supervised methods. Based on these mappings, it is then possible to train machine translation systems without accessing any bilingual data.
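
To make the idea of mapping one embedding space onto another concrete, here is a minimal sketch of the seed-dictionary case only. It is not the speaker's method: it assumes toy 2-D embeddings with invented values, restricts the mapping to a single rotation (a simplified stand-in for the linear maps used with real, high-dimensional embeddings), and uses hypothetical helper names (`fit_rotation`, `induce_dictionary`). The variant without a seed dictionary is not covered.

```python
import math

# Toy 2-D "embeddings" (invented values, for illustration only).
# The Basque space is the English space rotated by 90 degrees.
en_emb = {"dog": (1.0, 0.0), "cat": (0.8, 0.6),
          "house": (0.0, 1.0), "water": (-0.6, 0.8)}
eu_emb = {"txakur": (0.0, 1.0), "katu": (-0.6, 0.8),
          "etxe": (-1.0, 0.0), "ur": (-0.8, -0.6)}

# Small seed dictionary: only two known translation pairs.
seed = [("dog", "txakur"), ("house", "etxe")]


def fit_rotation(pairs):
    """Angle of the rotation that best maps source onto target vectors
    in the least-squares sense (closed form, valid in 2-D only)."""
    cross = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in pairs)
    dot = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in pairs)
    return math.atan2(cross, dot)


def rotate(vec, theta):
    """Rotate a 2-D vector counter-clockwise by theta radians."""
    x, y = vec
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    return (a[0] * b[0] + a[1] * b[1]) / (math.hypot(*a) * math.hypot(*b))


def induce_dictionary(src_emb, tgt_emb, theta):
    """Map every source word into the target space and pick the
    nearest target word by cosine similarity."""
    result = {}
    for word, vec in src_emb.items():
        mapped = rotate(vec, theta)
        result[word] = max(tgt_emb, key=lambda t: cosine(mapped, tgt_emb[t]))
    return result


# Learn the mapping from the seed pairs, then translate all words,
# including "cat" and "water", which were not in the seed dictionary.
theta = fit_rotation([(en_emb[s], eu_emb[t]) for s, t in seed])
induced = induce_dictionary(en_emb, eu_emb, theta)
print(induced)
```

In this toy setup the two seed pairs are enough to recover the rotation, so the unseen words are also matched to their translations by nearest-neighbour search. Real embedding spaces are only approximately related by such a map, so results there are noisier.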

Talk / Speakers

Eneko Agirre

Professor, University of the Basque Country

Talk / Slides

Download the slides for this talk (PDF, 2473.2 MB).

Talk / Highlights

30:42

Cross-linguality and machine translation without bilingual data

With Eneko Agirre. Published March 12, 2020

AMLD / Global partners