The Impact of Linguistic Knowledge in Different Strategies to Learn Cross-Lingual Distributional Models

Title: The Impact of Linguistic Knowledge in Different Strategies to Learn Cross-Lingual Distributional Models
Authors: Pablo Gamallo
Type: Conference paper
Source: European Association for Artificial Intelligence, Santiago de Compostela (Spain), Frontiers in Artificial Intelligence and Applications (Volume 325), pp. 2014-2021, 2020.
ISBN: 978-1-64368-100-9
DOI: 10.3233/FAIA200322
Abstract: In recent years, with the emergence of neural networks and word embeddings, there has been growing interest in cross-lingual distributional models learned from monolingual corpora to induce bilingual lexicons. However, interest in these models predates the emergence of deep learning. In this article, we study the differences between recent strategies, which are based on the alignment of models, and older methods, which use bilingual anchors to align the text itself. We also analyze the impact of including different levels of linguistic knowledge (e.g. lemmatization, PoS tagging, syntactic dependencies) in the process of building cross-lingual models for English and Spanish. Our experiments show that syntactic information benefits traditional models based on text alignment but harms mapped cross-lingual embeddings.
Keywords: Cross-Lingual Embeddings, Monolingual Corpora, Information Extraction, Natural Language Processing