Unsupervised Compositional Translation of Multiword Expressions

Authors: Pablo Gamallo, Marcos Garcia
Type: Conference paper
Source: Proceedings of the Joint Workshop on Multiword Expressions and WordNet, at ACL-2019, Florence, Italy, 2019.
Abstract: This article describes a dependency-based strategy that uses compositional distributional semantics and cross-lingual word embeddings to translate multiword expressions (MWEs). Our unsupervised approach performs translation as a process of word contextualization by taking into account lexico-syntactic contexts and selectional preferences. This strategy is well suited to translating phraseological combinations and phrases whose constituent words are lexically restricted by each other. Several experiments on adjective-noun and verb-object compounds show that mutual contextualization (co-compositionality) clearly outperforms other compositional methods. The paper also contributes a new, freely available dataset of English-Spanish MWEs used to validate the proposed compositional strategy.
Keywords: natural language processing, unsupervised translation, multiword extraction, compositional distributional semantics