Cross-lingual Diachronic Distance: Application to Portuguese and Spanish

Authors: José Ramom Pichel, Pablo Gamallo, Iñaki Alegria
Type: Journal article
Source: Procesamiento del Lenguaje Natural, Sociedad Española para el Procesamiento del Lenguaje Natural, No. 63, pp. 77-84, 2019.
Abstract: The aim of this paper is to establish a corpus-based methodology for automatically measuring the cross-lingual distance between historical periods of two languages using perplexity. The corpus for each language was built ad hoc, using the spelling closest to the original and representing fiction and non-fiction chronologically and in a balanced way. The methodology was applied to two related languages, Portuguese and Spanish, and their diachronic distances were measured both in the original orthography and in an automatically transcribed spelling.
Keywords: Corpus linguistics, Historical linguistics, Language distance, Development of linguistic resources and tools
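
The perplexity-based distance mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the language model, so everything below is an assumption made for illustration: a character trigram model, add-k smoothing, and a symmetric distance taken as the mean of the two cross-perplexities. The function names (`train`, `perplexity`, `distance`) and the toy strings are hypothetical and do not come from the paper or its corpus.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Return one character n-gram per character of `text` (left-padded)."""
    padded = " " * (n - 1) + text
    return [padded[i:i + n] for i in range(len(text))]


def train(text, n=3):
    """Count n-grams and their (n-1)-character contexts in `text`."""
    grams = char_ngrams(text, n)
    ngram_counts = Counter(grams)
    context_counts = Counter(g[:-1] for g in grams)
    return ngram_counts, context_counts, set(text)


def perplexity(model, text, n=3, k=1.0):
    """Perplexity of `text` under `model`, with add-k smoothing (assumed)."""
    if not text:
        raise ValueError("text must be non-empty")
    ngram_counts, context_counts, vocab = model
    v = len(vocab | set(text))  # smoothing vocabulary: train + test characters
    grams = char_ngrams(text, n)
    log_prob = 0.0
    for g in grams:
        p = (ngram_counts[g] + k) / (context_counts[g[:-1]] + k * v)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(grams))


def distance(text_a, text_b, n=3):
    """Symmetric distance: mean of the two cross-perplexities (assumed)."""
    return 0.5 * (perplexity(train(text_a, n), text_b, n)
                  + perplexity(train(text_b, n), text_a, n))


# Toy strings standing in for one period of each language (not corpus data).
period_pt = "a lingua portuguesa mudou ao longo dos seculos"
period_es = "la lengua espanola cambio a lo largo de los siglos"
print(distance(period_pt, period_es))
```

Under these assumptions, a lower value means the two texts are closer: a model trained on one period predicts the other period's character sequences with less surprise. To compare original and transcribed spelling, as the abstract describes, the same function would be run on both versions of each corpus.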