LinguaKit: a Big Data-based multilingual tool for linguistic analysis and information extraction

Title: LinguaKit: a Big Data-based multilingual tool for linguistic analysis and information extraction
Authors: Pablo Gamallo, Marcos Garcia, César Piñeiro, Rodrigo Martínez-Castaño and Juan C. Pichel
Type: Conference paper
Source: International Conference on Social Networks Analysis, Management and Security, Valencia (Spain), pp. 239-244, 2018.
ISBN: 978-1-5386-9588-3
DOI: 10.1109/SNAMS.2018.8554689
Abstract: This paper presents LinguaKit, a multilingual suite of tools for analysis, extraction, annotation and linguistic correction, as well as its integration into a Big Data infrastructure. LinguaKit allows the user to perform different tasks such as PoS-tagging, syntactic parsing, and coreference resolution (among others), including applications for relation extraction, sentiment analysis, summarization, extraction of multiword expressions, and entity linking to DBpedia. Most modules work in four languages: Portuguese, Spanish, English, and Galician. The system is programmed in Perl and is freely available under a GPLv3 license.