Yet another suite of multilingual NLP tools

Title: Yet another suite of multilingual NLP tools
Authors: Marcos García, Pablo Gamallo
Type: Conference paper
Source: Third Symposium on Languages, Applications and Technologies, Madrid (Spain), pp. 65-75, 2015.
ISBN: 978-84-606-8762-7
ISSN: 1865-0929
DOI: 10.1007/978-3-319-27653-3_7
Abstract: This paper presents the current development of a multilingual suite for Natural Language Processing. It consists of a sentence chunker, a tokenizer, a PoS-tagger, a dictionary-based lemmatizer and a Named Entity Recognizer (both for enamex and numex expressions). The architecture of the pipeline and the main resources used for its development are described. Besides, the PoS-tagger and Named Entity Recognizer are evaluated against several state-of-the-art systems. The experiments performed in Portuguese and English show that, in spite of its simplicity, our system competes with some well-known tools for NLP. It is entirely written in Perl and distributed under a GPL license.
Keywords: natural language processing, PoS-tagging, named entity recognition, Portuguese, English