Análisis morfosintáctico y clasificación de entidades nombradas en un entorno Big Data

Title: Análisis morfosintáctico y clasificación de entidades nombradas en un entorno Big Data
Authors: Pablo Gamallo, Juan Carlos Pichel, Marcos Garcia, José Manuel Abuín, Tomás Fernández-Pena
Type: Journal article
Source: Procesamiento del Lenguaje Natural, Sociedad Española para el Procesamiento del Lenguaje Natural, No. 53, pp. 17-24, 2014.
ISSN: 1135-5948
Abstract: This article describes a suite of linguistic modules for the Spanish language based on a pipeline architecture, which contains tasks for PoS tagging and Named Entity Recognition and Classification (NERC). We have applied run-time parallelization techniques in a Big Data environment in order to make the suite of modules more efficient and scalable, and thereby to reduce computation time in a significant way. Therefore, we can address problems at Web scale. The linguistic modules have been developed using basic NLP techniques in order to easily integrate them in distributed computing environments. The qualitative performance of the modules is close to the state of the art.
Keywords: PoS tagging, Named Entity Recognition, Big Data, Parallel Computing