Naive-Bayesian Classification for Bot Detection in Twitter Notebook for PAN at CLEF 2019

Title: Naive-Bayesian Classification for Bot Detection in Twitter Notebook for PAN at CLEF 2019
Authors: Pablo Gamallo, Sattam Almatarneh
Type: Conference paper
Source: Conference and Labs of the Evaluation Forum, Lugano (Switzerland), 2019.
ISSN: 1613-0073
Abstract: This article describes a system that participated in the Bots and Gender Profiling shared task at PAN 2019. The first objective of the task is to detect whether the author of a Twitter account is a bot or a human; if the author is human, the second objective is to identify the user's gender. For this purpose, we present a Bayesian strategy based on features, including specific content of tweets and automatically built lexicons. The best configuration of features reached an accuracy of 0.88 on the official Spanish test dataset and 0.81 on the English one for the bot/human classification. For gender profiling, the scores we obtained were lower, around 0.70.
Keywords: Bot Detection, Gender Detection, Naive-Bayesian Classification