A Machine Learning approach for Subjectivity Classification based on Positional and Discourse Features

In recent years, several machine learning methods have been proposed to detect subjective (opinionated) expressions in online documents. This task is important in many Opinion Mining and Sentiment Analysis applications. However, opinion extraction often relies on coarse content-based features. In this paper, we study how structural features can guide sentence-level subjectivity classification. More specifically, we combine classical n-gram features with novel features derived from positional information and from the discourse structure of sentences. Our experiments show that these new features improve the classification of subjective sentences.

keywords: Information Retrieval, Opinion Mining, Subjectivity Classification, Sentiment Analysis, Machine Learning, Rhetorical Structure Theory
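
To make the feature combination described in the abstract concrete, the following minimal sketch shows one way n-gram counts and positional information could be merged into a single feature dictionary per sentence. It is an illustration under stated assumptions, not the feature set used in our experiments: the function names (`ngram_features`, `sentence_features`), the feature keys, and the choice of relative position plus first/last indicators are all hypothetical. Discourse features are omitted because they would require an external Rhetorical Structure Theory parser.

```python
from collections import Counter


def ngram_features(tokens, n_max=2):
    """Count all n-grams of length 1..n_max in a token list."""
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats["ng=" + " ".join(tokens[i:i + n])] += 1
    return feats


def sentence_features(sentences, index, n_max=2):
    """Build a feature dict for sentences[index] within its document.

    Assumption: sentences are already split and whitespace-tokenized.
    """
    tokens = sentences[index].lower().split()
    feats = dict(ngram_features(tokens, n_max))

    # Hypothetical positional features: where the sentence sits in the document.
    last = max(len(sentences) - 1, 1)  # avoid division by zero for one-sentence documents
    feats["pos=relative"] = index / last
    feats["pos=is_first"] = 1.0 if index == 0 else 0.0
    feats["pos=is_last"] = 1.0 if index == len(sentences) - 1 else 0.0
    return feats


document = [
    "I loved this film .",
    "It runs two hours .",
    "The ending was awful .",
]
features = sentence_features(document, 2)
```

A dictionary of this kind can be passed to any classifier that accepts sparse named features; keeping the n-gram and positional keys in separate namespaces (`ng=`, `pos=`) makes it straightforward to ablate one group when measuring its contribution.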