Multilingual Open Information Extraction

Open Information Extraction (OIE) is a recent unsupervised strategy to extract great amounts of basic propositions (verb-based triples) from massive text corpora which scales to Web-size document collections. We propose a multilingual rule-based OIE method that takes as input dependency parses in the CoNLL-X format, identifies argument structures within the dependency parses, and extracts a set of basic propositions from each argument structure. Our method requires no training data and, according to experimental studies, obtains higher recall and higher precision than existing approaches relying on training data. Experiments were performed in three languages: English, Portuguese, and Spanish.

keywords: