A dependency-based approach to word contextualization using compositional distributional semantics

We propose a strategy for building the distributional meaning of sentences that relies on two types of semantic objects: context vectors associated with content words, and compositional operations driven by syntactic dependencies. The compositional operations associated with a syntactic dependency take two input vectors and build two new vectors representing the contextualized senses of the two related words. Given a sentence, the iterative application of these dependency operations yields as many contextualized vectors as there are content words in the sentence. At the end of the compositional semantic process, we do not obtain a single vector representing the semantic denotation of the whole sentence (or of its root word), but one contextualized vector for each constituent word. By defining all words as first-order tensors (i.e., standard vectors), our method avoids the troublesome higher-order tensor representations of approaches based on category theory. We carry out corpus-based experiments both to evaluate the quality of the compositional vectors built with our strategy and to compare them with other approaches to compositional distributional semantics. The experiments show that our dependency-based compositional method performs as well as (or even better than) the state of the art.

Keywords: compositional distributional semantics, syntactic dependencies, word contextualization
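
To make the iteration described in the abstract concrete, the following is a minimal sketch, assuming a toy composition function: each dependency (head, dependent) takes the two current word vectors and returns two new contextualized vectors, and applying every dependency of the sentence leaves one vector per content word rather than a single sentence vector. All names are hypothetical, and the additive/multiplicative combination used here is an illustrative assumption, not the paper's actual dependency-specific operations.

```python
import numpy as np

def compose(head_vec, dep_vec, alpha=0.5):
    """Toy composition for one dependency: return two NEW vectors,
    one per related word. The real operations are defined per
    dependency type in the paper; this mix is only an assumption."""
    shared = head_vec * dep_vec              # features the two words share
    new_head = head_vec + alpha * shared     # head contextualized by dependent
    new_dep = dep_vec + alpha * shared       # dependent contextualized by head
    return new_head, new_dep

def contextualize(vectors, dependencies):
    """Iteratively apply each (head, dependent) dependency of the
    sentence; the output keeps one contextualized vector per content
    word instead of collapsing everything into a sentence vector."""
    vecs = {word: vec.copy() for word, vec in vectors.items()}
    for head, dep in dependencies:
        vecs[head], vecs[dep] = compose(vecs[head], vecs[dep])
    return vecs

# Toy example: "dogs chase cats" with dependencies nsubj(chase, dogs)
# and dobj(chase, cats); random vectors stand in for corpus-derived ones.
rng = np.random.default_rng(0)
vectors = {w: rng.random(4) for w in ("dogs", "chase", "cats")}
contextualized = contextualize(vectors, [("chase", "dogs"), ("chase", "cats")])
print({w: v.round(3) for w, v in contextualized.items()})
```

Note that every word involved in a dependency is updated, so the verb "chase" ends up contextualized by both its subject and its object, while each argument is contextualized by the verb; this is the sense in which the method produces as many contextualized vectors as content words.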