Inproceedings,

Obtaining a Better Understanding of Distributional Models of German Derivational Morphology

Proceedings of IWCS, pages 58–63. London, UK (2015)

Abstract

Derivationally related words (read / read+er) usually have closely related meanings. It is an interesting challenge for distributional semantics to account for this relationship by predicting the meaning (represented as a vector) of a derived term (read+er) from the meaning of its base term (read). Previous work has framed this task as an instance of compositional meaning construction, but its properties are not yet well understood. Our goal is to better understand the factors influencing performance on this task via quantitative and qualitative analysis of two existing composition models on a set of German derivation patterns (e.g., -in, durch-). We begin by introducing a rank-based evaluation metric that provides a more relevant assessment of the models’ practical value and reveals the task to be challenging due to specific properties of German (compounding, capitalization). We also find that performance varies greatly between patterns and even among base-derived term pairs of the same pattern. A regression analysis shows that semantic coherence of the base and derived terms within a pattern, as well as coherence of the semantic shifts from base to derived terms, all significantly impact prediction quality. Finally, we investigate false positives, finding that different models capture complementary aspects of the semantic shifts.
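The prediction task and the rank-based evaluation described in the abstract can be illustrated with a minimal sketch. It assumes a simple additive model (derived vector = base vector + average shift for the pattern), which is an illustrative choice and not necessarily one of the two composition models the paper evaluates. The three-dimensional vectors are invented toy data, and the function names (`mean_shift`, `predict`, `rank_of_gold`) are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mean_shift(pairs, vectors):
    """Average (derived - base) offset over the training pairs of one pattern."""
    dim = len(next(iter(vectors.values())))
    shift = [0.0] * dim
    for base, derived in pairs:
        for i in range(dim):
            shift[i] += vectors[derived][i] - vectors[base][i]
    return [s / len(pairs) for s in shift]

def predict(base, shift, vectors):
    """Additive prediction: base vector plus the pattern's average shift."""
    return [b + s for b, s in zip(vectors[base], shift)]

def rank_of_gold(predicted, gold, vectors, exclude=()):
    """1-based rank of the gold derived term among all vocabulary words,
    ordered by cosine similarity to the predicted vector."""
    gold_sim = cosine(predicted, vectors[gold])
    better = sum(
        1
        for word, vec in vectors.items()
        if word != gold and word not in exclude and cosine(predicted, vec) > gold_sim
    )
    return better + 1

# Invented toy vectors for the German -in pattern (not real corpus data).
vectors = {
    "Lehrer": [1.0, 0.0, 0.0], "Lehrerin": [1.0, 1.0, 0.0],
    "Maler": [0.0, 0.0, 1.0], "Malerin": [0.0, 1.0, 1.0],
    "Student": [1.0, 0.0, 1.0], "Studentin": [1.0, 1.0, 1.0],
}
train_pairs = [("Lehrer", "Lehrerin"), ("Maler", "Malerin")]

shift = mean_shift(train_pairs, vectors)
predicted = predict("Student", shift, vectors)
rank = rank_of_gold(predicted, "Studentin", vectors, exclude={"Student"})
print(rank)  # 1 on this toy data
```

A rank of 1 means the gold derived term is the nearest neighbour of the predicted vector. On real vocabularies the rank can be much higher, since compounds and capitalization variants compete as neighbours, which is the kind of difficulty the abstract attributes to German.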
