Inducing a Computational Lexicon from a Corpus with Syntactic and Semantic Annotation

Proceedings of IWCS-7, Tilburg, The Netherlands (2007)

Abstract

To date, linguistically annotated corpora are mainly exploited for feature-based training of automatic labelling systems. In this paper, we present a general approach for the Description Logics-based modelling of multi-layered annotated corpora which offers (i) flexible and enhanced querying functionality that goes beyond current XML-based query languages, (ii) a basis for consistency checking, and (iii) a general method for defining abstractions over corpus annotations. We apply this method to the syntactically and semantically annotated SALSA/TIGER corpus. By defining abstractions over the corpus data, we generalise from a large set of individual corpus annotations to a corresponding lexicon model. We discuss issues arising from modelling multi-layered corpus annotations in Description Logics and illustrate the benefits of our approach with concrete examples.
