Inproceedings,

Predicting Degrees of Technicality in Automatic Terminology Extraction

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 2883–2889. Online, Association for Computational Linguistics (July 2020)
DOI: 10.18653/v1/2020.acl-main.258

Abstract

While automatic term extraction is a well-researched area, computational approaches to distinguish between degrees of technicality are still understudied. We semi-automatically create a German gold standard of technicality across four domains, and illustrate the impact of a web-crawled general-language corpus on technicality prediction. When defining a classification approach that combines general-language and domain-specific word embeddings, we go beyond previous work and align vector spaces to gain comparative embeddings. We suggest two novel models to exploit general- vs. domain-specific comparisons: a simple neural network model with pre-computed comparative-embedding information as input, and a multi-channel model computing the comparison internally. Both models outperform previous approaches, with the multi-channel model performing best.
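The alignment step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which alignment method or comparison features the authors use, so everything below is an assumption: orthogonal Procrustes stands in for the alignment, the toy embeddings are random, and the names `procrustes_align` and `comparative_features` are hypothetical helpers, not the paper's code.

```python
import numpy as np


def procrustes_align(source, target):
    """Return the orthogonal matrix W minimizing ||source @ W - target||_F."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


def comparative_features(general_vec, domain_vec):
    """Concatenate the element-wise difference with the cosine similarity."""
    cos = general_vec @ domain_vec / (
        np.linalg.norm(general_vec) * np.linalg.norm(domain_vec)
    )
    return np.concatenate([general_vec - domain_vec, [cos]])


rng = np.random.default_rng(0)

# Toy data (assumption): 50 shared vocabulary items, 8-dimensional embeddings.
general = rng.normal(size=(50, 8))
rotation, _ = np.linalg.qr(rng.normal(size=(8, 8)))  # hidden orthogonal map
domain = general @ rotation                          # same geometry, rotated space
domain[0] = rng.normal(size=8)                       # word 0 behaves differently in the domain

# Learn the map on anchor words (rows 1..49), then bring every domain vector
# into the general-language space.
W = procrustes_align(domain[1:], general[1:])
aligned = domain @ W

# One comparative feature vector per word: 8 difference values plus 1 cosine.
features = np.array(
    [comparative_features(g, d) for g, d in zip(general, aligned)]
)
```

In this toy setup the anchor words end up with a cosine close to 1 after alignment, while word 0 does not. A feature vector of this kind is one plausible form of the "pre-computed comparative-embedding information" that the abstract describes as input to the simpler model.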
