From 0 to 10 million annotated words: part-of-speech tagging for Middle High German

S. Schulz and N. Ketschik. Language Resources and Evaluation, 53 (4): 837-863 (2019)

Abstract

By building a part-of-speech (POS) tagger for Middle High German (MHG), we investigate strategies for dealing with a low-resource, diverse and non-standard language in natural language processing. We highlight various aspects such as the data quantity needed for training and the influence of data quality on tagger performance. Since the lack of annotated resources poses a problem for training a tagger, we exemplify how existing resources can be adapted fruitfully to serve as additional training data. The resulting POS model achieves a tagging accuracy of about 91% on a diverse test set representing the different genres, time periods and varieties of MHG.

Links and resources

BibTeX key: schulz2019million
entry type: article
year: 2019
journal: Language Resources and Evaluation
number: 4
pages: 837-863
volume: 53
url: http://dblp.uni-trier.de/db/journals/lre/lre53.html#SchulzK19

Meta data

Last update 4 years ago
Created 4 years ago

