
Recovering Patient Journeys: A Corpus of Biomedical Entities and Relations on Twitter (BEAR)

A. Wührl and R. Klinger.
Proceedings of the Language Resources and Evaluation Conference, pages 4439--4450. Marseille, France, European Language Resources Association (June 2022)

Abstract

Text mining and information extraction for the medical domain have focused on scientific text generated by researchers. However, their access to individual patient experiences or patient-doctor interactions is limited. On social media, doctors, patients and their relatives also discuss medical information. Individual information provided by laypeople complements the knowledge available in scientific text. It reflects the patient's journey, making the value of this type of data twofold: it offers direct access to people's perspectives, and it might cover information that is not available elsewhere, including self-treatment or self-diagnosis. Named entity recognition and relation extraction are methods to structure information that is available in unstructured text. However, existing medical social media corpora have focused on a comparably small set of entities and relations. In contrast, we provide rich annotation layers to model patients' experiences in detail. The corpus consists of medical tweets annotated with a fine-grained set of medical entities and relations between them, namely 14 entity classes (incl. environmental factors, diagnostics, biochemical processes, patients' quality-of-life descriptions, pathogens, medical conditions, and treatments) and 20 relation classes (incl. prevents, influences, interactions, causes). The dataset consists of 2,100 tweets with approx. 6,000 entities and 2,200 relations.

BibTeX key: whrl-klinger:2022:LREC
entry type: inproceedings
address: Marseille, France
booktitle: Proceedings of the Language Resources and Evaluation Conference
year: 2022
month: June
pages: 4439--4450
publisher: European Language Resources Association
pdf: http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.472.pdf
internaltype: conf
url: https://aclanthology.org/2022.lrec-1.472


Cite this publication

@inproceedings{whrl-klinger:2022:LREC,
  abstract = {Text mining and information extraction for the medical domain has focused on scientific text generated by researchers. However, their access to individual patient experiences or patient-doctor interactions is limited. On social media, doctors, patients and their relatives also discuss medical information. Individual information provided by laypeople complements the knowledge available in scientific text. It reflects the patient's journey making the value of this type of data twofold: It offers direct access to people's perspectives, and it might cover information that is not available elsewhere, including self-treatment or self-diagnose. Named entity recognition and relation extraction are methods to structure information that is available in unstructured text. However, existing medical social media corpora focused on a comparably small set of entities and relations. In contrast, we provide rich annotation layers to model patients' experiences in detail. The corpus consists of medical tweets annotated with a fine-grained set of medical entities and relations between them, namely 14 entity (incl. environmental factors, diagnostics, biochemical processes, patients' quality-of-life descriptions, pathogens, medical conditions, and treatments) and 20 relation classes (incl. prevents, influences, interactions, causes). The dataset consists of 2,100 tweets with approx. 6,000 entities and 2,200 relations.},
  added-at = {2022-12-23T13:22:01.000+0100},
  address = {Marseille, France},
  author = {W\"uhrl, Amelie and Klinger, Roman},
  biburl = {https://puma.ub.uni-stuttgart.de/bibtex/23ebd307721ec8cc7e1572c83ae85957c/dr.romanklinger},
  booktitle = {Proceedings of the Language Resources and Evaluation Conference},
  interhash = {9f08fe4d6f0980dc30ed2d3278614bfe},
  internaltype = {conf},
  intrahash = {3ebd307721ec8cc7e1572c83ae85957c},
  keywords = {imported myown},
  month = {June},
  pages = {4439--4450},
  pdf = {http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.472.pdf},
  publisher = {European Language Resources Association},
  timestamp = {2022-12-23T12:27:25.000+0100},
  title = {Recovering Patient Journeys: A Corpus of Biomedical Entities and Relations on Twitter (BEAR)},
  url = {https://aclanthology.org/2022.lrec-1.472},
  year = 2022
}
