CNVVE Dataset clean audio samples

R. Hedeshy, R. Menges, and S. Staab. Dataset, (2024). Related to: CNVVE: Dataset and Benchmark for Classifying Non-verbal Voice Expressions. R. Hedeshy, R. Menges, and S. Staab. Interspeech 2023, August 20-24, 2023. Dublin, Ireland, (2023). doi: 10.21437/Interspeech.2023-201.
DOI: 10.18419/darus-3898

Abstract

This CNVVE dataset contains clean audio samples covering six distinct classes of voice expressions: “Uh-huh” or “mm-hmm”, “Uh-uh” or “mm-mm”, “Hush” or “Shh”, “Psst”, “Ahem”, and continuous humming, e.g., “hmmm”. The audio samples of each class are in the respective folder. These samples have undergone a thorough cleaning process; the raw samples are published at https://doi.org/10.18419/darus-3897. First, we applied the Google WebRTC voice activity detection (VAD) algorithm to the audio files to remove noise or silence from the collected voice signals. The intensity was set to "2", on a scale from "1" to "3". However, because of variations in the data, some files required additional manual cleaning. These outliers, characterized by sharp click sounds (such as those occurring at the end of recordings), were cleaned by hand. The samples were recorded through a dedicated data-collection website that explains the purpose and type of voice data by providing participants with example recordings as well as the expressions’ written equivalents, e.g., “Uh-huh”. Audio recordings were automatically saved in the .wav format and kept anonymous, with a sampling rate of 48 kHz and a bit depth of 32 bits. For more information, please see the paper or contact the authors.
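
The frame-wise cleaning described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the decision function below is a simple energy gate used as a stand-in for the WebRTC VAD, and the 30 ms frame length, the 16-bit mono PCM input, the threshold, and all function names are assumptions made for the example. Only the 48 kHz sampling rate comes from the dataset description.

```python
import math
import struct
from typing import Callable, List

SAMPLE_RATE = 48_000   # Hz, as stated in the dataset description
FRAME_MS = 30          # assumed frame length for this sketch
BYTES_PER_SAMPLE = 2   # assumed 16-bit mono PCM (the dataset itself is 32-bit)


def split_frames(pcm: bytes, sample_rate: int = SAMPLE_RATE,
                 frame_ms: int = FRAME_MS) -> List[bytes]:
    """Cut raw PCM into equal-length frames, dropping a trailing partial frame."""
    frame_bytes = sample_rate * frame_ms // 1000 * BYTES_PER_SAMPLE
    return [pcm[i:i + frame_bytes]
            for i in range(0, len(pcm) - frame_bytes + 1, frame_bytes)]


def energy_gate(frame: bytes, threshold: float = 500.0) -> bool:
    """Hypothetical stand-in for a VAD decision: True if the frame RMS exceeds threshold."""
    samples = struct.unpack(f"<{len(frame) // BYTES_PER_SAMPLE}h", frame)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return rms > threshold


def keep_voiced(pcm: bytes,
                is_voiced: Callable[[bytes], bool] = energy_gate) -> bytes:
    """Concatenate only the frames that the decision function marks as voiced."""
    return b"".join(frame for frame in split_frames(pcm) if is_voiced(frame))
```

Because `keep_voiced` takes the decision function as a parameter, the energy gate could be swapped for a real VAD; for example, with the py-webrtcvad binding one might pass `lambda f: webrtcvad.Vad(2).is_speech(f, SAMPLE_RATE)`, though which implementation the authors used is not stated.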

Links and resources

BibTeX key: hedeshy2024cnvve
entry type: misc
year: 2024
howpublished: Dataset
affiliation: Hedeshy, Ramin/Universität Stuttgart, Menges, Raphael/Semanux, Staab, Steffen/Universität Stuttgart
orcid-numbers: Hedeshy, Ramin/0000-0001-5854-4033, Menges, Raphael/0000-0002-2112-7065, Staab, Steffen/0000-0002-0780-4154
DOI: 10.18419/darus-3898
note: Related to: CNVVE: Dataset and Benchmark for Classifying Non-verbal Voice Expressions. R. Hedeshy, R. Menges, and S. Staab. Interspeech 2023, August 20-24, 2023. Dublin, Ireland, (2023). doi: 10.21437/Interspeech.2023-201

Cite this publication

@misc{hedeshy2024cnvve,
  abstract = {This CNVVE Dataset contains clean audio samples encompassing six distinct classes of voice expressions, namely “Uh-huh” or “mm-hmm”, “Uh-uh” or “mm-mm”, “Hush” or “Shh”, “Psst”, “Ahem”, and Continuous humming, e.g., “hmmm.” Audio samples of each class are found in the respective folders. These audio samples have undergone a thorough cleaning process. The raw samples are published in https://doi.org/10.18419/darus-3897. Initially, we applied the Google WebRTC voice activity detection (VAD) algorithm on the given audio files to remove noise or silence from the collected voice signals. The intensity was set to "2", which could be a value between "1" and "3". However, because of variations in the data, some files required additional manual cleaning. These outliers, characterized by sharp click sounds (such as those occurring at the end of recordings), were addressed. The samples are recorded through a dedicated website for data collection that defines the purpose and type of voice data by providing example recordings to participants as well as the expressions’ written equivalent, e.g., “Uh-huh”. Audio recordings were automatically saved in the .wav format and kept anonymous, with a sampling rate of 48 kHz and a bit depth of 32 bits. For more info, please check the paper or feel free to contact the authors for any inquiries.},
  added-at = {2024-02-19T15:14:11.000+0100},
  affiliation = {Hedeshy, Ramin/Universität Stuttgart, Menges, Raphael/Semanux, Staab, Steffen/Universität Stuttgart},
  author = {Hedeshy, Ramin and Menges, Raphael and Staab, Steffen},
  biburl = {https://puma.ub.uni-stuttgart.de/bibtex/25b90dbde35048b600299bb33c9dc379c/unibiblio},
  doi = {10.18419/darus-3898},
  howpublished = {Dataset},
  interhash = {495829c90ab3af88e7162a6a4eba3295},
  intrahash = {5b90dbde35048b600299bb33c9dc379c},
  keywords = {darus ubs_10005 ubs_20008 ubs_30082 ubs_40488 unibibliografie},
  note = {Related to: CNVVE: Dataset and Benchmark for Classifying Non-verbal Voice Expressions. R. Hedeshy, R. Menges, and S. Staab. Interspeech 2023, August 20-24, 2023. Dublin, Ireland, (2023). doi: 10.21437/Interspeech.2023-201},
  orcid-numbers = {Hedeshy, Ramin/0000-0001-5854-4033, Menges, Raphael/0000-0002-2112-7065, Staab, Steffen/0000-0002-0780-4154},
  timestamp = {2024-02-19T15:14:11.000+0100},
  title = {CNVVE Dataset clean audio samples},
  year = 2024
}
