Classification of Speaking and Singing Voices Using Bioimpedance Measurements and Deep Learning

Donati, Eugenio ORCID: https://orcid.org/0000-0002-0048-1858, Chousidis, Christos, Ribeiro, Henrique De Melo and Russo, Nicola (2023) Classification of Speaking and Singing Voices Using Bioimpedance Measurements and Deep Learning. Journal of Voice, 2023. ISSN 0892-1997

PDF (PDF/A)
A.1-s2.0-S0892199723001200-main.pdf - Accepted Version
Available under License Creative Commons Attribution.

Official URL: https://www.sciencedirect.com/science/article/pii/...

Abstract

Speaking and singing are different phenomena with distinct characteristics. The classification and distinction of these voice acts is most widely approached using voice audio recordings captured with microphones. The use of audio recordings, however, can become challenging and computationally expensive due to the complexity of the voice signal. The research presented in this paper seeks to address this issue by implementing a deep learning classifier of speaking and singing voices based on bioimpedance measurements in place of audio recordings. In addition, the proposed research aims to develop real-time voice act classification for integration with voice-to-MIDI conversion. For these purposes, a system was designed, implemented, and tested using electroglottographic signals, Mel Frequency Cepstral Coefficients, and a deep neural network. The lack of datasets for training the model was tackled by creating a dedicated dataset of 7200 bioimpedance measurements of both singing and speaking. The use of bioimpedance measurements makes it possible to deliver high classification accuracy whilst keeping computational needs low for both preprocessing and classification. These characteristics, in turn, allow fast deployment of the system for near-real-time applications. After training, the system was extensively tested, achieving a testing accuracy of 92% to 94%.
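
The abstract names Mel Frequency Cepstral Coefficients as the features extracted from the electroglottographic signal before classification. The following is a minimal sketch of how MFCCs can be computed for a single signal frame, using only the Python standard library. It is not the authors' implementation: the sample rate, frame length, number of mel filters, number of coefficients, and the synthetic sine-wave frame standing in for a bioimpedance measurement are all assumptions made for illustration.

```python
import math

def hz_to_mel(f):
    # Common mel-scale formula (O'Shaughnessy variant).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum of a Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n)
                 for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, mapped to DFT bin indices.
    edges = [mel_to_hz(mel_max * i / (n_filters + 1))
             for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges]
    bank = []
    for m in range(1, n_filters + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * n_bins
        for k in range(left, centre):      # rising slope
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling slope, peak at centre
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs of one frame: power spectrum, mel filterbank, log, DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small constant avoids log(0) for empty or silent bands.
    log_energies = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_energies))
            for c in range(n_coeffs)]

# Hypothetical input: a 250 Hz sine standing in for one measurement frame.
SAMPLE_RATE = 8000  # assumed value, not taken from the paper
frame = [math.sin(2 * math.pi * 250 * i / SAMPLE_RATE) for i in range(256)]
coeffs = mfcc(frame, SAMPLE_RATE)
print(len(coeffs), "coefficients for this frame")
```

In a pipeline of the kind the abstract describes, coefficient vectors like `coeffs` would be computed per frame and passed to a neural network classifier. A practical system would typically replace the naive DFT with an FFT routine from a numerical library.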

Item Type:	Article
Identifier:	10.1016/j.jvoice.2023.03.018
Keywords:	Speech classification; Singing detection; Bioimpedance measurements; Electroglottography; EGG-to-MIDI; Voice-to-MIDI; Voice information retrieval; Real-time voice classification
Subjects:	Natural sciences
Depositing User:	Eugenio Donati
Date Deposited:	22 Sep 2023 09:19
Last Modified:	04 Nov 2024 11:15
URI:	https://repository.uwl.ac.uk/id/eprint/9974