Is Dynamic Time Warping of speech signals suitable for articulatory signal comparison using ultrasound tongue images?

Abstract
In speech technology, examining speaker dependency is vital: that is, whether methods developed for one speaker can be adapted to another. For text-to-speech synthesis, practical speaker adaptation methods are already available, but they cannot be applied directly to articulatory data (movement of the tongue, lips, etc., during speech production). In this research, we investigate this question and analyze the speaker dependency of articulatory movement, using the audio signal and ultrasound tongue imaging (UTI) recorded in parallel during speech production. For the comparison, we use the well-known Dynamic Time Warping (DTW) procedure of speech technology. DTW of the speech signal has already been applied successfully 1) with UTI, for within-speaker comparisons, 2) with electromagnetic articulography (EMA), for the analysis of inter-speaker differences, and 3) with EMA and electrocorticography (ECoG), also for inter-speaker comparisons. However, there has been no previous research on applying DTW of speech signals to ultrasound tongue images across different speakers. In the present research, we examine the applicability of DTW for comparing speakers' speech and articulatory data on a few Hungarian and English examples, and analyze the results visually. In the long term, we plan to use the results for speech-based brain-computer interfaces, so that we can supplement the brain signal with ultrasound-based articulation information.
- Title
- Is Dynamic Time Warping of speech signals suitable for articulatory signal comparison using ultrasound tongue images?
- Author
- Csapó, Tamás Gábor
- Date of issue
- 2023
- Access level
- Open access
- Copyright owner
- Author
- Conference title
- 1st Workshop on Intelligent Infocommunication Networks, Systems and Services (WI2NS2)
- Conference place
- Budapest
- Conference date
- 2023.02.07
- Language
- en
- Page
- 65 - 70
- Subject
- speech technology, dynamic time warping, signal processing
- Version
- Post print
- Identifiers
- DOI: 10.3311/WINS2023-012
- Title of the container document
- 1st Workshop on Intelligent Infocommunication Networks, Systems and Services
- ISBN, e-ISBN
- 978-963-421-902-6
- Document type
- Conference paper
- Document genre
- Conference article
- University
- Budapest University of Technology and Economics
- Faculty
- Faculty of Electrical Engineering and Informatics
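
The abstract describes computing DTW on the speech signal and using that alignment to compare articulatory data recorded in parallel. The following is a minimal sketch of that general idea in plain Python, not the paper's implementation. The function names `dtw_path` and `pair_frames`, the absolute-difference distance, and the assumption that each audio feature frame has exactly one synchronized ultrasound frame are all hypothetical choices made for illustration.

```python
from math import inf


def dtw_path(a, b, dist=lambda x, y: abs(x - y)):
    """Classic DTW: return (total_cost, warping_path) for sequences a and b.

    The path is a list of (i, j) index pairs, from (0, 0) to (len(a)-1, len(b)-1).
    The default distance (absolute difference of scalars) is an illustrative
    assumption; real speech features would need a vector distance.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end of both sequences to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (cost[i - 1][j], i - 1, j),          # advance in a only
            (cost[i][j - 1], i, j - 1),          # advance in b only
        ]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


def pair_frames(path, frames_a, frames_b):
    """Apply an audio-derived warping path to two parallel frame sequences.

    Assumes (hypothetically) that frames_a[k] and frames_b[k] are the
    ultrasound frames synchronized with audio frame k of each speaker.
    """
    return [(frames_a[i], frames_b[j]) for i, j in path]
```

In use, `dtw_path` would be run on per-frame audio features of the two speakers, and the resulting path passed to `pair_frames` together with the two ultrasound frame sequences, so that time-aligned tongue images can be compared side by side.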