Implementing a Text-to-Speech synthesis model on a Raspberry Pi for Industrial Applications

Abstract
Text-to-Speech (TTS) synthesis produces human-like speech from input text, and it has recently gained prominence through the use of deep neural networks. End-to-end TTS models now produce highly natural synthesized speech but require extremely high computational resources. Deploying such high-quality TTS models in real-time environments has been challenging because of the limited resources of embedded systems and cell phones. This paper demonstrates the implementation of an end-to-end TTS model (FastSpeech 2) on an embedded device (Raspberry Pi4 B+). The objective experimental results show that the TTS model runs on the Raspberry Pi, producing high-quality synthesized speech at an acceptable processing speed. Combined with a caching mechanism, the proposed model could be used in many real-life applications, such as railway announcements and industrial settings.
- Title
- Implementing a Text-to-Speech synthesis model on a Raspberry Pi for Industrial Applications
- Author
- Mandeel, Ali Raheem
- Aggar, Ammar Abdullah
- Al-Radhi, Mohammed Salah
- Csapó, Tamás Gábor
- Date of issue
- 2023
- Access level
- Open access
- Copyright owner
- Author
- Conference title
- 1st Workshop on Intelligent Infocommunication Networks, Systems and Services (WI2NS2)
- Conference place
- Budapest
- Conference date
- 2023.02.07
- Language
- en
- Page
- 77 - 81
- Subject
- Real-time system, speech synthesis, FastSpeech
- Version
- Post print
- Identifiers
- DOI: 10.3311/WINS2023-014
- Title of the container document
- 1st Workshop on Intelligent Infocommunication Networks, Systems and Services
- ISBN, e-ISBN
- 978-963-421-902-6
- Document type
- Conference publication
- Document genre
- Conference article
- University
- Budapest University of Technology and Economics
- Faculty
- Faculty of Electrical Engineering and Informatics
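
The abstract notes that the model becomes practical for applications such as railway announcements when paired with a caching mechanism. The record does not describe how that cache is built, so the following is only a minimal sketch of one possible approach: synthesized audio is stored on disk under a hash of the input text, and repeated announcements are served from the cache instead of being synthesized again. The class name `CachedTTS`, the `synthesize` callable, and the cache layout are all hypothetical and are not taken from the paper.

```python
import hashlib
from pathlib import Path
from typing import Callable


class CachedTTS:
    """Hypothetical disk cache wrapped around a slow TTS call."""

    def __init__(self, synthesize: Callable[[str], bytes], cache_dir: str):
        # `synthesize` is an assumed stand-in for the real model call
        # (for example a FastSpeech 2 pipeline returning WAV bytes).
        self.synthesize = synthesize
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path_for(self, text: str) -> Path:
        # Key each cache entry by a hash of the whitespace-trimmed text.
        key = hashlib.sha256(text.strip().encode("utf-8")).hexdigest()
        return self.cache_dir / f"{key}.wav"

    def speak(self, text: str) -> bytes:
        path = self._path_for(text)
        if path.exists():
            # Cache hit: skip synthesis entirely.
            return path.read_bytes()
        # Cache miss: run the model once and store the result.
        audio = self.synthesize(text)
        path.write_bytes(audio)
        return audio
```

With a fixed set of recurring announcements, only the first request for each sentence would pay the synthesis cost on the device; later requests reduce to a file read.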