Conformer-based Automatic Speech Recognition for Arabic Dialects
Subject (OSZKAR): Conformer-CTC; Character-based; Arabic language; Deep Learning
DOI: https://doi.org/10.3311/WINS2025-004
Abstract
Automatic Speech Recognition (ASR) has advanced considerably in recent years. This paper investigates an ASR system for the Arabic language built on a character-based Conformer-CTC model within the NeMo framework. The system combines the Conformer architecture with Connectionist Temporal Classification (CTC), which lets the model learn the mapping from audio frames to character sequences without frame-level alignments. Training is supervised, using labeled data that pairs input audio with its text transcript. The Mozilla Common Voice 11.0 dataset, which offers diverse spoken Arabic samples, is used for training. The paper details the training process, including configuration setup, data processing, and optimization strategies. It then evaluates the model's performance, offering insight into the effectiveness of, and the challenges facing, character-based Conformer-CTC models for Arabic speech recognition.
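To make the character-based CTC idea concrete, the following is a minimal sketch of greedy CTC decoding in plain Python. It is not the paper's code and does not use the NeMo API: the function name `ctc_greedy_decode`, the blank symbol `_`, the five-entry vocabulary, and the per-frame scores are all assumptions chosen for illustration. The sketch shows how per-frame character predictions are turned into text by merging consecutive repeats and then dropping the blank symbol.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; real toolkits use their own blank index


def ctc_greedy_decode(frame_scores, vocab, blank=BLANK):
    """Greedy CTC decoding over a character vocabulary.

    frame_scores: one list of scores per audio frame, one score per vocab entry.
    vocab: list of symbols, where one entry is the blank symbol.
    """
    # 1. Pick the highest-scoring symbol for every frame.
    best = [vocab[max(range(len(row)), key=row.__getitem__)] for row in frame_scores]
    # 2. Merge runs of the same symbol into a single occurrence.
    collapsed = [symbol for symbol, _ in groupby(best)]
    # 3. Remove blanks, which only separate repeats and fill silence.
    return "".join(symbol for symbol in collapsed if symbol != blank)


def one_hot_like(indices, size):
    """Build toy score rows that peak at the given vocabulary indices."""
    return [[0.9 if i == k else 0.025 for i in range(size)] for k in indices]


# Toy character vocabulary (assumed for this example only).
vocab = [BLANK, "س", "ل", "ا", "م"]

# Eight frames whose best symbols are: س س _ ل ا ا _ م
frames = one_hot_like([1, 1, 0, 2, 3, 3, 0, 4], len(vocab))
decoded = ctc_greedy_decode(frames, vocab)
print(decoded)
```

The blank symbol is what allows a genuinely doubled character to survive decoding: two frames of the same letter separated by a blank yield two letters, while two adjacent frames of the same letter yield one.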