Műegyetemi Digitális Archívum

Conformer-based Automatic Speech Recognition for Arabic Dialects

Type

Book chapter

Language

en

Reading access rights

Open access

Rights Holder

Author

Conference Date

2025-02-03

Conference Place

Budapest

Conference Title

3rd Workshop on Intelligent Infocommunication Networks, Systems and Services

ISBN, e-ISBN

978-963-421-982-8

Container Title

3rd Workshop on Intelligent Infocommunication Networks, Systems and Services

Version

Post print

Faculty

Faculty of Electrical Engineering and Informatics

First Page

21

Subject (OSZKAR)

ASR
Conformer-CTC
Character-based
Arabic language
Deep Learning

Genre

Conference paper

University

Budapest University of Technology and Economics

Abstract

Automatic Speech Recognition (ASR) has advanced considerably in recent years. This paper investigates an ASR system for the Arabic language, built on the character-based Conformer-CTC model within the NeMo framework. The system relies on recent deep learning techniques, in particular the Conformer architecture combined with Connectionist Temporal Classification (CTC) for sequence-to-sequence learning. Training is supervised: labeled data is used to map input audio to text. The Mozilla Common Voice 11.0 dataset, which offers diverse spoken Arabic samples, is used for training. The paper details the model training process, including configuration setup, data processing, and optimization strategies. The model's performance is then evaluated, offering insight into the challenges and effectiveness of the character-based Conformer-CTC model for Arabic speech recognition.
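To illustrate how a character-based CTC output can be turned into text, the following is a minimal sketch of greedy CTC decoding in plain Python. It is not taken from the paper or from NeMo: the function names, the toy five-symbol vocabulary, the blank index, and the one-hot frame scores are all assumptions chosen for illustration.

```python
from itertools import groupby


def one_hot(index, size):
    """Hypothetical helper: a frame whose score mass sits on one symbol."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def ctc_greedy_decode(frame_scores, vocab, blank_index=0):
    """Greedy CTC decoding: per-frame argmax, merge repeats, drop blanks."""
    # 1. Pick the highest-scoring symbol index in every frame.
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    # 2. Collapse runs of the same index into a single occurrence.
    collapsed = [index for index, _ in groupby(best)]
    # 3. Remove the blank symbol and map the remaining indices to characters.
    return "".join(vocab[i] for i in collapsed if i != blank_index)


# Assumed toy vocabulary: index 0 is the CTC blank, the rest are Arabic letters.
VOCAB = ["<blank>", "س", "ل", "ا", "م"]

# Assumed per-frame best symbols: س س <blank> ل ا ا <blank> م
FRAMES = [one_hot(i, len(VOCAB)) for i in (1, 1, 0, 2, 3, 3, 0, 4)]

if __name__ == "__main__":
    print(ctc_greedy_decode(FRAMES, VOCAB))
```

The blank symbol is what lets CTC emit a genuinely doubled letter: two identical characters separated by a blank frame survive the merge step, while adjacent repeats without a blank are collapsed into one.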
