Műegyetemi Digitális Archívum

Model-centric data selection: Refining end-to-end speech recognition

Date

Type

Conference paper

Language

en

Reading access rights:

Open access

Rights Holder

Author

Conference Date

2024-02-05

Conference Place

Budapest

Conference Title

2nd Workshop on Intelligent Infocommunication Networks, Systems and Services (WI2NS2)

ISBN, e-ISBN

978-963-421-944-6

Container Title

2nd Workshop on Intelligent Infocommunication Networks, Systems and Services

Version

Post print

Faculty

Faculty of Electrical Engineering and Informatics

First Page

1

Subject (OSZKAR)

Data selection
Dataset Optimization
Machine learning
Automatic Speech Recognition

Genre

Conference article

University

Budapest University of Technology and Economics

OOC works

Abstract

Data selection can be an important step in pre-processing datasets for Automatic Speech Recognition (ASR), yet it is not widely applied. To handle potential labeling errors and other anomalies in the dataset, we introduced a simple model-centric speech data selection strategy. It discards the samples in the dataset that are difficult for the model to recognize, and uses the resulting restricted dataset to retrain the model. This technique improved the recognition accuracy of Hungarian ASR on both the BEA-Base and Common Voice (CV) datasets, using the Conformer model architecture. The proposed approach achieved a consistent relative improvement in both Character Error Rate (CER) and Word Error Rate (WER), of up to 3% and 2.5%, respectively.
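The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how "difficult to recognize" is measured, so the sketch assumes a per-utterance CER threshold; the names `select_samples`, `transcribe`, and `max_cer`, as well as the threshold value, are hypothetical and not taken from the paper.

```python
from typing import Callable, List, Tuple


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate: edit distance normalized by reference length."""
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


def select_samples(
    samples: List[Tuple[str, str]],
    transcribe: Callable[[str], str],
    max_cer: float = 0.5,  # assumed threshold, not a value from the paper
) -> List[Tuple[str, str]]:
    """Keep (utterance_id, reference) pairs the model recognizes well enough.

    `transcribe` stands in for the trained ASR model: it maps an utterance
    id to the model's hypothesis text.
    """
    return [
        (utt_id, ref)
        for utt_id, ref in samples
        if cer(ref, transcribe(utt_id)) <= max_cer
    ]
```

Under these assumptions, the retained list would serve as the restricted training set for retraining; the model passed in as `transcribe` would be the one trained on the full dataset.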
