Műegyetemi Digitális Archívum

Model-centric data selection: Refining end-to-end speech recognition

Kedalai, Meng
Meng, Yan
Mihajlik, Péter
2024-02-26T15:41:54Z
2024-02-26T15:41:54Z
2024

Abstract

Data selection can be an important step in pre-processing datasets for Automatic Speech Recognition (ASR), yet it is not widely applied. To handle potential labeling errors and other anomalies in the dataset, we introduce a simple model-centric speech data selection strategy. It discards the samples in the dataset that are difficult for the model to recognize and uses the resulting restricted dataset to retrain the model. This technique improved the recognition accuracy of Hungarian ASR on both the BEA-Base and Common Voice (CV) datasets using the Conformer model architecture. The proposed approach achieved a consistent relative improvement in both Character Error Rate (CER) and Word Error Rate (WER), of up to 3% and 2.5%, respectively.
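
The selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how "difficult to recognize" is measured, so the per-utterance CER criterion, the `max_cer` threshold, and the `transcribe` callback (standing in for an already trained ASR model) are all assumptions introduced here.

```python
from typing import Callable, List, Tuple


def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate of a hypothesis against its reference."""
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


def select_samples(
    samples: List[Tuple[str, str]],
    transcribe: Callable[[str], str],
    max_cer: float = 0.5,  # hypothetical threshold, not taken from the paper
) -> List[Tuple[str, str]]:
    """Keep (utterance_id, reference) pairs the model recognizes well enough.

    `transcribe` is a hypothetical stand-in for decoding an utterance with
    the model trained on the full dataset.
    """
    kept = []
    for utt_id, ref in samples:
        if cer(ref, transcribe(utt_id)) <= max_cer:
            kept.append((utt_id, ref))
    return kept
```

Under these assumptions, the model would then be retrained on the list returned by `select_samples`, and its CER and WER compared with those of the model trained on the full dataset.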

http://hdl.handle.net/10890/54980
en
Conference paper
Open access
Author
2024-02-05
Budapest
2nd Workshop on Intelligent Infocommunication Networks, Systems and Services (WI2NS2)
2024-02-05
978-963-421-944-6
Budapest University of Technology and Economics
Online
2nd Workshop on Intelligent Infocommunication Networks, Systems and Services
Post print
Faculty of Electrical Engineering and Informatics
1
10.3311/WINS2024-001
5
Data selection
Dataset Optimization
Machine learning
Automatic Speech Recognition
Conference article
Budapest University of Technology and Economics

Files

Original bundle

Name:
WINS_01_d.pdf
Size:
112.64 KB
Format:
Adobe Portable Document Format
Description:
WINS_01_d.pdf