Publication result detail
RANGAPPA, P.; CAROFILIS, A.; PRAKASH, J.; KUMAR, S.; BURDISSO, S.; MADIKERI, S.; VILLATORO-TELLO, E.; SHARMA, B.; MOTLÍČEK, P.; HACIOGLU, K.; VENKATESAN, S.; VYAS, S.; STOLCKE, A.
Original title
Efficient Data Selection for Domain Adaptation of ASR Using Pseudo-Labels and Multi-Stage Filtering
English title
Type
Paper in proceedings indexed in WoS or Scopus
Original abstract
Fine-tuning pretrained ASR models for specific domains is challenging for small organizations with limited labeled data and computational resources. Here we explore different data selection pipelines and propose a robust approach that improves ASR adaptation by filtering pseudo-labels generated using Whisper (encoder-decoder) and Zipformer (transducer) models. Our approach integrates multiple selection strategies, including word error rate (WER) prediction, named entity recognition (NER), and character error rate (CER) analysis, to extract high-quality training segments. We evaluate our method on Whisper and Zipformer using a 7500-hour baseline, comparing it to a CER-based approach relying on hypotheses from three ASR systems. Fine-tuning on 7500 hours of pseudo-labeled call center data achieves 12.3% WER, while our filtering reduces the dataset to 100 hours (1.4%) with similar performance; a similar trend is observed on Fisher English.
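The abstract mentions a CER-based comparison that relies on hypotheses from several ASR systems. As a rough illustration of that general idea only, the following minimal sketch keeps a segment when two hypotheses for it agree closely at the character level. It is not the authors' implementation: the `Segment` class, the `select_segments` function, the use of two hypotheses instead of three, and the `max_cer=0.1` threshold are all assumptions made for this example.

```python
from dataclasses import dataclass


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate of hyp, with ref treated as the reference."""
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


@dataclass
class Segment:
    # Hypothetical container; field names are assumptions for this sketch.
    segment_id: str
    duration_s: float
    hyp_a: str  # pseudo-label from a first ASR system
    hyp_b: str  # hypothesis from a second ASR system


def select_segments(segments, max_cer=0.1):
    """Keep segments whose two hypotheses disagree by at most max_cer (assumed threshold)."""
    return [s for s in segments if cer(s.hyp_a, s.hyp_b) <= max_cer]


segments = [
    Segment("s1", 3.0, "hello world", "hello world"),
    Segment("s2", 2.0, "hello world", "yellow word"),
]
selected = select_segments(segments)
```

In this toy input the two hypotheses for `s1` are identical, so it is kept, while `s2` is dropped because its hypotheses differ by three character edits over eleven characters. A real pipeline would likely normalize text first and tune the threshold on held-out data.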
English abstract
Keywords
speech recognition, data selection, whisper, zip-formers
Keywords in English
Authors
RIV year
2026
Published
17.08.2025
Publisher
Isca-Int Speech Communication Assoc
Place
Rotterdam, The Netherlands
Book
Interspeech
Periodical
Country
French Republic
Pages from
4928
Pages to
4932
Page count
5
URL
https://www.fit.vut.cz/research/group/speech/public/publi/2025/rangappa_INTERSPEECH_2025_co-author_Motlicek.pdf
BibTeX
@inproceedings{BUT201433,
  author="P. {Rangappa} and A. {Carofilis} and J. {Prakash} and S. {Kumar} and S. {Burdisso} and S. {Madikeri} and E. {Villatoro-Tello} and B. {Sharma} and Petr {Motlíček} and K. {Hacioglu} and S. {Venkatesan} and S. {Vyas} and A. {Stolcke}",
  title="Efficient Data Selection for Domain Adaptation of ASR Using Pseudo-Labels and Multi-Stage Filtering",
  booktitle="Interspeech",
  year="2025",
  journal="Interspeech",
  pages="4928--4932",
  publisher="Isca-Int Speech Communication Assoc",
  address="Rotterdam, The Netherlands",
  doi="10.21437/Interspeech.2025-2580",
  url="https://www.fit.vut.cz/research/group/speech/public/publi/2025/rangappa_INTERSPEECH_2025_co-author_Motlicek.pdf"
}
Documents
rangappa_INTERSPEECH_2025_co-author_Motlicek