Publication result detail
BENEŠ, K.; BURGET, L.
Original title
Text Augmentation for Language Models in High Error Recognition Scenario
English title
Type
Paper in proceedings indexed in WoS or Scopus
Original abstract
In this paper, we explore several data augmentation strategies for training of language models for speech recognition. We compare augmentation based on global error statistics with one based on unigram statistics of ASR errors and with label smoothing and its sampled variant. Additionally, we investigate the stability and the predictive power of perplexity estimated on augmented data. Despite being trivial, augmentation driven by global substitution, deletion and insertion rates achieves the best rescoring results. On the other hand, even though the associated perplexity measure is stable, it gives no better prediction of the final error rate than the vanilla one. Our best augmentation scheme increases the WER improvement from second-pass rescoring from 1.1% to 1.9% absolute on the CHiMe-6 challenge.
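The simplest scheme named in the abstract, augmentation driven by global substitution, deletion and insertion rates, can be illustrated with a minimal sketch. The function name `augment_with_global_rates`, the example rates, and the uniform sampling of replacement and inserted words are assumptions made for illustration only; this record does not state how the authors sample words or choose the rates.

```python
import random


def augment_with_global_rates(tokens, vocab, p_sub, p_del, p_ins, rng):
    """Corrupt a token sequence to imitate ASR errors.

    Each input token is deleted with probability p_del, replaced with
    probability p_sub, and otherwise kept. After every input position,
    an extra word is inserted with probability p_ins.
    Replacement and inserted words are drawn uniformly from vocab
    (an assumption of this sketch, not a detail taken from the paper).
    """
    augmented = []
    for token in tokens:
        r = rng.random()
        if r < p_del:
            pass  # deletion: the token is dropped
        elif r < p_del + p_sub:
            augmented.append(rng.choice(vocab))  # substitution
        else:
            augmented.append(token)  # token kept unchanged
        if rng.random() < p_ins:
            augmented.append(rng.choice(vocab))  # insertion
    return augmented


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that runs are repeatable
    sentence = "we explore several data augmentation strategies".split()
    vocab = sorted(set(sentence))  # toy vocabulary, hypothetical
    # Hypothetical rates; in practice they would be estimated from ASR output.
    print(augment_with_global_rates(sentence, vocab, 0.2, 0.1, 0.05, rng))
```

Language model training text corrupted this way would resemble the noisy hypotheses seen during second-pass rescoring, which is the motivation the abstract gives for the augmentation.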
English abstract
Keywords
data augmentation, error simulation, language modeling, automatic speech recognition
Keywords in English
Authors
RIV year
2022
Published
30.08.2021
Publisher
International Speech Communication Association
Place
Brno
Book
Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
ISSN
1990-9772
Periodical
Proceedings of Interspeech
Volume
2021
Number
8
Country
France
Pages from
1872
Pages to
1876
Number of pages
5
URL
https://www.isca-speech.org/archive/interspeech_2021/benes21_interspeech.html
BibTex
@inproceedings{BUT175841,
  author="Karel {Beneš} and Lukáš {Burget}",
  title="Text Augmentation for Language Models in High Error Recognition Scenario",
  booktitle="Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
  year="2021",
  journal="Proceedings of Interspeech",
  volume="2021",
  number="8",
  pages="1872--1876",
  publisher="International Speech Communication Association",
  address="Brno",
  doi="10.21437/Interspeech.2021-627",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/interspeech_2021/benes21_interspeech.html"
}
Documents
benes21_interspeech