R&D result detail

Original title

Text Augmentation for Language Models in High Error Recognition Scenario

English title

Text Augmentation for Language Models in High Error Recognition Scenario

Type

Paper in proceedings indexed in WoS or Scopus

Original abstract

In this paper, we explore several data augmentation strategies for training of language models for speech recognition. We compare augmentation based on global error statistics with one based on unigram statistics of ASR errors and with label smoothing and its sampled variant. Additionally, we investigate the stability and the predictive power of perplexity estimated on augmented data. Despite being trivial, augmentation driven by global substitution, deletion and insertion rates achieves the best rescoring results. On the other hand, even though the associated perplexity measure is stable, it gives no better prediction of the final error rate than the vanilla one. Our best augmentation scheme increases the WER improvement from second-pass rescoring from 1.1% to 1.9% absolute on the CHiMe-6 challenge.

English abstract

In this paper, we explore several data augmentation strategies for training of language models for speech recognition. We compare augmentation based on global error statistics with one based on unigram statistics of ASR errors and with label smoothing and its sampled variant. Additionally, we investigate the stability and the predictive power of perplexity estimated on augmented data. Despite being trivial, augmentation driven by global substitution, deletion and insertion rates achieves the best rescoring results. On the other hand, even though the associated perplexity measure is stable, it gives no better prediction of the final error rate than the vanilla one. Our best augmentation scheme increases the WER improvement from second-pass rescoring from 1.1% to 1.9% absolute on the CHiMe-6 challenge.
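
The augmentation driven by global substitution, deletion and insertion rates mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `augment`, the uniform sampling of replacement and inserted words from `vocab`, the order in which the three error types are applied, and the example rates are all assumptions made for illustration only.

```python
import random


def augment(tokens, vocab, p_sub, p_del, p_ins, rng=None):
    """Corrupt a token sequence with simulated recognition errors.

    p_sub, p_del, p_ins are global per-token error rates (assumed to be
    estimated elsewhere, e.g. from ASR output aligned with references).
    Requires p_sub + p_del <= 1.
    """
    rng = rng or random.Random()
    out = []
    for tok in tokens:
        r = rng.random()
        if r < p_del:
            pass  # deletion: the token is dropped
        elif r < p_del + p_sub:
            # substitution: uniform draw from vocab (an assumption;
            # the draw may happen to return the original word)
            out.append(rng.choice(vocab))
        else:
            out.append(tok)  # token kept unchanged
        if rng.random() < p_ins:
            # insertion: an extra uniformly drawn word after this position
            out.append(rng.choice(vocab))
    return out


if __name__ == "__main__":
    vocab = ["the", "a", "cat", "dog", "sat", "mat", "on"]
    sentence = "the cat sat on the mat".split()
    # Illustrative rates only, not values from the paper.
    print(augment(sentence, vocab, p_sub=0.2, p_del=0.1, p_ins=0.05,
                  rng=random.Random(0)))
```

Under these assumptions, a language model would be trained on the corrupted sequences so that the text it sees in training resembles error-prone first-pass hypotheses more closely than clean transcripts do.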

Keywords

data augmentation, error simulation, language modeling, automatic speech recognition

Keywords in English

data augmentation, error simulation, language modeling, automatic speech recognition

Authors

BENEŠ, K.; BURGET, L.

RIV year

2022

Published

30 August 2021

Publisher

International Speech Communication Association

Place

Brno

Book

Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH

ISSN

1990-9772

Periodical

Proceedings of Interspeech

Volume

2021

Issue

8

Country

France

Pages from

1872

Pages to

1876

Number of pages

5

URL

https://www.isca-speech.org/archive/interspeech_2021/benes21_interspeech.html

BibTeX

@inproceedings{BUT175841,
  author="Karel {Beneš} and Lukáš {Burget}",
  title="Text Augmentation for Language Models in High Error Recognition Scenario",
  booktitle="Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH",
  year="2021",
  journal="Proceedings of Interspeech",
  volume="2021",
  number="8",
  pages="1872--1876",
  publisher="International Speech Communication Association",
  address="Brno",
  doi="10.21437/Interspeech.2021-627",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/interspeech_2021/benes21_interspeech.html"
}

Documents

benes21_interspeech
