Publication detail

BUT system for low resource Indian language ASR

PULUGUNDLA, B.; BASKAR, M.; KESIRAJU, S.; EGOROVA, E.; KARAFIÁT, M.; BURGET, L.; ČERNOCKÝ, J.

Original Title

BUT system for low resource Indian language ASR

Type

conference paper

Language

English

Original Abstract

This paper describes the BUT Jilebi team's speech recognition systems created for the 2018 low resource speech recognition challenge for Indian languages. We investigate modifications of multilingual time-delay neural network (TDNN) architectures with transfer learning and compare them to bi-directional residual memory networks (BRMN) and bi-directional LSTM. Our best submission based on system combination achieved word error rates of 13.92% (Tamil), 14.71% (Telugu) and 14.06% (Gujarati). We present the details of submitted systems and also the post-evaluation analysis done for lexicon discovery using unsupervised word segmentation.

Keywords

Indian languages, low resource ASR, multilingual, LF-MMI

Authors

PULUGUNDLA, B.; BASKAR, M.; KESIRAJU, S.; EGOROVA, E.; KARAFIÁT, M.; BURGET, L.; ČERNOCKÝ, J.

Released

2. 9. 2018

Publisher

International Speech Communication Association

Location

Hyderabad

ISSN

1990-9772

Periodical

Proceedings of Interspeech

Year of study

2018

Number

9

State

French Republic

Pages from

3182

Pages to

3186

Pages count

5

URL

https://www.isca-speech.org/archive/Interspeech_2018/abstracts/1302.html

BibTeX

@inproceedings{BUT155101,
  author="Bhargav {Pulugundla} and Murali Karthick {Baskar} and Santosh {Kesiraju} and Ekaterina {Egorova} and Martin {Karafiát} and Lukáš {Burget} and Jan {Černocký}",
  title="BUT system for low resource Indian language ASR",
  booktitle="Proceedings of Interspeech 2018",
  year="2018",
  journal="Proceedings of Interspeech",
  volume="2018",
  number="9",
  pages="3182--3186",
  publisher="International Speech Communication Association",
  address="Hyderabad",
  doi="10.21437/Interspeech.2018-1302",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/Interspeech_2018/abstracts/1302.html"
}