Publication result detail

Boosting of Contextual Information in ASR for Air-Traffic Call-Sign Recognition

KOCOUR, M.; VESELÝ, K.; BLATT, A.; ZULUAGA-GOMEZ, J.; SZŐKE, I.; ČERNOCKÝ, J.; KLAKOW, D.; MOTLÍČEK, P.

Original title

Boosting of Contextual Information in ASR for Air-Traffic Call-Sign Recognition

Type

Paper in proceedings indexed in the WoS or Scopus database

Original abstract

Contextual adaptation of ASR can be very beneficial for multi-accent and often noisy Air-Traffic Control (ATC) speech. Our focus is call-sign recognition, which can be used to track conversations of ATC operators with individual airplanes. We developed a two-stage boosting strategy, consisting of HCLG boosting and Lattice boosting. Both are implemented as WFST compositions and the contextual information is specific to each utterance. In HCLG boosting we give score discounts to individual words, while in Lattice boosting the score discounts are given to word sequences. The context data have origin in surveillance database of OpenSky Network. From this, we obtain lists of call-signs that are made more likely to appear in the best hypothesis of ASR. This also improves the accuracy of the NLU module that recognizes the call-signs from the best hypothesis of ASR. As part of ATCO2 project, we collected liveatc test set2. The boosting of call-signs leads to 4.7% absolute WER improvement and 27.1% absolute increase of Call-Sign recognition Accuracy (CSA). Our best result of 82.9% CSA is quite good, given that the data is noisy, and WER 28.4% is relatively high. We believe there is still room for improvement.
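
As a rough illustration of the second stage described in the abstract, the sketch below gives a score discount to every hypothesis that contains a boosted word sequence. It is a simplified n-best rescoring analogue, not the WFST composition used in the paper. The function names, the discount value, the costs, and the example call-signs are all hypothetical.

```python
from typing import List, Tuple

Hypothesis = Tuple[str, float]  # (transcript, cost); a lower cost is better


def contains_sequence(words: List[str], seq: List[str]) -> bool:
    """Return True if `seq` occurs as a contiguous run inside `words`."""
    n = len(seq)
    if n == 0:
        return False
    return any(words[i:i + n] == seq for i in range(len(words) - n + 1))


def boost_nbest(nbest: List[Hypothesis],
                callsigns: List[str],
                discount: float = 2.0) -> List[Hypothesis]:
    """Subtract `discount` from a hypothesis cost once per matched call-sign.

    Assumption: the call-sign list is specific to the utterance, as in the
    abstract; here it is just a hand-written list of word sequences.
    """
    rescored = []
    for text, cost in nbest:
        words = text.split()
        hits = sum(contains_sequence(words, cs.split()) for cs in callsigns)
        rescored.append((text, cost - discount * hits))
    # Re-rank so that the cheapest (best) hypothesis comes first.
    return sorted(rescored, key=lambda h: h[1])


# Hypothetical n-best list and utterance-specific call-sign list.
nbest = [
    ("lufthansa three two bravo descend", 10.0),
    ("lufthansa three two alpha descend", 10.5),
]
callsigns = ["lufthansa three two alpha", "ryanair one seven"]

best = boost_nbest(nbest, callsigns)[0]
print(best)
```

In this toy case the second hypothesis overtakes the first because only it contains a call-sign from the context list. A real system would apply the discounts to lattice paths rather than to a flat list of transcripts.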

Keywords

Air Traffic Control, Automatic Speech Recognition, Contextual Adaptation, Call-sign Recognition, Call-sign Detection, OpenSky Network

Authors

KOCOUR, M.; VESELÝ, K.; BLATT, A.; ZULUAGA-GOMEZ, J.; SZŐKE, I.; ČERNOCKÝ, J.; KLAKOW, D.; MOTLÍČEK, P.

RIV year

2022

Published

30.08.2021

Publisher

International Speech Communication Association

Place

Brno

Book

Proceedings Interspeech 2021

ISSN

1990-9772

Periodical

Proceedings of Interspeech

Volume

2021

Number

8

Country

France

Pages from

3301

Pages to

3305

Page count

5

URL

BibTex

@inproceedings{BUT175845,
  author="KOCOUR, M. and VESELÝ, K. and BLATT, A. and ZULUAGA-GOMEZ, J. and SZŐKE, I. and ČERNOCKÝ, J. and KLAKOW, D. and MOTLÍČEK, P.",
  title="Boosting of Contextual Information in ASR for Air-Traffic Call-Sign Recognition",
  booktitle="Proceedings Interspeech 2021",
  year="2021",
  journal="Proceedings of Interspeech",
  volume="2021",
  number="8",
  pages="3301--3305",
  publisher="International Speech Communication Association",
  address="Brno",
  doi="10.21437/Interspeech.2021-1619",
  issn="1990-9772",
  url="https://www.isca-speech.org/archive/interspeech_2021/kocour21_interspeech.html"
}
