Project detail

DARPA Robust Automatic Transcription of Speech (RATS) - RATS Patrol I

Duration: 23.9.2010 — 30.6.2014

Funding resources

Non-public sector - Direct contracts - contract research, non-public sources

On the project

Existing speech signal processing technologies are inadequate for most noisy or degraded speech signals that are important to military intelligence. The Robust Automatic Transcription of Speech (RATS) program is creating algorithms and software for performing the following tasks on potentially speech-containing signals received over communication channels that are extremely noisy and/or highly distorted: Speech Activity Detection, Language Identification, Speaker Identification and Key Word Spotting.

Keywords
speech recognition, speaker recognition, language recognition, keyword spotting, robustness, noise, transmission channels

Default language

English

People responsible

Matějka Pavel, Ing., Ph.D. - principal person responsible

Units

Department of Computer Graphics and Multimedia
- responsible department (10.5.2011 - not assigned)
Speech Data Mining Research Group BUT Speech@FIT
- internal (10.5.2011 - 30.6.2014)
Raytheon BBN Technologies Corp
- client (10.5.2011 - 30.6.2014)
Department of Computer Graphics and Multimedia
- beneficiary (10.5.2011 - 30.6.2014)

Results

GLEMBEK, O.; MA, J.; MATĚJKA, P.; ZHANG, B.; PLCHOT, O.; BURGET, L.; MATSOUKAS, S. Domain adaptation via within-class covariance correction in I-vector based speaker recognition systems. In Proceedings of ICASSP 2014. Florence: IEEE Signal Processing Society, 2014. p. 4060-4064. ISBN: 978-1-4799-2892-7.

MARTÍNEZ GONZÁLEZ, D.; PLCHOT, O.; BURGET, L.; GLEMBEK, O.; MATĚJKA, P. Language Recognition in iVectors Space. In Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. no. 8, p. 861-864. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.

MATĚJKA, P.; ZHANG, L.; NG, T.; MALLIDI, S.; GLEMBEK, O.; MA, J.; ZHANG, B. Neural Network Bottleneck Features for Language Identification. In Proceedings of Odyssey 2014. Proceedings of Odyssey: The Speaker and Language Recognition Workshop Odyssey 2014, Joensuu, Finland. Joensuu: International Speech Communication Association, 2014. no. 6, p. 299-304. ISSN: 2312-2846.

PLCHOT, O.; MATSOUKAS, S.; MATĚJKA, P.; DEHAK, N.; MA, J.; CUMANI, S.; GLEMBEK, O.; HEŘMANSKÝ, H.; MESGARANI, N.; SOUFIFAR, M.; THOMAS, S.; ZHANG, B.; ZHOU, X. Developing A Speaker Identification System For The DARPA RATS Project. Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013. p. 6768-6772. ISBN: 978-1-4799-0355-9.

CUMANI, S.; BRUMMER, J.; BURGET, L.; LAFACE, P.; PLCHOT, O.; VASILAKAKIS, V. Pairwise Discriminative Speaker Verification in the I-Vector Space. IEEE Transactions on Audio Speech and Language Processing, 2013, vol. 21, no. 6, p. 1217-1227. ISSN: 1558-7916.

MATĚJKA, P.; PLCHOT, O.; SOUFIFAR, M.; GLEMBEK, O.; D'HARO, L.; VESELÝ, K.; GRÉZL, F.; MA, J.; MATSOUKAS, S.; DEHAK, N. Patrol Team Language Identification System for DARPA RATS P1 Evaluation. Proceedings of Interspeech 2012. Proceedings of Interspeech. Portland, Oregon: International Speech Communication Association, 2012. no. 9, p. 1-4. ISBN: 978-1-62276-759-5. ISSN: 1990-9772.

SOUFIFAR, M.; KOCKMANN, M.; BURGET, L.; PLCHOT, O.; GLEMBEK, O.; SVENDSEN, T. iVector Approach to Phonotactic Language Recognition. In Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. no. 8, p. 2913-2916. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.

PLCHOT, O.; MATĚJKA, P.; SILNOVA, A.; NOVOTNÝ, O.; DIEZ SÁNCHEZ, M.; ROHDIN, J.; GLEMBEK, O.; BRÜMMER, N.; SWART, A.; PRIETO, J.; GARCIA PERERA, L.; BUERA, L.; KENNY, P.; ALAM, J.; BHATTACHARYA, G. Analysis and Description of ABC Submission to NIST SRE 2016. In Proceedings of Interspeech 2017. Proceedings of Interspeech. Stockholm: International Speech Communication Association, 2017. no. 08, p. 1348-1352. ISSN: 1990-9772.

BAHARI, M.; DEHAK, N.; VAN HAMME, H.; BURGET, L.; ALI, A.; GLASS, J. Non-Negative Factor Analysis of Gaussian Mixture Model Weight Adaptation for Language and Dialect Recognition. IEEE-ACM Transactions on Audio Speech and Language Processing, 2014, vol. 22, no. 7, p. 1117-1129. ISSN: 2329-9290.

NG, T.; ZHANG, B.; NGUYEN, L.; MATSOUKAS, S.; ZHOU, X.; MESGARANI, N.; VESELÝ, K.; MATĚJKA, P. Developing a Speech Activity Detection System for the DARPA RATS Program. Proceedings of Interspeech 2012. Proceedings of Interspeech. Portland, Oregon: International Speech Communication Association, 2012. no. 9, p. 1-4. ISBN: 978-1-62276-759-5. ISSN: 1990-9772.

BRUMMER, J.; CUMANI, S.; GLEMBEK, O.; KARAFIÁT, M.; MATĚJKA, P.; PEŠÁN, J.; PLCHOT, O.; SOUFIFAR, M.; DE VILLIERS, E.; ČERNOCKÝ, J. Description and analysis of the Brno276 system for LRE2011. In Proceedings of Odyssey 2012: The Speaker and Language Recognition Workshop. Singapore: International Speech Communication Association, 2012. p. 216-223. ISBN: 978-981-07-3093-2.

D'HARO, L.; GLEMBEK, O.; PLCHOT, O.; MATĚJKA, P.; SOUFIFAR, M.; CORDOBA, R.; ČERNOCKÝ, J. Phonotactic Language Recognition using i-vectors and Phoneme Posteriogram Counts. Proceedings of Interspeech 2012. Proceedings of Interspeech. Portland, Oregon: International Speech Communication Association, 2012. no. 9, p. 1-4. ISBN: 978-1-62276-759-5. ISSN: 1990-9772.

CUMANI, S.; PLCHOT, O.; LAFACE, P. Probabilistic Linear Discriminant Analysis Of I-Vector Posterior Distributions. Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013. p. 7644-7648. ISBN: 978-1-4799-0355-9.

LEI, Y.; BURGET, L.; SCHEFFER, N. Bilinear Factor Analysis for iVector Based Speaker Verification. Proceedings of Interspeech. Portland, Oregon: International Speech Communication Association, 2012. p. 1-4. ISBN: 978-1-62276-759-5.

MARTÍNEZ GONZÁLEZ, D.; BURGET, L.; STAFYLAKIS, T.; LEI, Y.; KENNY, P.; LLEIDA, E. Unscented Transform For Ivector-based Noisy Speaker Recognition. In Proceedings of ICASSP 2014. Florence: IEEE Signal Processing Society, 2014. p. 4070-4074. ISBN: 978-1-4799-2892-7.

CUMANI, S.; LAFACE, P.; PLCHOT, O. On the use of i-vector posterior distributions in Probabilistic Linear Discriminant Analysis. IEEE-ACM Transactions on Audio Speech and Language Processing, 2014, vol. 22, no. 4, p. 846-857. ISSN: 2329-9290.

PLCHOT, O.; DIEZ SÁNCHEZ, M.; SOUFIFAR, M.; BURGET, L. PLLR Features in Language Recognition System for RATS. In Proceedings of Interspeech 2014. Singapore: International Speech Communication Association, 2014. p. 3048-3051. ISBN: 978-1-63439-435-2.

MATĚJKA, P.; NOVOTNÝ, O.; PLCHOT, O.; BURGET, L.; DIEZ SÁNCHEZ, M.; ČERNOCKÝ, J. Analysis of Score Normalization in Multilingual Speaker Recognition. In Proceedings of Interspeech 2017. Proceedings of Interspeech. Stockholm: International Speech Communication Association, 2017. no. 08, p. 1567-1571. ISSN: 1990-9772.

NG, T.; HSIAO, R.; ZHANG, L.; KARAKOS, D.; MALLIDI, S.; KARAFIÁT, M.; VESELÝ, K.; SZŐKE, I.; ZHANG, B.; NGUYEN, L.; SCHWARTZ, R. Progress in the BBN Keyword Search System for the DARPA RATS Program. In Proceedings of Interspeech 2014. Singapore: International Speech Communication Association, 2014. p. 959-963. ISBN: 978-1-63439-435-2.

SOUFIFAR, M.; BURGET, L.; PLCHOT, O.; CUMANI, S.; ČERNOCKÝ, J. Regularized Subspace n-Gram Model for Phonotactic iVector Extraction. Proceedings of Interspeech 2013. Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013). Lyon: International Speech Communication Association, 2013. no. 8, p. 74-78. ISBN: 978-1-62993-443-3. ISSN: 2308-457X.

PLCHOT, O.; KARAFIÁT, M.; BRUMMER, J.; GLEMBEK, O.; MATĚJKA, P.; DE VILLIERS, E.; ČERNOCKÝ, J. Speaker vectors from Subspace Gaussian Mixture Model as complementary features for Language Identification. In Proceedings of Odyssey 2012, The Speaker and Language Recognition Workshop. Singapore: International Speech Communication Association, 2012. p. 330-333. ISBN: 978-981-07-3093-2.

Responsibility: Matějka Pavel, Ing., Ph.D.