Doctoral Thesis
Author of thesis: Ing. Ondřej Novotný, Ph.D.
Acad. year: 2021/2022
Supervisor: prof. Dr. Ing. Jan Černocký
Reviewers: Luciana Ferrer, Petr Pollák
This work deals with discriminative techniques in speaker verification systems that improve the robustness of these systems against factors that negatively affect their performance. These factors include noise, reverberation, and the transmission channel.
The thesis consists of two main parts. The first part gives a theoretical introduction to current state-of-the-art speaker verification systems. The steps of the recognition system are described in order: the extraction of acoustic features, the extraction of vector representations of recordings, and the final computation of the recognition score. Particular emphasis is placed on techniques for extracting a vector representation of a recording, where we describe two different paradigms: i-vectors and x-vectors.
The second part of the work focuses on discriminative techniques for increasing robustness. Their description is organized to follow the passage of a recording through the verification system. First, attention is paid to signal pre-processing using a neural network for noise reduction and speech enhancement. This pre-processing is a universal technique, independent of the verification system. The work then turns to the use of a discriminative approach in the extraction of features and in the extraction of vector representations of recordings. Furthermore, the work sheds light on the transition from generative systems to discriminative systems. To give fuller context, it also describes the techniques that historically preceded this transition.
All presented techniques are experimentally verified and their advantages evaluated. We propose several techniques that have proved successful with both the generative approach in the form of i-vectors and the discriminative x-vectors, and considerable improvement has been achieved thanks to them. For completeness, other robustness techniques are also included in the work, such as score normalization and multi-condition training. Finally, the work deals with the robustness of discriminative systems with respect to the data used in their training.
Speaker verification, generative training, discriminative training, speech enhancement, i-vector, x-vector, robustness, noise, reverberation, neural networks.
Date of defence
03.12.2021
Result of the defence
Defended (thesis was successfully defended)
Process of defence
The student presented the goals and results achieved in the course of work on the doctoral thesis. In the discussion, the student answered questions from the committee, the reviewers, and guests. The discussion is recorded on discussion sheets, which are attached to the protocol. Number of discussion sheets: 1. In conclusion, the committee unanimously resolved that the student had fulfilled the conditions for the award of the academic degree of doctor.
Language of thesis
English
Faculty
Faculty of Information Technology (Fakulta informačních technologií)
Department
Department of Computer Graphics and Multimedia
Study programme
Computer Science and Engineering (CSE-PHD-4)
Field of study
Computer Science and Engineering (DVI4)
Composition of Committee
prof. Ing. Martin Drahanský, Ph.D. (chair)
prof. Ing. Adam Herout, Ph.D. (member)
doc. RNDr. Aleš Horák, Ph.D. (member)
doc. Ing. Radim Kolář, Ph.D. (member)
doc. Ing. Petr Pollák, CSc. (member)
Supervisor’s report: prof. Dr. Ing. Jan Černocký
Reviewer’s report: Luciana Ferrer
Reviewer’s report: Petr Pollák
Responsibility: Mgr. et Mgr. Hana Odstrčilová