Master's Thesis
Author of thesis: Bc. Josef Macek
Acad. year: 2025/2026
Supervisor: Ing. Daniel Kováč, Ph.D.
Reviewer: Ing. Richard Ladislav
The aim of this master’s thesis is to evaluate the applicability of interpretable speech biomarkers and deep speech representations for the automatic classification of individuals with Parkinson’s disease and healthy controls. The thesis assumes that Parkinson’s disease affects speech production, particularly through hypokinetic dysarthria, and that these changes can be captured using acoustic and deep speech features. The experimental part is based on a database of spontaneous monologues. It uses adjusted interpretable speech biomarkers and x-vector representations extracted with a pretrained ECAPA-TDNN model. Three approaches for combining both types of features were designed and compared: late fusion, early fusion, and hybrid fusion. The biomarker-based models and the early and hybrid fusion models use the XGBoost algorithm with hyperparameters tuned using Optuna, whereas the x-vector branch and the late-fusion meta-model are based on logistic regression. The final evaluation was performed using the leave-one-out method. The results show that x-vector representations provide stronger discriminative information than standalone interpretable biomarkers. The standalone x-vector model based on logistic regression achieved an accuracy of 0.7519 and an AUC of 0.8045, while the biomarker-based XGBoost model achieved an accuracy of 0.6589 and an AUC of 0.6878. Fusion approaches made it possible to combine the performance of deep representations with the information contained in interpretable biomarkers; however, their contribution depended on the specific feature combination strategy. Among the fusion approaches, hybrid fusion achieved the best results, with an accuracy of 0.7597 and an AUC of 0.8117.
The results confirm the potential of automatic speech analysis for assessing speech manifestations of Parkinson’s disease; however, they must be interpreted with respect to the limited size of the dataset, the validation protocol used, and the possible influence of individual differences between speakers.
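The difference between early and late fusion under leave-one-out evaluation can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the toy feature values, the function names, and the nearest-centroid scorer, which stands in for the XGBoost and logistic regression models. Late fusion is approximated here by averaging the two branch scores rather than by a logistic regression meta-model. Hybrid fusion is omitted because the abstract does not describe how it is constructed.

```python
from math import dist

def centroid(rows):
    """Column-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def centroid_score(train_x, train_y, x):
    """Toy stand-in classifier: a positive score means x lies closer to the class-1 centroid."""
    c0 = centroid([v for v, y in zip(train_x, train_y) if y == 0])
    c1 = centroid([v for v, y in zip(train_x, train_y) if y == 1])
    return dist(x, c0) - dist(x, c1)

def loo_scores(features, labels):
    """Leave-one-out: score each speaker with a model fitted on all other speakers."""
    scores = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        scores.append(centroid_score(train_x, train_y, features[i]))
    return scores

def accuracy(scores, labels, threshold=0.0):
    """Fraction of speakers whose thresholded score matches the label."""
    preds = [1 if s > threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: one row per speaker, label 1 = Parkinson's disease, 0 = control.
biomarkers = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
embeddings = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0], [0.8, 0.2, 0.1],
              [0.1, 0.9, 1.0], [0.2, 1.0, 0.9], [0.0, 0.8, 0.8]]
labels = [0, 0, 0, 1, 1, 1]

# Early fusion: concatenate both feature sets, then fit a single model.
early = loo_scores([b + e for b, e in zip(biomarkers, embeddings)], labels)

# Late fusion: fit one model per feature set, then combine the two scores.
late = [(sb + se) / 2 for sb, se in
        zip(loo_scores(biomarkers, labels), loo_scores(embeddings, labels))]

print("early fusion:", accuracy(early, labels), auc(early, labels))
print("late fusion: ", accuracy(late, labels), auc(late, labels))
```

In this sketch each speaker is scored only by a model that never saw that speaker, which is the property the leave-one-out protocol provides. With a dataset as small as the one the abstract describes, the resulting accuracy and AUC estimates still have high variance.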
ECAPA-TDNN, feature fusion, hypokinetic dysarthria, Optuna, Parkinson’s disease, speech biomarker, XGBoost, x-vector
Date of defence
11.06.2026
Result of the defence
Defended (thesis was successfully defended)
Grading
A
Process of defence
The student presented the results of his thesis, and the committee was acquainted with the reviews. The student defended the master's thesis and answered the questions of the committee members and the reviewer. Reviewer's questions: How do you explain the improvement in results when using hybrid fusion compared with early and late fusion?
Language of thesis
Czech
Faculty
Faculty of Electrical Engineering and Communication
Department
Department of Telecommunications
Study programme
Audio Engineering (MPC-AUD)
Specialization
Audio Production and Recording (AUDM-ZVUK)
Composition of Committee
prof. Ing. Zdeněk Smékal, CSc. (chair)
Ing. MgA. Edgar Mojdl, Ph.D. (vice-chair)
Dr. Ing. Libor Husník (member)
Ing. Václav Mach, Ph.D. (member)
Ing. Matěj Ištvánek, Ph.D. (member)
Supervisor’s report: Ing. Daniel Kováč, Ph.D.
Grade proposed by supervisor: A
Reviewer’s report: Ing. Richard Ladislav
Grade proposed by reviewer: A
Responsibility: Mgr. et Mgr. Hana Odstrčilová