Bachelor's Thesis
Author of thesis: Richard Martinec
Acad. year: 2025/2026
Supervisor: Ing. Igor Szőke, Ph.D.
Reviewer: Ing. Oldřich Plchot, Ph.D.
This thesis focuses on classifying dyslexia from read speech using machine learning. A HuBERT speech model was used to extract features from the speech recordings. Features were processed using various methods and multiple support vector machine (SVM) models were trained on the features to determine which feature layers are best for dyslexia detection, if any. A maximum detection accuracy of 96.2% was achieved; various other experiments were carried out to further validate the chosen approach. Some experiments revealed the presence of a dataset bias, skewing the accuracy; different experiments have suggested that dyslexia traits are still being considered during the classification. The results suggest that this approach (or similar approaches) could aid in child dyslexia diagnosis in the real world.
dyslexia, dyslexia classification, speech disorders, machine learning, support vector machines, HuBERT, audio embedding, speech features
Date of defence
17.06.2026
Result of the defence
Defended (thesis was successfully defended)
Grading
C
Process of defence
The student first presented the results achieved in the thesis. The committee then reviewed the supervisor's evaluation and the reviewer's report. The student subsequently answered the reviewer's questions and further questions from those present. Based on the reviewer's report, the supervisor's evaluation, the presentation, and the student's answers to the questions, the committee decided to grade the thesis C.
Topics for thesis defence
Language of thesis
English
Faculty
Faculty of Information Technology
Department
Department of Computer Graphics and Multimedia
Study programme
Information Technology (BIT)
Composition of Committee
doc. Ing. Lukáš Burget, Ph.D. (chair)
doc. RNDr. Milan Češka, Ph.D. (vice-chair)
Dr. Ing. Petr Peringer (member)
Ing. Matěj Grégr, Ph.D. (member)
Ing. Jakub Husa, Ph.D. (member)
Supervisor’s report: Ing. Igor Szőke, Ph.D.
The student worked in an exemplary manner throughout the academic year. He approached the topic actively and with interest. In my opinion, he achieved decent results despite minor complications. As a bonus, he built a simple web demonstrator.
This is a more difficult assignment with ample room for possible extension. The assignment was fulfilled. I see the difficulty in the topic itself, which has not been explored much. It was also necessary to communicate actively with FF MU (Faculty of Arts, Masaryk University). Because of GDPR, the student could not obtain the speech recordings, so he worked only with extracted features. I consider this a further complication that made the work harder. The topic was addressed in light cooperation with FF MU, where they specialize in detecting dyslexia from eye movements. I am very satisfied with the results of the thesis. The student put maximum effort into drawing credible conclusions. I also appreciate the creation of the web demonstrator.
The student obtained the literature independently and actively.
The student consulted on the thesis regularly throughout the academic year, roughly once every two weeks. He was always prepared and showed decent progress. He worked on the thesis actively and independently.
The thesis was not finished under time pressure. The text was submitted for review on time. The supervisor recommended partial changes to the structure and a language check.
The student took part in Excel@FIT and received an award.
Grade proposed by supervisor: A
Reviewer’s report: Ing. Oldřich Plchot, Ph.D.
In conclusion, I liked the work, as the assignment was fulfilled through careful data handling and multiple approaches to dyslexia detection. I also appreciate that the author presented the outcomes at a student conference and brought this interesting topic to broader attention. I had minor issues with some theoretical and introductory sections of the work, and I think more experiments could have been conducted with the extracted SSL features. I missed a simple baseline that would clearly show the superiority of using an SSL model as a feature extractor here.
Evaluation level: assignment of average difficulty
The assignment is of average difficulty. The core of the work is organizing and understanding the dataset, which consists of speech and eye-tracking data from dyslexic and intact individuals. Building the binary classifier is then rather easy, given the ample available examples that can be readily adapted to the task at hand. Finally, the correct interpretation of the achieved results requires attention to detail.
Individual chapters of the work logically follow each other, and especially the sections presenting the data and results give the reader a good insight into the dataset and the basics of the machine learning approach to detecting dyslexia from speech. I would, however, appreciate a more detailed introduction to the problem than the four short paragraphs provided. Later in the text, I also have problems with some figures:
The typographical and grammatical level of this work is very good. Some figures could be handled better, as mentioned above.
The work is certainly complete and functional. It indeed solves the assignment and is demonstrated online (and it was also demonstrated at the student conference).
I appreciate the work with data, honesty with the discovered bias and cross-analysis with eye-tracking data.
I would appreciate, however, more attention to finding optimal approaches in the speech model and setting various baselines. In Fig. 2.2, we get a nice overview of what worked on detecting dysarthria, but I am missing some simple MFCC-based baseline, which could easily justify using models such as HuBERT here. Then, in every experiment, we get the result for features extracted from every layer of the HuBERT model, which is nice, but as it is common knowledge that these layers look at different kinds of speech characteristics, I would appreciate trying to leverage that, perhaps with some simple fusion (even as simple as a weighted average of features or even scores).
Also, I think that the author could compare SVM classifier to a simple neural network.
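The score-level fusion suggested above (a weighted average of the scores produced by the per-layer classifiers) could be sketched roughly as follows. This is only an illustration of the reviewer's suggestion, not the method used in the thesis: the function names `fuse_scores` and `predict`, the zero decision threshold, and the example weights are all assumptions. In practice, the weights would have to be tuned on held-out data.

```python
from typing import List, Sequence


def fuse_scores(layer_scores: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Weighted average of per-layer classifier scores for each recording.

    layer_scores[k][i] is the score (e.g. an SVM decision value) that the
    classifier trained on layer k assigns to recording i (hypothetical layout).
    """
    if len(layer_scores) != len(weights):
        raise ValueError("one weight per layer is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_recordings = len(layer_scores[0])
    if any(len(scores) != n_recordings for scores in layer_scores):
        raise ValueError("every layer must score the same recordings")
    return [
        sum(w * scores[i] for w, scores in zip(weights, layer_scores)) / total
        for i in range(n_recordings)
    ]


def predict(fused: Sequence[float], threshold: float = 0.0) -> List[int]:
    """Label a recording 1 when its fused score exceeds the assumed threshold."""
    return [1 if score > threshold else 0 for score in fused]


# Two layers scoring two recordings; the second layer is trusted three times more.
fused = fuse_scores([[1.0, -1.0], [0.0, -2.0]], weights=[1.0, 3.0])
labels = predict(fused)
```

Fusing scores rather than raw features keeps each per-layer classifier unchanged, so the comparison against the single-layer results stays direct.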
The work brings insights into a published dataset for detecting dyslexia and offers many interesting results and analyses. It can serve as a starting point and a source of information for future data collections (where the acoustic bias could be better handled).
The author has also published a web application that allows people to try the models for detecting dyslexia from speech, which could be very interesting for researchers and the general public alike.
Evaluation level: assignment fulfilled
Overall, I am satisfied with the work, the approaches taken to analyze the dataset, and the achieved results. One minor complaint concerns the point of the assignment that asks the student to experiment with different neural network architectures: we do not see much of this, although we certainly see various methods for detecting dyslexia.
Evaluation level: within the usual range
The presented work is of a commonly accepted size, with individual chapters that are information-rich and free of redundancies.
The author correctly cites numerous sources throughout the work. In general, I have no issues with the bibliography; I would sometimes recommend giving some of the details within the text where they matter, for example in the already mentioned introduction, instead of simply giving a list of citations.
Grade proposed by reviewer: C
Responsibility: Mgr. et Mgr. Hana Odstrčilová