Master's Thesis
Author of thesis: Bc. Roman Křivánek
Acad. year: 2025/2026
Supervisor: doc. Ing. Tomáš Frýza, Ph.D.
Reviewer: doc. Ing. Roman Jarina, PhD.
This diploma thesis presents the design, implementation, and evaluation of a software application for audio source separation and speech transcription. The application integrates advanced separation methods, including Spleeter, Demucs, and Open-Unmix, with speech transcription backends such as Whisper, Wav2Vec2, and Vosk. A modular backend architecture with a graphical user interface allows local execution, user configuration, and support for pre-trained models. Evaluation using the MUSDB18-hq dataset demonstrates that Demucs achieves the highest separation performance, while Spleeter offers computational efficiency. The work provides a functional, extensible platform for further development, including specialized speech transcription and cross-platform deployment.
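The modular backend described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the `Pipeline` class, its method names, and the two dummy backends are hypothetical, and audio is reduced to a plain list of floats so that the example runs without any third-party library. It only shows how separation and transcription backends could be registered by name and combined freely.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types: audio is a plain list of samples to keep the sketch dependency-free.
Audio = List[float]
Separator = Callable[[Audio], Dict[str, Audio]]  # mixture -> named stems
Transcriber = Callable[[Audio], str]             # single stem -> text


@dataclass
class Pipeline:
    """Registry that pairs any separation backend with any transcription backend."""
    separators: Dict[str, Separator] = field(default_factory=dict)
    transcribers: Dict[str, Transcriber] = field(default_factory=dict)

    def register_separator(self, name: str, fn: Separator) -> None:
        self.separators[name] = fn

    def register_transcriber(self, name: str, fn: Transcriber) -> None:
        self.transcribers[name] = fn

    def run(self, mixture: Audio, separator: str, transcriber: str,
            stem: str = "vocals") -> str:
        # Separate first, then transcribe only the requested stem.
        stems = self.separators[separator](mixture)
        return self.transcribers[transcriber](stems[stem])


# Placeholder backends (assumptions): a real build would wrap a separation
# model and a speech recognition model behind the same two signatures.
def dummy_separator(mixture: Audio) -> Dict[str, Audio]:
    half = [0.5 * s for s in mixture]
    return {"vocals": half, "accompaniment": half}


def dummy_transcriber(stem: Audio) -> str:
    return f"<{len(stem)} samples transcribed>"


pipeline = Pipeline()
pipeline.register_separator("dummy-sep", dummy_separator)
pipeline.register_transcriber("dummy-asr", dummy_transcriber)

result = pipeline.run([0.0, 0.2, -0.4], "dummy-sep", "dummy-asr")
print(result)
```

Under this assumed design, adding another backend means registering one more callable; the `run` method and any user interface built on top of it would not need to change.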
Keywords: audio source separation, speech transcription, deep learning, Python, MUSDB18-hq, Spleeter, Demucs, Open-Unmix, Whisper, Wav2Vec2, Vosk
Date of defence
09.06.2026
Result of the defence
Defended (thesis was successfully defended)
Grading
A
Process of defence
The student presents the results and methods of the final thesis, then answers questions from the supervisor, the reviewer, and the members of the examination committee.
Language of thesis
English
Faculty
Faculty of Electrical Engineering and Communication
Department
Department of Radio Electronics
Study programme
Electronics and Communication Technologies (MPC-EKT)
Composition of Committee
doc. Ing. Tomáš Frýza, Ph.D. (chair)
doc. Ing. Ladislav Polák, Ph.D. (vice-chair)
Ing. Tomáš Urbanec, Ph.D. (member)
doc. Ing. Jan Mikulka, Ph.D. (member)
doc. Ing. Patrik Kamencay, Ph.D. (member)
Supervisor’s report: doc. Ing. Tomáš Frýza, Ph.D.
Grade proposed by supervisor: A
Reviewer’s report: doc. Ing. Roman Jarina, PhD.
Grade proposed by reviewer: A
Responsibility: Mgr. et Mgr. Hana Odstrčilová