Master's Thesis
Author of thesis: Ing. Juraj Hatala
Acad. year: 2025/2026
Supervisor: Ing. Tomáš Kašpárek, Ph.D.
Reviewer: Ing. Jiří Novák, Ph.D.
This work focuses on detecting extreme events in satellite imagery from multiple sensors. The aim is to enable machine learning models to use multispectral and radar data simultaneously. To achieve this, a disentangled architecture that separates sensor-specific generative factors is proposed for unsupervised pretraining. This approach aims to address the known heterogeneity gap between sensors and the inherent lack of data for extreme event detection. The implemented architecture was validated on the Sen1Floods11 benchmark dataset and achieved an IoU score of 0.4471, outperforming the baseline models. Furthermore, the proposed solution is discussed in the context of state-of-the-art models and its theoretical applicability on board satellite constellations. The robustness of the proposed model was subsequently demonstrated on a custom subset designed to challenge individual sensors. This thesis highlights the potential of using content-style models and other disentangled architectures for processing multispectral and radar remote sensing data.
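For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how intersection over union (IoU) is commonly computed for binary segmentation masks. It is not the evaluation code of the thesis; the function name `binary_iou` and the optional `ignore_value` parameter for masking unlabeled pixels are assumptions made for illustration.

```python
import numpy as np


def binary_iou(pred, target, ignore_value=None):
    """Intersection over union for binary masks (1 = positive class).

    `ignore_value` is a hypothetical parameter: pixels whose target equals
    it are excluded from both the intersection and the union.
    """
    pred = np.asarray(pred)
    target = np.asarray(target)

    # Mask of pixels that take part in the score.
    if ignore_value is None:
        valid = np.ones(target.shape, dtype=bool)
    else:
        valid = target != ignore_value

    pred_pos = (pred == 1) & valid
    target_pos = (target == 1) & valid

    union = np.logical_or(pred_pos, target_pos).sum()
    if union == 0:
        # No positive pixels in either mask: the score is undefined.
        return float("nan")
    intersection = np.logical_and(pred_pos, target_pos).sum()
    return float(intersection / union)
```

A score of 1.0 means the predicted and reference masks coincide on the positive class, and 0.0 means they do not overlap at all.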
Remote sensing, machine learning, change detection, image analysis, sensory-independence, multimodality, disentangled representation
Date of defence
22.06.2026
Result of the defence
Defended (thesis was successfully defended)
Grading
C
Process of defence
The student first presented the results achieved in the thesis. The committee then reviewed the supervisor's evaluation and the reviewer's report. The student subsequently answered the reviewer's questions and further questions from those present. Based on the reviewer's report, the supervisor's evaluation, the presentation, and the student's answers to the questions asked, the committee decided to grade the thesis C.
Topics for thesis defence
Language of thesis
English
Faculty
Faculty of Information Technology (Fakulta informačních technologií)
Department
Department of Computer Graphics and Multimedia
Study programme
Information Technology and Artificial Intelligence (MITAI)
Specialization
Machine Learning (NMAL)
Composition of Committee
prof. Dr. Ing. Jan Černocký (chair)
prof. Ing. Martin Čadík, Ph.D. (vice-chair)
doc. Ing. Vladimír Janoušek, Ph.D. (member)
doc. Ing. Michal Bidlo, Ph.D. (member)
doc. Ing. František Zbořil, Ph.D. (member)
Ing. Petr Veigend, Ph.D. (member)
Supervisor’s report: Ing. Tomáš Kašpárek, Ph.D.
The student demonstrated the ability to independently study the current state of fusion algorithms for multimodal data and to design a solution suitable for use on board satellites with limited computing power. His solution is universal across different models and sensor types and has practical application in future missions.
The aim of the thesis was to design a suitable solution for fusing multimodal data, with regard to detection capabilities on the unified data, for finding changes such as water levels during floods.
The thesis was completed on time and its final form was properly consulted.
The student was able to independently and actively search for suitable and useful study materials and to work with them.
The thesis was consulted and carried out continuously.
Grade proposed by supervisor: A
Reviewer’s report: Ing. Jiří Novák, Ph.D.
Overall, the thesis is a well-executed applied research work. It is based on a solid selection of relevant literature and implements a reasonably well-designed experimental pipeline. The work demonstrates good engineering effort and produces results that improve over unimodal baselines, with some potential for practical use in flood detection and disaster monitoring. The thesis is evaluated overall with grade B.
Evaluation level: assignment fulfilled
The submitted thesis fulfills all points of the assignment. The student thoroughly studied the possibilities of using multiple remote sensing modalities for anomaly and change detection in satellite imagery and provided a comprehensive overview of the relevant sensing technologies and multimodal learning approaches. Based on this analysis, the student proposed a custom architecture for multimodal representation learning focused on reducing the heterogeneity gap between SAR and multispectral data through disentangled representations.
A suitable dataset was selected and appropriately processed for the conducted experiments, including preprocessing, normalization, clipping of SAR intensities, dataset splitting, and generation of specialized validation subsets. The proposed solution was fully implemented using modern machine learning frameworks and experimentally validated on benchmark datasets.
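The clipping and normalization of SAR intensities mentioned above can be illustrated with a minimal sketch. The report does not state the bounds or the scaling scheme used in the thesis, so the function name `clip_and_scale` and the default range of -25 dB to 0 dB are assumptions chosen only for illustration.

```python
import numpy as np


def clip_and_scale(sar_db, lo=-25.0, hi=0.0):
    """Clip SAR backscatter (in dB) to [lo, hi] and rescale it to [0, 1].

    The default bounds are hypothetical; they are not taken from the thesis.
    """
    sar_db = np.asarray(sar_db, dtype=np.float32)
    # Clipping suppresses extreme values so they do not dominate the scale.
    clipped = np.clip(sar_db, lo, hi)
    # Min-max scaling using the fixed clipping bounds.
    return (clipped - lo) / (hi - lo)
```

Using fixed bounds rather than per-image statistics keeps the scaling identical across training, validation, and test splits.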
The thesis also satisfies the requirement of comparison with existing approaches. The proposed architecture was evaluated against multiple benchmark models on the Sen1Floods11 dataset.
Minor shortcomings can be found mainly in the limited scope of ablation studies and the relatively constrained evaluation on a single benchmark domain. Nevertheless, these limitations do not significantly reduce the overall quality of the thesis and are understandable given the complexity of the assignment.
Evaluation level: within the usual range
The technical report is slightly shorter than the expected range for a master's thesis. However, all chapters contain the necessary information and adequately cover the relevant aspects of the work. Some sections could be expanded with additional details, discussion, or analysis to increase the overall length, but the current content remains informative and sufficient for understanding the methodology, implementation, and results.
The technical report is generally well structured and understandable. The individual chapters follow a logical order. The proposed approach and its motivation are explained clearly, and the reader can follow the overall objectives of the work.
However, some parts of the theoretical background are presented in a rather point-by-point manner, resembling a collection of related topics rather than a continuously connected narrative. For example, the sections on image fusion categorization, common fusion objectives, image fusion challenges, and image registration provide relevant information but could be linked more explicitly to the proposed method and research objectives. Stronger transitions between these topics and more discussion of their relevance to the selected architecture would improve the overall coherence of the text. I would also expect more information in the form of legends, axis labels, and colorbars in Figures 5.1, 5.2, and 5.3 to better present the data and results.
Despite these minor shortcomings, the report remains readable and technically sound. Therefore, I evaluate the presentation level with Grade B.
The thesis is written in clear and technically correct English and follows standard academic conventions. The typography, referencing style, figures, tables, and mathematical notation are generally consistent and well presented. There are minor issues, such as equations that are not referenced in the text and excessively lengthy figure captions. Some nonuniformity in citation typography is also present. These shortcomings do not significantly affect readability. Overall, the formal and language quality of the thesis is at a good level and corresponds to grade A.
The thesis is based on a broad and highly relevant set of academic sources covering multimodal machine learning, remote sensing, change detection, and disentangled representation learning. The bibliography includes both foundational works and recent state-of-the-art publications, including journal articles, conference papers, surveys, and preprints.
A minor limitation is the strong reliance on survey literature in some parts of the related work. Additionally, minor inconsistencies in citation formatting are present. Nevertheless, the overall quality of literature work is very high and corresponds to grade A.
The realization part of the thesis demonstrates a well-designed and carefully implemented experimental pipeline. The proposed architecture is validated using multiple datasets, including a standard benchmark (Sen1Floods11) and additional custom evaluation scenarios designed to test robustness under varying modality conditions.
The results are generally well presented and show consistent improvements of the proposed multimodal approach over unimodal baselines.
A weakness of the experimental validation is the absence of systematic ablation studies isolating the contribution of individual architectural components and loss functions.
Overall, despite these limitations, the realization output is of good quality and corresponds to grade B.
The results demonstrate practical relevance, particularly for remote sensing-based disaster monitoring tasks such as flood detection. The work provides a meaningful extension of existing research with practical applicability.
Evaluation level: more difficult assignment
I consider the assignment to be above-average in difficulty. The student had to combine knowledge from the fields of remote sensing, physical principles of individual sensors, multimodal data processing, and modern machine learning methods. The complexity was further increased by the need to design a custom architecture capable of processing heterogeneous multimodal data and to implement the complete proposed solution.
Grade proposed by reviewer: B
Responsibility: Mgr. et Mgr. Hana Odstrčilová