Master's Thesis

Detection of extreme events in satellite imagery utilizing multiple sensors

Author of thesis: Ing. Juraj Hatala

Acad. year: 2025/2026

Supervisor: Ing. Tomáš Kašpárek, Ph.D.

Reviewer: Ing. Jiří Novák, Ph.D.

Abstract:

This work focuses on detecting extreme events in satellite imagery from multiple sensors. The aim is to enable machine learning models to use multispectral and radar data simultaneously. To achieve this, a disentangled architecture that separates sensor-specific generative factors is proposed for unsupervised pretraining. This approach aims to address two known issues: the heterogeneity gap between sensors and the inherent lack of data for extreme event detection. The implemented architecture was validated on the Sen1Floods11 benchmark dataset and achieved an IoU score of 0.4471, outperforming the baseline models. Furthermore, the proposed solution is discussed in the context of state-of-the-art models and its theoretical applicability on board satellite constellations. The robustness of the proposed model was subsequently demonstrated on a custom subset designed to challenge individual sensors. This thesis highlights the potential of content-style models and other disentangled architectures for processing multispectral and radar remote sensing data.
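
For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how an intersection-over-union (IoU) score can be computed for a binary flood mask. It is not taken from the thesis: the function name `iou`, the flat 0/1 mask representation, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration only.

```python
def iou(pred, target):
    """Intersection over union of two binary masks.

    Both masks are flat sequences of 0/1 values of equal length
    (an assumed representation, chosen to keep the sketch dependency-free).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels marked as positive (e.g. water) in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Pixels marked as positive in at least one mask.
    union = sum(1 for p, t in zip(pred, target) if p or t)

    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Hypothetical five-pixel example: 2 pixels overlap, 4 are positive in either mask.
predicted = [1, 1, 0, 0, 1]
reference = [1, 0, 0, 1, 1]
print(iou(predicted, reference))
```

A score of 1.0 means the predicted and reference masks coincide exactly, while 0.0 means they share no positive pixels.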

Keywords:

Remote sensing, machine learning, change detection, image analysis, sensory-independence, multimodality, disentangled representation

Date of defence

22.06.2026

Result of the defence

Defended (thesis was successfully defended)

Grading

C

Process of defence

The student first presented the results achieved in the thesis. The committee then reviewed the supervisor's assessment and the reviewer's report. The student subsequently answered the reviewer's questions and further questions from those present. Based on the reviewer's report, the supervisor's assessment, the presentation, and the student's answers to the questions, the committee decided to grade the thesis C.

Topics for thesis defence

  1. Why were the selected hyperparameter values (such as the learning rate, loss weights, and batch size) chosen for the final model configuration, and was any empirical comparison performed to justify these choices?
  2. The proposed architecture is based on a disentanglement model originally developed for infrared-visible image fusion and was subsequently modified for remote sensing representation learning. Could you explain the rationale behind selecting this particular architecture over alternative multimodal fusion or representation learning approaches?
  3. Did you perform any ablation studies to evaluate the contribution of the individual components and loss terms (e.g., scene consistency loss, attribute regularization, reconstruction loss, and domain-translation loss) to the final performance?

Language of thesis

English

Faculty

Department

Study programme

Information Technology and Artificial Intelligence (MITAI)

Specialization

Machine Learning (NMAL)

Composition of Committee

prof. Dr. Ing. Jan Černocký (chair)
prof. Ing. Martin Čadík, Ph.D. (vice-chair)
doc. Ing. Vladimír Janoušek, Ph.D. (member)
doc. Ing. Michal Bidlo, Ph.D. (member)
doc. Ing. František Zbořil, Ph.D. (member)
Ing. Petr Veigend, Ph.D. (member)

Supervisor’s report
Ing. Tomáš Kašpárek, Ph.D.

The student demonstrated the ability to independently study the current state of fusion algorithms for multimodal data and to design a solution suitable for use on board satellites with limited computing power. His solution is universal across different models and sensor types and has practical application in future missions.

Evaluation criteria Verbal classification
Assignment information

The aim of the thesis was to design a suitable solution for fusing multimodal data, with regard to detection capabilities over the unified data, for finding changes such as water levels during floods.

Activity during completion

The thesis was completed on time and its final form was properly consulted.

Publications, awards
Work with literature

The student was able to independently and actively find suitable and useful study materials and to work with them.

Activity during the work, consultations, communication

The thesis was consulted and carried out continuously.

Points proposed by supervisor: 90

Grade proposed by supervisor: A

Reviewer’s report
Ing. Jiří Novák, Ph.D.

Overall, the thesis is a well-executed applied research work. It is based on a solid selection of relevant literature and implements a reasonably well-designed experimental pipeline. The work demonstrates good engineering effort and produces results that improve over unimodal baselines, with some potential for practical use in flood detection and disaster monitoring. The thesis is evaluated overall with grade B.

Evaluation criteria Verbal classification Points
Extent of fulfilment of the assignment requirements

Evaluation level: assignment fulfilled

The submitted thesis fulfills all points of the assignment. The student thoroughly studied the possibilities of using multiple remote sensing modalities for anomaly and change detection in satellite imagery and provided a comprehensive overview of the relevant sensing technologies and multimodal learning approaches. Based on this analysis, the student proposed a custom architecture for multimodal representation learning focused on reducing the heterogeneity gap between SAR and multispectral data through disentangled representations.

A suitable dataset was selected and appropriately processed for the conducted experiments, including preprocessing, normalization, clipping of SAR intensities, dataset splitting, and generation of specialized validation subsets. The proposed solution was fully implemented using modern machine learning frameworks and experimentally validated on benchmark datasets.

The thesis also satisfies the requirement of comparison with existing approaches. The proposed architecture was evaluated against multiple benchmark models on the Sen1Floods11 dataset.

Minor shortcomings can be found mainly in the limited scope of ablation studies and the relatively constrained evaluation on a single benchmark domain. Nevertheless, these limitations do not significantly reduce the overall quality of the thesis and are understandable given the complexity of the assignment.

Length of the technical report

Evaluation level: within the usual range

The technical report is slightly shorter than the expected range for a master's thesis. However, all chapters contain the necessary information and adequately cover the relevant aspects of the work. Some sections could be expanded with additional details, discussion, or analysis to increase the overall length, but the current content remains informative and sufficient for understanding the methodology, implementation, and results.

Presentation level of the technical report

The technical report is generally well structured and understandable. The individual chapters follow a logical order. The proposed approach and its motivation are explained clearly, and the reader can follow the overall objectives of the work.

However, some parts of the theoretical background are presented in a rather point-by-point manner, resembling a collection of related topics rather than a continuously connected narrative. For example, the sections on image fusion categorization, common fusion objectives, image fusion challenges, and image registration provide relevant information but could be linked more explicitly to the proposed method and research objectives. Stronger transitions between these topics and more discussion of their relevance to the selected architecture would improve the overall coherence of the text. I would also expect more information in the form of legends, axis labels, and colorbars in Figures 5.1, 5.2, and 5.3 to better present the data and results.

Despite these minor shortcomings, the report remains readable and technically sound. Therefore, I evaluate the presentation level with Grade B.

Points: 83
Formal quality of the technical report

The thesis is written in clear and technically correct English and follows standard academic conventions. The typography, referencing style, figures, tables, and mathematical notation are generally consistent and well presented. There are minor issues, such as missing equation references in the text and excessively lengthy figure captions. Some nonuniformity in citation typography is also present. These shortcomings do not significantly affect readability. Overall, the formal and language quality of the thesis is at a good level and corresponds to grade A.

Points: 92
Work with literature

The thesis is based on a broad and highly relevant set of academic sources covering multimodal machine learning, remote sensing, change detection, and disentangled representation learning. The bibliography includes both foundational works and recent state-of-the-art publications, including journal articles, conference papers, surveys, and preprints.

A minor limitation is the strong reliance on survey literature in some parts of the related work. Additionally, minor inconsistencies in citation formatting are present. Nevertheless, the overall quality of literature work is very high and corresponds to grade A.

Points: 94
Implementation output

The realization part of the thesis demonstrates a well-designed and carefully implemented experimental pipeline. The proposed architecture is validated using multiple datasets, including a standard benchmark (Sen1Floods11) and additional custom evaluation scenarios designed to test robustness under varying modality conditions. 

The results are generally well presented and show consistent improvements of the proposed multimodal approach over unimodal baselines.

A weakness of the experimental validation is the absence of systematic ablation studies isolating the contribution of individual architectural components and loss functions.

Overall, despite these limitations, the realization output is of good quality and corresponds to grade B.

Points: 84
Usability of results

The results demonstrate practical relevance, particularly for remote sensing-based disaster monitoring tasks such as flood detection. The work provides a meaningful extension of existing research with practical applicability.

Difficulty of the assignment

Evaluation level: more difficult assignment

I consider the assignment to be above average in difficulty. The student had to combine knowledge from the fields of remote sensing, physical principles of individual sensors, multimodal data processing, and modern machine learning methods. The complexity was further increased by the need to design a custom architecture capable of processing heterogeneous multimodal data and to implement the complete proposed solution.

Points proposed by reviewer: 88

Grade proposed by reviewer: B

Responsibility: Mgr. et Mgr. Hana Odstrčilová