Master's Thesis
Author of thesis: Ing. Lenka Šoková
Acad. year: 2025/2026
Supervisor: Ing. Tomáš Goldmann, Ph.D.
Reviewer: Ing. Filip Orság, Ph.D.
This thesis presents a mobile Android application for detecting and identifying visually similar objects using reference images. The proposed system allows users to store reference images along with object labels and descriptions, and subsequently identify similar objects in a camera stream. The application primarily focuses on plant identification and comparison. The thesis reviews object detection methods, lightweight neural network architectures, and low-shot and open-vocabulary detection approaches. Due to the computational complexity of existing methods, the proposed solution employs a lightweight hybrid pipeline based on embedding similarity comparison. The application supports multiple target selection approaches, including object detection, segmentation, and manual image cropping. Several lightweight embedding backbones and loss functions are evaluated and fine-tuned on a plant-specific dataset. The resulting models are further optimized through quantization and pruning and deployed using suitable on-device inference frameworks with hardware acceleration support where available. Experimental results demonstrate that the proposed approach enables practical real-time object identification on mobile devices while balancing accuracy, latency, and model size.
object detection, few-shot detection, open-vocabulary detection, Android, on-device inference, quantization, embedding similarity, computer vision
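The embedding similarity comparison named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the use of cosine similarity, the `threshold` value, the function names, and the toy three-dimensional vectors and plant labels are all assumptions made for illustration. In a real pipeline the vectors would come from an embedding backbone applied to a cropped or segmented image.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def identify(query, references, threshold=0.8):
    """Return (label, score) for the most similar stored reference.

    `references` maps a label to one reference embedding. The label is None
    when the best score falls below `threshold` (an assumed cut-off), which
    means no stored reference is considered a match.
    """
    best_label, best_score = None, -1.0
    for label, ref in references.items():
        score = cosine_similarity(query, ref)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Hypothetical reference gallery with toy embeddings.
references = {
    "monstera": [1.0, 0.0, 0.0],
    "ficus": [0.0, 1.0, 0.0],
}
label, score = identify([0.9, 0.1, 0.0], references)
print(label, round(score, 3))
```

Normalizing both vectors first keeps the score independent of embedding magnitude, so a single threshold can be applied across all stored references.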
Date of defence
23.06.2026
Result of the defence
Defended (thesis was successfully defended)
Grading
B
Process of defence
The student first presented the results achieved in her thesis. The committee then reviewed the supervisor's evaluation and the reviewer's report. The student subsequently answered the reviewer's questions and further questions from those present. Based on the reviewer's report, the supervisor's evaluation, the presentation, and the student's answers to the questions, the committee decided to grade the thesis B.
Topics for thesis defence
Language of thesis
English
Faculty
Faculty of Information Technology
Department
Department of Intelligent Systems
Study programme
Information Technology and Artificial Intelligence (MITAI)
Specialization
Machine Learning (NMAL)
Composition of Committee
doc. Ing. Vítězslav Beran, Ph.D. (chair)
prof. Ing. Hynek Heřmanský, Dr. Eng. (vice-chair)
doc. Ing. Ondřej Lengál, Ph.D. (member)
doc. Ing. František Zbořil, Ph.D. (member)
doc. Ing. Michal Bidlo, Ph.D. (member)
RNDr. Marek Rychlý, Ph.D. (member)
Supervisor’s report
Ing. Tomáš Goldmann, Ph.D.
Overall, this is a high-quality master's thesis, both in terms of the technical report and the final application. The student approached the work consistently and systematically, conducted extensive experiments to determine the applicability and limitations of the proposed application, and achieved convincing results. I evaluate the student's approach as excellent (A).
I consider the assignment to be of average difficulty. The goal was to develop an experimental application to verify the feasibility of recognizing selected objects, specifically flowers in the final application. While I regard the assignment itself as average in terms of complexity, I evaluate the resulting work as above average. The student conducted a comprehensive series of relevant experiments focused on object recognition on mobile devices, and the resulting application can serve as a practical foundation for a fully-fledged user-facing application.
The thesis was completed well ahead of the deadline. I was given sufficient time to review both the final technical report and the application before submission. The majority of the supervisor's comments were taken into account and incorporated into the final version of the technical report.
No publications or additional awards related to this thesis are known to me.
The student independently gathered all necessary study materials and academic literature without requiring significant guidance from the supervisor. The literature review is well-structured and corresponds appropriately to the topic being addressed.
Throughout the development of the thesis, the student regularly attended consultations, always arriving well-prepared. She actively discussed key milestones with me and approached any issues that arose with initiative and thoughtfulness. Communication remained at a good level throughout the entire duration of the project.
Grade proposed by supervisor: A
Reviewer’s report
Ing. Filip Orság, Ph.D.
The diploma thesis addresses a demanding and relevant topic at the intersection of computer vision, mobile deployment, and applied machine learning. The student successfully designed and implemented a complete Android application and supported the implementation with extensive model training, optimization, and evaluation. The work is technically mature, well documented, and practically oriented. The main weaknesses are the domain limitation to plants and the relatively limited systematic usability evaluation. However, these limitations are clearly discussed and do not substantially reduce the overall quality of the work. I evaluate the thesis as excellent.
Evaluation level: assignment fulfilled
Evaluation level: within the usual range
The thesis is well structured and logically organized. The theoretical chapters introduce object detection, mobile deployment constraints, low-shot learning, open-vocabulary detection, and metric learning before moving to the proposed method and experiments. The report also benefits from diagrams and screenshots, especially the pipeline diagram and the appendix showing the application interface. Minor weaknesses are that some theoretical sections are rather broad and could be more tightly connected to the final implementation, and some parts of the evaluation would benefit from a more concise summary for the reader.
The formal quality of the report is very good. The English is clear and generally professional. Figures, tables, equations, and references are used appropriately. The thesis is typographically consistent and readable. I did not notice any major formal problems.
The work with literature is very good. The thesis uses a broad set of relevant sources, including object detection, transformer-based detectors, open-vocabulary detection, metric learning, lightweight neural networks, quantization, pruning, and mobile segmentation models. The bibliography contains recent and topic-relevant papers. The adopted ideas are sufficiently distinguished from the student’s own implementation and experiments.
The student implemented a native Android application in Kotlin using Jetpack Compose. The application supports reference image storage, gallery management, object selection, detection, optional segmentation, cropping, background removal, embedding computation, similarity comparison, and model selection. The architecture follows the MVVM pattern and uses Room, MediaStore, StateFlow, model wrappers, and asynchronous processing, which indicates a well-designed Android implementation. The experimental part is also extensive. The student evaluated multiple embedding backbones and loss functions, fine-tuned models on a plant-specific dataset, compared the selected lightweight model with ImageNet-pretrained and CLIP baselines, and analyzed optimization methods such as quantization and pruning. The mobile performance evaluation is particularly valuable.
The results are practically usable as a prototype Android application for reference-image-based plant identification and comparison. The work also provides useful experimental insight into the trade-offs between accuracy, latency, model size, and mobile inference backends.
Evaluation level: more difficult assignment
The assignment is technically demanding, as it combines several challenging areas: object detection, low-shot and open-vocabulary recognition, metric learning, model optimization for mobile devices, and native Android application development. In addition to studying existing methods, the student was required to design and implement a complete mobile solution, adapt or train suitable models, and evaluate the system in real-world conditions.
Grade proposed by reviewer: A
Responsibility: Mgr. et Mgr. Hana Odstrčilová