Bachelor's Thesis
Author of thesis: Yaroslav Hryn
Acad. year: 2025/2026
Supervisor: Ing. Markéta Juránková, Ph.D.
Reviewer: Ing. Michal Hradiš, Ph.D.
This bachelor's thesis describes the design and implementation of a web application for searching for people in image datasets based on natural language descriptions. The motivation comes from scenarios where a user is looking for a specific person and knows only their appearance – such as the colour of their clothing or other visual characteristics – but does not have a photo available. At its core, the system converts both text queries and images into a shared vector space using vision-language models, with support for loading any compatible model from the HuggingFace platform. The resulting embeddings are indexed and searched using the Qdrant cloud vector database. To improve search precision, negative prompting was implemented as a re-ranking method, allowing users to actively exclude unwanted visual attributes from the results. Search results can also be grouped by person identity based on identifiers provided in a metadata file. The resulting application allows users to manage datasets and models, search for people using natural language, apply negative queries, and browse results in three different view modes. The system was experimentally evaluated by comparing several vision-language models on a real-world dataset.
Keywords: person retrieval, text description, vision-language models, vector database, negative prompting, clustering, web application
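The abstract does not state how negative prompting re-ranks the results, so the following is only a minimal sketch of one plausible scheme. It assumes that each candidate image is scored by its cosine similarity to the query embedding minus a weighted cosine similarity to the negative-prompt embedding. The function names, the weight `alpha`, and the toy three-dimensional embeddings are hypothetical and are not taken from the thesis; in the real system the embeddings would come from a vision-language model and the candidates from a vector database query. The second function illustrates grouping ranked results by a person identifier taken from a metadata mapping.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rerank(candidates, positive, negative, alpha=0.5):
    """Order image ids by sim(image, positive) - alpha * sim(image, negative).

    Assumed scoring rule for illustration only; alpha=0.0 disables the penalty.
    """
    scores = {
        image_id: cosine(vec, positive) - alpha * cosine(vec, negative)
        for image_id, vec in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def group_by_identity(ranking, identity_of):
    """Group ranked image ids by person id, keeping the ranking order in each group."""
    groups = {}
    for image_id in ranking:
        groups.setdefault(identity_of.get(image_id, "unknown"), []).append(image_id)
    return groups


# Toy embeddings (hypothetical): axis 0 = wanted attribute, axis 2 = unwanted one.
candidates = {
    "img_a": [0.8, 0.3, 0.0],  # good match, no unwanted attribute
    "img_b": [0.9, 0.0, 0.3],  # slightly better match, but shows the unwanted attribute
    "img_c": [0.1, 0.9, 0.0],  # weak match
}
positive = [1.0, 0.0, 0.0]  # stands in for the embedded text query
negative = [0.0, 0.0, 1.0]  # stands in for the embedded negative prompt

baseline = rerank(candidates, positive, negative, alpha=0.0)
reranked = rerank(candidates, positive, negative, alpha=0.5)

# Hypothetical metadata: image id -> person identifier.
identity_of = {"img_a": "person_1", "img_b": "person_2", "img_c": "person_1"}
grouped = group_by_identity(reranked, identity_of)
```

With these toy values, `img_b` ranks first when the penalty is disabled and drops behind `img_a` once the negative prompt is applied. A subtractive penalty is only one option; filtering candidates above a negative-similarity threshold would be another.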
Date of defence
16.06.2026
Result of the defence
Defended (thesis was successfully defended)
Grading
B
Process of defence
The student first presented the results achieved in the thesis. The committee then reviewed the supervisor's evaluation and the reviewer's report. The student subsequently answered the reviewer's questions and further questions from those present. Based on the reviewer's report, the supervisor's evaluation, the presentation, and the student's answers to the questions asked, the committee decided to grade the thesis B.
Topics for thesis defence
Language of thesis
English
Faculty
Faculty of Information Technology
Department
Department of Computer Graphics and Multimedia
Study programme
Information Technology (BIT)
Composition of Committee
doc. Ing. Tomáš Martínek, Ph.D. (chair)
doc. Ing. Michal Španěl, Ph.D. (vice-chair)
Ing. Jiří Hynek, Ph.D. (member)
Ing. Filip Orság, Ph.D. (member)
Ing. Vladimír Bartík, Ph.D. (member)
Supervisor’s report
Ing. Markéta Juránková, Ph.D.
The student worked actively throughout the year, independently came up with proposed solutions, and consulted on them. The student fulfilled all points of the assignment and completed the thesis well ahead of the deadline. I therefore award the supervisor's grade A.
The thesis was of average difficulty and focused on creating intuitive software using existing technologies. All points of the assignment were fulfilled, and the results achieved meet the requirements of the assignment.
The student actively proposed his own solutions and modifications to the system.
The student consulted on the thesis regularly throughout the year and came to meetings prepared, with an up-to-date progress report and suitable questions for discussion.
The thesis was completed on time, and its final form was discussed with the supervisor.
The output of the thesis is suitable for release as open-source software.
Grade proposed by supervisor: A
Reviewer’s report
Ing. Michal Hradiš, Ph.D.
The student created a working application which can be used to search in collections of images. However, the thesis does not demonstrate a systematic approach to understanding the problem domain, designing the solution, or testing it.
Evaluation level: more demanding assignment
The thesis topic combines advanced image and text processing with web application development.
The text is understandable, but the structure and separation of topics could be improved. The text was clearly written to describe an already finished system. It focuses solely on the methods, models, and approaches used by the student, and lacks discussion and justification of the choices made.
Specific issues:
The thesis is reasonably well written. The typography is satisfactory, with some reservations:
The student created a working web application. I appreciate that it includes asynchronous image processing and adequate configuration. On the other hand, it is still a rather basic, single-user local application with serious technical limitations: there are no user accounts, uploads are stored in memory, and the asynchronous backend contains long-running synchronous code.
Some of the UI choices are somewhat unexpected, and the text does not suggest any interaction with potential users or deeper consideration of practical use cases. The application does not include documentation or tests.
The evaluation was manual and was probably performed by the author. This has both advantages and disadvantages. It could simulate real usage and interaction with the system. However, the evaluation protocol is not described in sufficient detail to assess its validity, and no statistical analysis was performed. In any case, automatic testing should also have been performed; even the negative query selection could have been simulated.
The application works and can be used as a simple local tool.
Evaluation level: the student deviated from the assignment with justification
The solution does not use large language models. The question is to what extent this was a specific objective of the thesis topic, but the solution is meaningful as presented.
Evaluation level: within the usual range
I am missing an overview of current approaches and tools for searching for people by appearance, existing datasets, and evaluation methodologies. The tables, graphs, and UI images are excessive.
The thesis references 18 relevant and generally high-quality sources. The sources are used well. However, I am missing a literature review beyond the sources directly used by the student. The thesis does not include even a review of similar applications.
Grade proposed by reviewer: C
Responsibility: Mgr. et Mgr. Hana Odstrčilová