Master's Thesis
Author of thesis: Ing. Ondřej Bahounek
Acad. year: 2025/2026
Supervisor: Ing. Tomáš Goldmann, Ph.D.
Reviewer: Ing. Filip Pleško
This thesis investigates identity-preserving face generation conditioned on embeddings produced by pre-trained face recognition models. The proposed framework utilizes diffusion-based generative models together with classifier-free guidance to synthesize realistic facial images corresponding to a target identity while allowing stochastic variation in pose, expression, lighting, and background conditions. The work builds upon and extends the concepts introduced by the ID3PM framework, which utilizes face recognition embeddings as a conditioning signal within a Denoising Diffusion Probabilistic Model (DDPM) architecture. Several modifications to the embedding injection mechanisms are proposed, including decoupled projection layers and variants that separate the modulation of temporal and identity signals within Adaptive Group Normalization layers. In addition, embedding processing networks designed to improve latent disentanglement are investigated. The impact of these architectural modifications is analyzed with respect to image fidelity, identity preservation, and the semantic structure of the latent space. In addition to identity-conditioned synthesis, the thesis focuses on semantic manipulation of generated faces through direct modification of the input embeddings. Facial attributes are controlled using methods such as linear semantic directions obtained from Support Vector Machine classifiers and a proposed iterative gradient-guided optimization approach utilizing non-linear classifiers. The framework further demonstrates the generation of novel synthetic identities through latent space interpolation and unconditional generation enabled by classifier-free guidance. The proposed methods are evaluated both quantitatively and qualitatively, including a comparison with the state-of-the-art Arc2Face model. 
The results demonstrate that the framework provides an effective solution for high-quality face generation and controllable semantic manipulation, highlighting its practical utility for synthetic face generation and dataset augmentation.
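As a rough illustration of the classifier-free guidance mentioned in the abstract, the sketch below combines a conditional and an unconditional noise prediction into a single guided prediction. It assumes one common convention, `eps = eps_uncond + s * (eps_cond - eps_uncond)`; the formulation and guidance scale actually used in the thesis are not stated here. The names `cfg_combine` and `guidance_scale` are hypothetical, and plain Python lists stand in for image-shaped tensors.

```python
from typing import List, Sequence


def cfg_combine(
    eps_cond: Sequence[float],
    eps_uncond: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Blend conditional and unconditional noise predictions.

    Assumed convention (not confirmed by the thesis record):
        eps = eps_uncond + guidance_scale * (eps_cond - eps_uncond)
    A scale of 0 gives the unconditional prediction, 1 gives the
    conditional one, and values above 1 extrapolate past it.
    """
    if len(eps_cond) != len(eps_uncond):
        raise ValueError("predictions must have the same length")
    return [u + guidance_scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


if __name__ == "__main__":
    # Toy two-element "noise predictions" for illustration only.
    print(cfg_combine([1.0, 2.0], [0.0, 1.0], 2.0))
```

Under this convention, setting the scale to 0 corresponds to the unconditional generation the abstract refers to, since the identity embedding no longer influences the prediction.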
face generation, face recognition embeddings, diffusion models, DDPM, identity preservation, semantic manipulation, synthetic datasets, ID3PM, Arc2Face
Date of defence
25.06.2026
Result of the defence
Defended (thesis was successfully defended)
Grading
A
Process of defence
The student first presented the results achieved in his thesis. The committee then reviewed the supervisor's evaluation and the reviewer's report. The student subsequently answered the reviewer's questions and further questions from those present. Based on the reviewer's report, the supervisor's evaluation, the presentation, and the student's answers to the questions, the committee decided to grade the thesis A.
Topics for thesis defence
Language of thesis
English
Faculty
Faculty of Information Technology
Department
Department of Intelligent Systems
Study programme
Information Technology and Artificial Intelligence (MITAI)
Specialization
Machine Learning (NMAL)
Composition of Committee
prof. Dr. Ing. Jan Černocký (chair)
prof. Ing. Hynek Heřmanský, Dr. Eng. (vice-chair)
prof. RNDr. Alexandr Meduna, CSc. (member)
Ing. Michal Hradiš, Ph.D. (member)
Ing. František Grézl, Ph.D. (member)
Ing. Martin Fajčík, Ph.D. (member)
Supervisor’s report: Ing. Tomáš Goldmann, Ph.D.
Overall, I consider the thesis to be complete and fulfilled in all required aspects. While the resulting solution did not surpass the performance of current state-of-the-art approaches, the student properly discussed the shortcomings of his work and drew attention to the complexity of evaluation methods, which can be significantly influenced by factors such as image resolution. Regarding the student's approach, the number of consultations was limited; however, the student consistently presented his progress during meetings and had a clear vision of the direction of the work. Although the methods employed do not bring fundamental innovation, their evaluation is thorough and clearly highlights the advantages and disadvantages of the chosen solution. Having considered all aspects, the student's approach to the thesis is evaluated as excellent (A).
The thesis focuses on the area of generative neural networks, specifically on generating different variants of a face. In my view, this is an assignment of an experimental nature with uncertain outcomes. I consider the assignment to have been fulfilled, and the limitations and shortcomings of the solution are properly discussed in the thesis.
The thesis cannot be considered to have been completed with sufficient lead time. I received the final version approximately one week before the submission deadline. Despite this, I did not find any serious shortcomings in it. The student was notified of a few minor issues; I did not check whether these were addressed.
No publications or awards related to this thesis are known to me.
The student independently gathered all necessary literature. The selection of sources corresponds well to the topic being addressed.
The overall number of consultations was low and I met with the student only a few times. Nevertheless, he arrived at consultations well-prepared, had a good grasp of the subject matter, and was able to clearly present the progress he had made.
Grade proposed by supervisor: A
Reviewer’s report: Ing. Filip Pleško
The thesis represents a solid and well-executed piece of work on a current and technically demanding topic. The student fulfilled the assignment, studied the relevant literature, implemented a functional solution, extended it with several experimental variants, and evaluated the results using suitable methods.
The strongest aspects of the thesis are the technical implementation, the reimplementation of a method without publicly available code, the systematic experimental evaluation, and the honest discussion of both successful and unsuccessful modifications. The weaker aspect is that the work is partly based on an existing approach and the proposed extensions bring rather mixed or moderate improvements. Nevertheless, the thesis is complete, technically sound, and demonstrates a good understanding of the problem.
Evaluation level: assignment fulfilled
The requirements of the assignment were fulfilled. The thesis covers the principles of generative neural networks and diffusion models, discusses Stable Diffusion and its extensions, analyzes the use of biometric face embeddings, and proposes a working solution for controlled face generation conditioned by embedding vectors.
Evaluation level: exceeds the usual range
The technical report has approximately 120 standard pages, which exceeds the usual expected range.
The thesis is logically structured and understandable. It proceeds from general foundations of diffusion models to face recognition embeddings, then to the proposed method, implementation, evaluation setup, and results.
The formal quality of the report is good. The document is clearly organized, uses an appropriate technical style, and contains a reasonable number of figures, tables, formulas, and references. The text is understandable and the terminology is used consistently.
The thesis uses relevant and up-to-date literature from the areas of diffusion models, Stable Diffusion, controlled image generation, face recognition embeddings, and identity-conditioned face synthesis. The theoretical part is based on standard and important works. The practical part appropriately discusses ID3PM and Arc2Face, which are directly related to the topic. The student clearly distinguishes between existing methods and his own contributions. The citations appear appropriate and the bibliography supports both the theoretical and experimental parts of the thesis.
The implementation output is a substantial part of the thesis. The student reimplemented a solution inspired by ID3PM because the original implementation was not publicly available. This required understanding the method from the paper, creating a working training and generation pipeline, and integrating face recognition embeddings into a diffusion-based architecture.
Beyond the baseline, the student implemented and evaluated several architectural variants, including different embedding injection mechanisms and embedding processing networks. The work also includes methods for semantic manipulation in the embedding space, generation of synthetic identities, and comparison with Arc2Face.
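To make the idea of semantic manipulation in the embedding space more concrete, the following minimal sketch shifts an identity embedding along a linear attribute direction, such as the normal of a separating hyperplane learned by a linear SVM, as the abstract describes. It is an illustration under stated assumptions, not the thesis implementation: the function names, the step size `alpha`, and the re-normalization to unit length (on the assumption that the face recognition embeddings are L2-normalized) are all hypothetical.

```python
import math
from typing import List, Sequence


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def shift_along_direction(
    embedding: Sequence[float],
    direction: Sequence[float],
    alpha: float,
) -> List[float]:
    """Move an embedding by `alpha` along a unit-length attribute direction.

    `direction` stands in for a hyperplane normal from a linear classifier.
    The result is re-normalized, assuming unit-length embeddings.
    """
    if len(embedding) != len(direction):
        raise ValueError("embedding and direction must have the same length")
    unit_direction = l2_normalize(direction)
    shifted = [e + alpha * d for e, d in zip(embedding, unit_direction)]
    return l2_normalize(shifted)


if __name__ == "__main__":
    # Toy 2-D example; real face embeddings have hundreds of dimensions.
    print(shift_along_direction([1.0, 0.0], [0.0, 2.0], 1.0))
```

Larger values of `alpha` push the embedding further from the original identity, which relates to the risk of unintended changes that the reviewer notes when embedding vectors are modified.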
The validation is performed using both quantitative and qualitative evaluation. The thesis evaluates identity preservation, image fidelity, latent space properties, and visual quality. Although some modifications provide only limited improvement and some experiments show negative results, the student discusses these outcomes honestly and uses them to better understand the behavior of the model.
The results are usable mainly in the context of synthetic face dataset generation, data augmentation, identity-preserving generation, and controlled manipulation of facial attributes. The thesis does not introduce a completely new generative paradigm, but it provides a useful reimplementation and experimental extension of an existing research direction.
The practical value lies especially in the implemented pipeline, the comparison with Arc2Face, and the analysis of how face embeddings can be manipulated to influence generated images. The work is also useful because it identifies limitations of the approach, including the sensitivity of evaluation metrics and the risk of unintended changes when modifying embedding vectors.
Evaluation level: more demanding assignment
The assignment can be considered more demanding due to the combination of modern diffusion-based generative models, biometric face embeddings, and controlled face synthesis with identity preservation. The topic required the student to study a rapidly developing area, understand the principles of Stable Diffusion and related extensions, and connect them with face recognition representations.
Grade proposed by reviewer: A
Responsibility: Mgr. et Mgr. Hana Odstrčilová