Bachelor's Thesis

Transformation of Faces from Cartoon to Realistic Appearance

Author of thesis: Martin Mendl

Acad. year: 2025/2026

Supervisor: Ing. Tomáš Goldmann, Ph.D.

Reviewer: Ing. Filip Pleško

Abstract:

This thesis investigates the possibility of reconstructing original photographs from AI-generated images stylized in the Studio Ghibli aesthetic. The work focuses on determining to what extent visual information is preserved after stylization and whether it can be recovered using deep learning methods. A paired dataset of original and stylized images was created and multiple reconstruction pipelines based on U-Net and Swin U-Net architectures were implemented and evaluated. The models were trained using combinations of pixel reconstruction loss and face recognition loss to improve both visual similarity and identity preservation. Experimental results show that meaningful approximations of the original images can be achieved under controlled preprocessing conditions. The best-performing U-Net configuration achieved the highest reconstruction accuracy while maintaining efficient inference speed, whereas the Swin U-Net produced visually smoother outputs at a higher computational cost. The thesis demonstrates that stylized AI-generated images retain structural information that can be partially reconstructed using relatively compact neural networks.
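As a rough illustration of how a pixel reconstruction loss and a face recognition loss might be combined, the sketch below adds a weighted identity term to a mean absolute pixel error. The abstract does not state the exact formulation, so the L1 pixel term, the cosine-distance identity term, the `id_weight` value, and all function names are assumptions made for this example. In practice the embeddings would come from a pretrained face recognition network, which is not shown here.

```python
import math

def pixel_loss(pred, target):
    """Mean absolute error (L1) over two equally long flattened pixel lists."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def identity_loss(emb_pred, emb_target):
    """Cosine distance (1 - cosine similarity) between two face embeddings."""
    dot = sum(a * b for a, b in zip(emb_pred, emb_target))
    norm = math.sqrt(sum(a * a for a in emb_pred)) * math.sqrt(
        sum(b * b for b in emb_target)
    )
    # Guard against division by zero for degenerate all-zero embeddings.
    return 1.0 - dot / max(norm, 1e-12)

def combined_loss(pred, target, emb_pred, emb_target, id_weight=0.1):
    """Pixel term plus weighted identity term (id_weight is a hypothetical value)."""
    return pixel_loss(pred, target) + id_weight * identity_loss(emb_pred, emb_target)
```

A larger `id_weight` pushes training toward identity preservation at the expense of pixel-level fidelity; a suitable balance would have to be found experimentally.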

Keywords:

Artificial Intelligence, Deep Learning, Computer Vision, Image Reconstruction, Neural Networks, U-Net, Swin U-Net, Image Stylization, Studio Ghibli Style, Face Reconstruction, Generative Models, Image Processing

Date of defence

15.06.2026

Result of the defence

Defended (thesis was successfully defended)

Grading

B

Process of defence

The student first presented the results achieved in his thesis. The committee then reviewed the supervisor's evaluation and the reviewer's report. The student subsequently answered the reviewer's questions and further questions from those present. Based on the reviewer's report, the supervisor's evaluation, the presentation, and the student's answers to the questions, the committee decided to grade the thesis B.

Topics for thesis defence

  1. Explain the EER metric. What is it used for, and what happens when the decision threshold is shifted?
  2. How did you evaluate the quality of the outputs?
  3. How generic is your solution?
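For context on the first defence topic, the equal error rate (EER) is the operating point of a verification system at which the false acceptance rate (FAR) equals the false rejection rate (FRR). The sketch below is a minimal illustration that assumes higher scores mean "more likely the same person" and that a pair is accepted when its score is at or above the threshold. The function names and the sample scores are made up for this example and are not taken from the thesis.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) when a pair is accepted if score >= threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)  # impostors accepted
    frr = sum(s < threshold for s in genuine) / len(genuine)     # genuine rejected
    return far, frr

def equal_error_rate(genuine, impostor):
    """Scan observed scores as thresholds; return (EER estimate, threshold)."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Average the two rates in case they never coincide exactly.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]

# Illustrative similarity scores (hypothetical data).
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer, eer_threshold = equal_error_rate(genuine_scores, impostor_scores)
```

Shifting the threshold trades one error for the other: raising it lowers FAR but raises FRR, and lowering it does the opposite. The EER summarizes the system with a single number that does not depend on choosing a particular threshold.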

Language of thesis

English

Faculty

Department

Study programme

Information Technology (BIT)

Composition of Committee

doc. Ing. František Zbořil, Ph.D. (chair)
doc. Mgr. Kamil Malinka, Ph.D. (vice-chair)
Ing. Jiří Matoušek, Ph.D. (member)
Ing. Vladimír Veselý, Ph.D. (member)
doc. Ing. Vítězslav Beran, Ph.D. (member)

Supervisor’s report
Ing. Tomáš Goldmann, Ph.D.

Overall, I evaluate the student's approach very positively. He came to consultations prepared and worked on the thesis steadily. The results of the thesis are, in my opinion, interesting and can serve as a starting point for further work of a similar focus. The thesis does contain a few aspects that could have been done better; however, for a bachelor's thesis I do not consider them significant. Based on the above, I evaluate the student with the grade excellent (A).

Evaluation criteria Verbal classification
Assignment information

This is a slightly more demanding assignment, whose aim is to select suitable generative models, design loss functions, and train and evaluate the model. The thesis was thematically focused on transforming a modified facial image back into a realistic one. I am satisfied with the results of the thesis, and they exceed the expectations I had for a student's first work. The achieved results offer a valuable contribution to further work of a similar kind.

Use of literature

The student selected his own study materials and literature, the choice of which he consulted with me. I consider the selection of sources adequate with respect to the topic of the thesis.

Activity during the work, consultations, communication

The student was very active during the work, came to consultations prepared, and had a good command of the topic. All his questions were relevant and to the point.

Activity during completion

The thesis was completed well in advance, and the student incorporated most of the supervisor's comments into the technical report.

Publication activity, awards

No publication activity or awards are known to the supervisor.

Points proposed by supervisor: 92

Grade proposed by supervisor: A

Reviewer’s report
Ing. Filip Pleško

Overall Evaluation


Final grade: B/C


The thesis addresses a technically demanding and relevant problem involving generative models, image reconstruction, and face recognition. The student fulfilled the assignment in full by creating a paired dataset, implementing a complete reconstruction pipeline, and evaluating the results with respect to visual quality and identity preservation.


The main strengths of the work are the practical implementation, the experimental scope, and the supporting application for testing the trained models and inspecting generated outputs. The results show that stylized facial images preserve some identity-related information, although the quality of the reconstructed images is still limited and would require further improvement for practical use.


A weakness of the thesis is the presentation of the experimental comparison. Some explored approaches and loss configurations, including unsuccessful ones, are discussed in the text but are not summarized systematically in the final comparison. This makes the experimental conclusions slightly harder to verify. There is also a local presentation inconsistency in Figure 5.1 regarding the depicted inputs to the loss function.


Overall, the thesis provides a complete and functional solution to the assigned problem, with strong implementation work and relevant experimental findings, despite limitations in output quality and presentation of some results.

Evaluation criteria Verbal classification Points
Assignment difficulty

Evaluation level: more demanding assignment

The assignment can be considered more demanding. It combines several non-trivial areas, including generative neural networks, image-to-image transformation, dataset preparation, face reconstruction, and evaluation based on face-recognition metrics. The work required not only theoretical understanding of modern generative and reconstruction methods, but also practical implementation of a complete experimental pipeline.

Presentation level of the technical report

The technical report is structured in a logical way and follows a standard progression from theoretical background, through data preparation and model design, to experimental evaluation. The individual chapters are generally understandable and provide sufficient context for the proposed solution. However, there are some local inconsistencies between the text and figures. For example, Figure 5.1 incorrectly indicates the inputs to the loss function, which is inconsistent with the accompanying textual explanation. This does not prevent understanding of the work as a whole, but it reduces the clarity of the presentation in the affected part.

Points: 70

Formal quality of the technical report

The formal quality of the technical report is good. The text is readable, the language level is adequate, and the report is presented in a consistent and well-arranged form. Figures, equations, and references are used appropriately to support the explanation of the proposed methods. I do not have any major reservations regarding the typographical or language aspects of the report.

Points: 100

Implementation output

The work was presented to me by the student. The student created a complete pipeline for generating realistic faces from stylized input images. In addition, he developed a clear application for testing the models and inspecting the generated images. The implementation therefore goes beyond a purely theoretical study and provides a functional experimental framework for evaluating the proposed approaches.

Points: 80

Usability of results

The results are not yet of sufficient quality for direct practical deployment and further improvement of the output quality would be necessary. Nevertheless, the work demonstrates that stylized facial images preserve information related to identity and that it is possible to partially extract or reconstruct the original identity from them. This makes the results interesting both from the technical perspective and from the perspective of understanding how much identity-related information remains present after stylization.

Extent of fulfilment of the assignment requirements

Evaluation level: assignment fulfilled

The assignment requirements were fulfilled in full. The student studied the relevant principles of generative neural networks and image style transformation, created a paired dataset of original and stylized facial images, designed and implemented a pipeline for transforming stylized facial images back to realistic-looking ones, and performed experiments evaluating the results with respect to face recognition and identity preservation.

Extent of the technical report

Evaluation level: exceeds the usual range

The technical report has approximately 92 standard pages, which exceeds the usual expected range.

Use of literature

The thesis uses a sufficient amount of relevant literature related to the topic. The selected sources correspond to the focus of the work. The literature provides an adequate basis for the theoretical and practical parts of the thesis.

Points: 100

Topics for thesis defence:
  1. Explain the EER metric. What is it used for, and what happens when the decision threshold is shifted?
Points proposed by reviewer: 80

Grade proposed by reviewer: B

Responsibility: Mgr. et Mgr. Hana Odstrčilová