Bachelor's Thesis

Image Recognition on Mobile Phone to Facilitate Playing Board Games


Author of thesis: Bc. Denis Fekete

Acad. year: 2025/2026

Supervisor: doc. RNDr. Pavel Smrž, Ph.D.

Abstract:

This thesis proposes an end-to-end object detection workflow for developers creating Android smartphone applications for recognizing game objects.

CVLiBG is an Android library written in Kotlin. Its purpose is to wrap camera control and object detection related functionalities into simple, reusable, and modular components for developers to utilize. CVLiBG is built around CameraX and OpenCV, with an additional module using the Open Neural Network Exchange Runtime for YOLO object detection models. The detection pipeline is exposed through modular detector classes, allowing developers to use the provided solutions or implement their own.
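To illustrate what "modular detector classes" can mean in practice, the following is a minimal, language-agnostic sketch written in Python rather than Kotlin. It is not CVLiBG's actual API: the names `Detector`, `Detection`, `BrightRegionDetector`, and `run_pipeline` are hypothetical, and the toy detector stands in for a real OpenCV or YOLO-based one. The point is only that the pipeline depends on a small interface, so a provided detector and a custom one are interchangeable.

```python
from dataclasses import dataclass
from typing import List, Protocol, Sequence, Tuple

# Assumption: a frame is a grayscale image given as rows of 0-255 pixel values.
Frame = Sequence[Sequence[int]]


@dataclass(frozen=True)
class Detection:
    label: str
    score: float
    box: Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


class Detector(Protocol):
    """Hypothetical interface: anything with a matching detect() is a detector."""

    def detect(self, frame: Frame) -> List[Detection]: ...


class BrightRegionDetector:
    """Toy detector: reports the bounding box of all pixels above a threshold."""

    def __init__(self, threshold: int = 200, label: str = "card") -> None:
        self.threshold = threshold
        self.label = label

    def detect(self, frame: Frame) -> List[Detection]:
        hits = [
            (x, y)
            for y, row in enumerate(frame)
            for x, value in enumerate(row)
            if value >= self.threshold
        ]
        if not hits:
            return []
        xs = [x for x, _ in hits]
        ys = [y for _, y in hits]
        # Stand-in score: the fraction of the frame covered by bright pixels.
        score = len(hits) / sum(len(row) for row in frame)
        return [Detection(self.label, score, (min(xs), min(ys), max(xs), max(ys)))]


def run_pipeline(detector: Detector, frame: Frame, min_score: float = 0.0) -> List[Detection]:
    """Run any detector on a frame and keep detections at or above min_score."""
    return [d for d in detector.detect(frame) if d.score >= min_score]
```

A caller would pass any object with a matching `detect` method to `run_pipeline`, which is the property the abstract describes: the surrounding application code does not change when the detector does.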

The second part of this thesis is Dataset Creator, a desktop application designed to support the creation of image datasets used for training neural networks. This application provides annotation, synthetic data generation, dataset export, and in-application training using pretrained Ultralytics models.
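As a small illustration of the dataset export step, the sketch below converts a pixel-space bounding box into the plain-text label line commonly used for YOLO training data (class index followed by the normalized box center, width, and height). That Dataset Creator exports exactly this format is an assumption, and the function name `to_yolo_line` is hypothetical.

```python
from typing import Tuple


def to_yolo_line(
    class_id: int,
    box: Tuple[float, float, float, float],
    image_width: int,
    image_height: int,
) -> str:
    """Format one annotation as 'class x_center y_center width height'.

    The box is (left, top, right, bottom) in pixels; all four output
    values are normalized to the 0-1 range by the image dimensions.
    """
    left, top, right, bottom = box
    x_center = (left + right) / 2 / image_width
    y_center = (top + bottom) / 2 / image_height
    width = (right - left) / image_width
    height = (bottom - top) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

Normalizing by the image size keeps the labels valid when images are resized, which matters when synthetic and real images of different resolutions are mixed in one dataset.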

The proposed workflow was demonstrated on a proof-of-concept application for the Bang! card game. It uses a custom object detection model prepared with Dataset Creator and integrated into the Android application using CVLiBG. This demonstrates how the proposed workflow abstracts the implementation of camera-based object detection and allows developers to focus on the user interface and object detection results.

Keywords:

deep neural networks, object detection, android, kotlin, yolo, data augmentation

Date of defence

16.06.2026

Result of the defence

Defended (thesis was successfully defended)

Grading

B

Process of defence

The student first presented the results achieved in the thesis. The committee then reviewed the supervisor's evaluation and the reviewer's report. The student subsequently answered questions from those present. Based on the reviewer's report, the supervisor's evaluation, the presentation, and the student's answers to the questions asked, the committee decided to grade the thesis B.

Topics for thesis defence

Why should I want to use your application?
What information does your application provide?
How do you handle inference on the mobile device?

Language of thesis

English

Faculty

Faculty of Information Technology

Department

Department of Computer Graphics and Multimedia

Study programme

Information Technology (BIT)

Composition of Committee

doc. Ing. Lukáš Burget, Ph.D. (chair)
doc. Mgr. Adam Rogalewicz, Ph.D. (vice-chair)
Ing. Libor Polčák, Ph.D. (member)
Ing. Michal Hradiš, Ph.D. (member)
Ing. Martin Žádník, Ph.D. (member)

Supervisor’s report
doc. RNDr. Pavel Smrž, Ph.D.

Overall, I rate the student's activity and the results of the work, in the form of the implemented output, as good: he managed to study the fundamentals of building computer-vision-based mobile applications on the Android platform, collect and annotate data, incorporate the creation of synthetic datasets, and evaluate the recognition results in relevant tests.

Evaluation criteria	Verbal classification
Information about assignment	The assignment required becoming familiar with the principles of image recognition on a mobile phone and with general development on the chosen platform, and implementing a system to support a specific game. The student chose the rather simpler path of recognizing the identity of cards in the game, where the range of card types and important elements is not large. On the other hand, he managed to gain a respectable grasp of the fundamentals of computer vision and to implement and evaluate a practical system that is usable in real-world conditions.
Work with literature	The student worked with a sufficient number of relevant study materials, which he used appropriately to build a functional system. Although the sources used are mostly not scholarly scientific articles and do not provide a general view of the possibilities of mobile tools for supporting the play and scoring of board games, overall I rate the work with literature as good.
Activity during solution, consultations, communication	Denis Fekete was active during both semesters, met the agreed deadlines, reported regularly on the progress of the work, and came to consultations prepared.
Activity during completion	The thesis was completed somewhat ahead of schedule; I had the opportunity to review preliminary versions of the technical report, and my comments were taken into account.
Publication activity, awards	-

Points proposed by supervisor: 75

Grade proposed by supervisor: C

Reviewer’s report
Maksim Aparovich

The engineering side is good and validated: three connected artifacts and modest benchmarking. The student also demonstrated in person a clear understanding of the underlying concepts (transfer learning, image representation in computer vision, model-size trade-offs), even where the report itself was less precise. On the other hand, the work has real weaknesses: the user study is formally incomplete, the needs analysis and related-work review are shallow, the English contains typos, and there is a duplicate citation. Weighing the above-standard difficulty, the solid practical results, and the honestly stated limitations, the work is good overall, with its main shortcomings in the written report and the evaluation rather than in the engineering or the student's grasp of the topic.

Evaluation criteria	Verbal classification	Points
The difficulty of the assignment	Evaluation level: more difficult assignment. The assignment is above standard difficulty, requiring three separate software artifacts plus a mandatory real-world user study. For example, beyond the mobile library, the student had to "Develop a tool for generating synthetic data from real-world data" and build a working demo application for Bang!, covering an end-to-end pipeline from dataset preparation to deployment.
Presentation level of the technical report	The structure is logical and easy to follow, with the theory chapter walking through the pipeline step by step (preprocessing -> inference -> postprocessing). However, the related-work coverage is too brief and superficial: chapter 2 covers only a handful of applications and cites no books, papers, or CV/dev literature. A few theoretical statements in the text are imprecise and some factual details are misaligned across the thesis (e.g. the YOLO model versions used in the experiments), though the in-person demo confirmed the student understands the underlying concepts well; the issue is what made it into the report, not the student's knowledge.	75
Formal preparation of a technical report	Figures and typesetting are clean, but the English text contains a noticeable number of typos and awkward phrasings. Examples include "resulting it faster training" and the duplicated word in "Android's Android's Neural Networks API".	75
Realisation output	The software output is strong: three working artifacts validated through benchmarks and a meaningful experiment series, with third-party licenses handled correctly (PySide LGPL, Ultralytics AGPL, OpenCV). Claims are backed by measurements when comparing ONNX to OpenCV. Some conclusions are over-stated on thin evidence, though - YOLO 8 is called the most accurate despite only "~1% more accurate in mAP50 and mAP50-95" differences, and the "NNAPI was 4x slower than CPU" claim rests on a single device.	90
Usability of results	The work extends an existing direction rather than breaking new ground, but delivers a reusable workflow. It builds on a prior thesis on the same topic, so the contribution is incremental, yet the combination of library plus dataset generator has real practical value. The results are applicable in practice.
The extent to which the requirements of the assignment have been met	Evaluation level: assignment fulfilled. Most of the assignment is met, but two points are weak: the player-needs analysis (point 2) is shallow and the user study (point 5) was reduced to a small proof-of-concept rather than a proper evaluation. The author admits this directly: "Due to time constraints, the evaluation was performed as a small-scale proof-of-concept evaluation" and "does not provide statistically significant evidence about the improvement of the gameplay experience." Points 1, 3, and 4 are solidly delivered.
Extent of the technical report	Evaluation level: meets the minimum requirements only. The report sits at the lower end of the acceptable range, with roughly 44 pages of body text against a 40-page minimum and a usual range of 60-80. The content is mostly information-rich, but some sections are disproportionately thin: the entire "Previous works" chapter is only about one page. The length is sufficient but modest for a bachelor's thesis.
Work with literature	Source selection is relevant and includes the key object-detection papers, but the bibliography mixes many blog/tutorial links with academic sources and contains errors. Entries [26] and [27] are an exact duplicate of the same source.	75

Points proposed by reviewer: 80

Grade proposed by reviewer: B

Responsibility: Mgr. et Mgr. Hana Odstrčilová
