Master's Thesis

A Service for Automatic Test Result Summarization in Testing Farm

Author of thesis: Ing. Natália Bubáková

Acad. year: 2025/2026

Abstract:

This thesis proposes an AI-based service to improve the interpretation of completed test requests within the Testing Farm ecosystem. The project aims to reduce the manual effort needed to understand Testing Farm failures by generating concise, human-readable summaries from test metadata, structured result files, execution logs, and external operational context. To support this goal, the work examines Testing Farm result artifacts, large language model processing, agent-based workflows, and tool-assisted analysis. The core of the service is a hierarchical agent workflow implemented in Python using the BeeAI Framework and Vertex AI, separating request-level, suite-level, test-level, and outage-related analysis. Particular emphasis is placed on infrastructure and setup-related failures, which are difficult to interpret from raw artifacts alone and can be correlated with external sources such as Testing Farm status pages and Jira records. The service integrates with Testing Farm through enriched sidecar artifacts and presents generated summaries directly in the Oculus web interface. The implemented prototype is evaluated on real Testing Farm data with respect to summary accuracy, practical usefulness, and performance, and its production potential is demonstrated by its adoption as a Testing Farm production prototype for active development.
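As a rough illustration of the hierarchical idea described in the abstract (test-level findings feeding suite-level summaries, which feed a request-level summary that is also checked against known outages), a minimal Python sketch might look as follows. It is not the thesis implementation and uses neither the BeeAI Framework nor Vertex AI. All class and function names are hypothetical, and the "summarizer" is a plain function standing in for an LLM call.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Callable, List

# NOTE: every name in this sketch is hypothetical and chosen for illustration.


@dataclass
class CaseResult:
    name: str
    outcome: str           # assumed values: "pass", "fail", "error"
    log_excerpt: str = ""


@dataclass
class Suite:
    name: str
    tests: List[CaseResult] = field(default_factory=list)


@dataclass
class Outage:
    title: str
    start: datetime
    end: datetime


# A summarizer maps raw log text to a short description.
# In a real service this would be a model call; here it is any callable.
Summarizer = Callable[[str], str]


def first_line(text: str) -> str:
    """Placeholder 'model': return the first non-empty log line."""
    for line in text.splitlines():
        if line.strip():
            return line.strip()
    return "no log output"


def summarize_test(test: CaseResult, summarize: Summarizer) -> str:
    """Test level: condense one test's log excerpt."""
    return f"{test.name}: {summarize(test.log_excerpt)}"


def summarize_suite(suite: Suite, summarize: Summarizer) -> str:
    """Suite level: aggregate only the non-passing tests."""
    failed = [t for t in suite.tests if t.outcome != "pass"]
    if not failed:
        return f"{suite.name}: all {len(suite.tests)} tests passed"
    details = "; ".join(summarize_test(t, summarize) for t in failed)
    return f"{suite.name}: {len(failed)}/{len(suite.tests)} failed ({details})"


def matching_outages(run_start: datetime, run_end: datetime,
                     outages: List[Outage]) -> List[Outage]:
    """Outage level: keep outages whose time window overlaps the run."""
    return [o for o in outages if o.start <= run_end and run_start <= o.end]


def summarize_request(suites: List[Suite], run_start: datetime,
                      run_end: datetime, outages: List[Outage],
                      summarize: Summarizer) -> str:
    """Request level: combine suite summaries with outage hints."""
    lines = [summarize_suite(s, summarize) for s in suites]
    for outage in matching_outages(run_start, run_end, outages):
        lines.append(f"possible outage overlap: {outage.title}")
    return "\n".join(lines)
```

Passing the summarizer in as a parameter keeps the aggregation logic testable without a model; a real deployment would presumably replace `first_line` with an agent or LLM call.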

Keywords:

Testing Farm, BeeAI Framework, LLM, Vertex AI, tmt, large language models, test result summarization, agent workflow, test failure detection, Jira, status pages

Date of defence

25.06.2026

Result of the defence

Defended (thesis was successfully defended)

Grading

A

Process of defence

The student first presented the results she achieved in her thesis. The committee then reviewed the supervisor's evaluation and the reviewer's report. The student subsequently answered the reviewer's questions and further questions from those present, e.g. regarding practical examples of deploying the implementation output or the font chosen in the captions of some figures. Based on the reviewer's report, the supervisor's evaluation, the presentation, and the student's answers to the questions, the committee decided to grade the thesis A (excellent).

Topics for thesis defence

Chapter 4 contains a large set of requirements divided by their relative priority (must, should, could). How were the priorities established (this was not entirely clear from the text)?
When correlating the findings with known outages in Jira and Testing Farm status pages, did you consider using Model Context Protocol servers? Do you think they would bring improved results?

Language of thesis

English

Faculty

Faculty of Information Technology

Department

Department of Intelligent Systems

Study programme

Information Technology and Artificial Intelligence (MITAI)

Specialization

Intelligent Systems (NISY)

Composition of Committee

doc. Ing. František Zbořil, CSc. (chair)
doc. Ing. Vladimír Janoušek, Ph.D. (vice-chair)
Ing. Martin Hrubý, Ph.D. (member)
Ing. Jaroslav Rozman, Ph.D. (member)
Dr. Ing. Petr Peringer (member)
Ing. Tomáš Goldmann, Ph.D. (member)

Supervisor’s report
Ing. Aleš Smrčka, Ph.D.

Natália Bubáková, a very thorough diploma student, worked independently, consulted on the important parts, and was able to deal with technical obstacles. She demonstrated engineering skills in the practical design of a service that can be integrated into the real Testing Farm environment.

Evaluation criteria	Verbal classification
Assignment information	The thesis assignment was issued in cooperation with Red Hat and responded to a practical need to improve the interpretation of results in the Testing Farm system. The difficulty of the assignment lay mainly in combining automated testing with the processing of large, hierarchically structured data using an agentic workflow. The student had to design a procedure for selectively obtaining relevant information from dynamic sources such as logs, status pages, and Jira records. I am satisfied with the result: a functional prototype was created and verified on real Testing Farm data, and it has clear practical potential for further development and deployment.
Activity during completion	The technical report was completed sufficiently in advance.
Publications, awards
Work with literature	The student obtained the study sources herself.
Activity during the work, consultations, communication	Activity varied; nevertheless, the student communicated and reached the set milestones through repeated bursts of work.

Points proposed by supervisor: 90

Grade proposed by supervisor: A

Reviewer’s report
Ing. Viktor Malík, Ph.D.

The thesis addresses a very relevant problem: large-scale testing pipelines often produce too many results, and it is extremely hard to find the root cause of a test failure in a reasonably short time. The proposed solution is an agentic AI system that uses Large Language Models to analyze the test results and artifacts and to provide a short, comprehensive summary. The smart design of the solution, which is in line with the workflow of the Testing Farm project, makes it possible to overcome common problems of LLMs and provides very useful information for users of Testing Farm. This is demonstrated by the conducted experiments as well as by great interest from the Testing Farm community. With respect to these attributes, I recommend accepting the thesis with the grade A (excellent).

Evaluation criteria	Verbal classification	Points
Extent of fulfilment of the assignment requirements	Evaluation level: assignment fulfilled
Extent of the technical report	Evaluation level: exceeds the usual range. The thesis text is slightly longer than the usual extent. This is mostly caused by the large scope of the final solution as well as by the student trying to explain all of the relevant concepts and solution parts in great detail. In some cases, this makes the text a bit tedious (in particular, some of the implementation details in Chapter 6 could have been omitted). On the other hand, in most places it leads to an easier understanding of the solution, and the text does not contain any irrelevant information.
Presentation level of the technical report	The thesis is very well organized and easy to read. All of the described concepts are sufficiently explained before being referred to. The description of the proposed solution is logically divided into analysis, design, and implementation parts, with each part iteratively diving into more detail. This makes it easy for the reader to understand the solution and all of the design choices made while working on the thesis. My only minor reservation is that the text is sometimes too detailed (see the previous point).	95
Formal aspects of the technical report	The typography and language of the thesis are at a very high level. The text is written in excellent English, and I did not find a single mistake or typo while reading it. Similarly, the typography of the text is great and contributes significantly to the overall readability.	100
Work with literature	Due to the implementation nature of the thesis, most of the literary sources come from the Testing Farm documentation, which is expected for this kind of thesis. I also appreciate that the student explored and drew inspiration from other existing solutions using agentic AI for test and log summarization, mostly within Red Hat.	85
Implementation output	The final software product is the strongest part of the thesis. The student has proposed a complex agentic AI system that is well aware of the internal structure of the subject project (Testing Farm) and of its outputs (test results, logs, artifacts), which are analyzed by the LLMs. This allows the solution to overcome the traditional problems of LLMs: excessive verbosity, instability of results, hallucinations, and overly long execution. Through thorough experimentation, the student succeeded in eliminating most of the shortcomings of existing LLM-based projects.	95
Usability of results	At the time of writing this assessment, the proposed solution has been deployed in a testing mode on the public instance of Testing Farm. In addition, the Testing Farm maintainers have already implemented a few improvements and actively participated in the deployment of the service, which demonstrates great interest from the community and the usefulness of the solution in practice.
Difficulty of the assignment	Evaluation level: more difficult assignment. I consider the assignment slightly more difficult than an average master's thesis assignment, mainly because (1) it required studying a rather complex project (Testing Farm) and (2) the solution required proposing an AI agent orchestration system. Despite its huge popularity these days, agentic AI is still largely unexplored and undocumented, and its landscape changes on a weekly basis.

Points proposed by reviewer: 95

Grade proposed by reviewer: A

Responsibility: Mgr. et Mgr. Hana Odstrčilová