Master's Thesis

Optimal control methods in machine learning


Author of thesis: Foster Agyei

Acad. year: 2025/2026

Supervisor: doc. Mgr. et Mgr. Aleš Návrat, Ph.D.

Abstract:

This thesis explores the basic mathematical relationship between the expressivity of deep learning architectures and geometric control theory. Treating Deep Residual Neural Networks (ResNets) as continuous-time dynamical systems described by Neural Ordinary Differential Equations (Neural ODEs), we show that the algebraic properties of the underlying vector fields and the Lie Algebra Rank Condition (LARC) determine a network's ability to approximate complex coordinate transformations. Reinterpreting individual network layers as discrete snapshots of a continuous velocity flow changes the training objective from learning static, disconnected weights to identifying a smooth, time-dependent control function.
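For orientation, the correspondence described above can be written out as follows. This is a sketch in assumed notation rather than the thesis's own formulas: the state dimension n, the number of fields k, the number of layers N, the step size h, and the time horizon [0, 1] are illustrative choices, while F_i and u follow the usage quoted in the reviewer's report.

```latex
% Control-linear neural ODE on R^n (n, k, N, h and the horizon [0, 1] are assumptions)
\dot{x}(t) = \sum_{i=1}^{k} u_i(t)\, F_i\bigl(x(t)\bigr), \qquad x(0) = x_0, \quad t \in [0, 1].

% Explicit Euler step with h = 1/N: one residual layer per time step
x_{j+1} = x_j + h \sum_{i=1}^{k} u_i^{\,j}\, F_i(x_j), \qquad j = 0, \dots, N-1.

% Lie Algebra Rank Condition (LARC): iterated Lie brackets of the fields span the tangent space
\operatorname{Lie}\{F_1, \dots, F_k\}(x) = \mathbb{R}^n \quad \text{for every } x \in \mathbb{R}^n.
```

When the LARC holds on a connected domain, the Chow-Rashevskii theorem guarantees that any two points can be joined by a trajectory of the system. This is the single-point version of the controllability that the abstract links to expressivity.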

Formulating and analyzing this training process as a regularized continuous-time optimal control problem is the main goal of this work. Using co-state trajectories and a backtracking line search on the step size, we implement an iterative scheme based on the discrete Pontryagin Maximum Principle (PMP) that optimizes the controls efficiently without relying on conventional discrete backpropagation. With this setup, we study how extreme spatial curvature and boundary deformations affect various geometric designs of controlled vector fields. Finally, we investigate the critical role of an L^2 control regularization penalty in limiting high-energy controls, so that the resulting flows remain smooth and stable and preserve valid coordinate diffeomorphisms without spatial tearing or overfitting.
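The kind of iteration described above can be illustrated with a minimal sketch. It is not Algorithm 1 of the thesis: it is plain gradient descent on the controls, with the gradient obtained from a backward co-state recursion and the step size chosen by Armijo backtracking. The affine fields `A`, `b`, the mean-squared terminal cost, and the function names `forward`, `cost`, `gradient`, and `train` are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical affine control fields F_i(x) = A_i x + b_i on R^2 (illustrative only).
A = np.array([
    [[0.0, 0.0], [0.0, 0.0]],    # translation along x1
    [[0.0, 0.0], [0.0, 0.0]],    # translation along x2
    [[0.0, -1.0], [1.0, 0.0]],   # rotation
    [[1.0, 0.0], [0.0, 1.0]],    # dilation
])
b = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])


def forward(u, x0, h):
    """Explicit Euler flow x_{k+1} = x_k + h * sum_i u[k, i] * F_i(x_k)."""
    xs = [x0]
    for uk in u:
        Ak = np.tensordot(uk, A, axes=1)   # sum_i u_i A_i, shape (2, 2)
        bk = uk @ b                        # sum_i u_i b_i, shape (2,)
        xs.append(xs[-1] + h * (xs[-1] @ Ak.T + bk))
    return xs


def cost(u, x0, y, h, beta):
    """Mean squared terminal error plus the discretized L^2 control penalty."""
    xN = forward(u, x0, h)[-1]
    return np.mean(np.sum((xN - y) ** 2, axis=1)) + 0.5 * beta * h * np.sum(u ** 2)


def gradient(u, x0, y, h, beta):
    """Gradient of the cost with respect to u via a backward co-state recursion."""
    xs = forward(u, x0, h)
    p = 2.0 * (xs[-1] - y) / len(x0)       # terminal co-state p_N
    g = np.zeros_like(u)
    for k in reversed(range(len(u))):
        # Fk[m, i, :] = F_i evaluated at the m-th point of layer k
        Fk = np.einsum('ijl,ml->mij', A, xs[k]) + b
        g[k] = h * np.einsum('mj,mij->i', p, Fk) + beta * h * u[k]
        Ak = np.tensordot(u[k], A, axes=1)
        p = p + h * (p @ Ak)               # p_k = (I + h * A_k)^T p_{k+1}
    return g


def train(x0, y, n_layers=10, beta=1e-3, iters=200, c=1e-4):
    """Gradient descent on the controls with Armijo backtracking."""
    h = 1.0 / n_layers
    u = np.zeros((n_layers, len(A)))
    for _ in range(iters):
        g = gradient(u, x0, y, h, beta)
        J, step = cost(u, x0, y, h, beta), 1.0
        # Halve the step until the sufficient-decrease condition holds.
        while cost(u - step * g, x0, y, h, beta) > J - c * step * np.sum(g ** 2):
            step *= 0.5
            if step < 1e-12:
                return u
        u = u - step * g
    return u


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.normal(size=(50, 2))
    theta = 0.5
    R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
    y = x0 @ R.T + np.array([0.3, -0.2])   # target: rotated and shifted points
    u = train(x0, y)
    print("initial cost:", cost(np.zeros_like(u), x0, y, 0.1, 1e-3))
    print("final cost:  ", cost(u, x0, y, 0.1, 1e-3))
```

In this sketch the penalty weight `beta` plays the role of the L^2 regularization mentioned in the abstract: larger values favour lower-energy controls at the price of a larger terminal error.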

Keywords:

Control-Linear Systems, Neural ODEs, Residual Neural Networks (ResNets), Geometric Control Theory, Ensemble Controllability, Lie Brackets, Chow-Rashevskii Theorem, Pontryagin Maximum Principle (PMP), Universal Approximation, Diffeomorphisms.

Date of defence

17.06.2026

Result of the defence

Defended (thesis was successfully defended)

Grading

E

Process of defence

The student presented his Master's thesis. He then presented Algorithm 1 from the thesis in response to the reviewer's comment about how well he understands the work (and the suggestion that the thesis was largely generated by AI). Finally, he answered questions from Prof. Protasov regarding particular expressions in the thesis presentation.

Language of thesis

English

Faculty

Faculty of Mechanical Engineering

Department

Institute of Mathematics

Study programme

Applied and Interdisciplinary Mathematics (N-AIM-A)

Composition of Committee

prof. RNDr. Josef Šlapal, CSc. (chair)
doc. Ing. Luděk Nechvátal, Ph.D. (vice-chair)
doc. Ing. Petr Tomášek, Ph.D. (member)
prof. Mgr. Pavel Řehák, Ph.D. (member)
doc. Ing. Tomáš Kisela, Ph.D. (member)
Prof. Vladimir Protasov (member)

Supervisor’s report
doc. Mgr. et Mgr. Aleš Návrat, Ph.D.

This thesis demonstrates the application of methods from geometric control theory to deep learning. Residual neural networks are interpreted here as a discretization of the flow of a control-linear neural ODE. In this context, the network's expressivity corresponds directly to the system's controllability. The training process is formulated as an optimal control problem, which is solved using an iterative numerical scheme based on the Pontryagin maximum principle.

The student approached the thesis responsibly and consulted regularly, but his overall progress was slow. As a result, in the final phase, there was not enough time left for more in-depth testing of theoretical knowledge on more examples or, in particular, for final text corrections. The resulting text exhibits significant shortcomings and shows signs of uncritical use of generative AI. On the other hand, it should be acknowledged that the student successfully mastered the fundamentals of geometric control theory. He understood the source technical article to a sufficient level, enabling him to independently implement and modify the presented algorithm.
In light of these facts, I propose grading the work with a D (satisfactory).

Evaluation criteria	Grade
Fulfilment of the assignment's requirements and goals	D
Approach and scope of the solution, adequacy of the methods used	C
Own contribution and originality	E
Ability to interpret the results achieved and draw conclusions from them	D
Applicability of the results in practice or theory	B
Logical structure of the thesis and formal requirements	E
Graphic and stylistic presentation, and spelling	D
Use of literature, including citations	D
Student's independence in working on the topic	D

Grade proposed by supervisor: D

Reviewer’s report
doc. Mgr. Petr Vašík, Ph.D.

Although the topic of the thesis is very attractive, topical, and appropriate, I have to say that the thesis as a whole is below standard. It is difficult to read due to numerous typos, inaccuracies, inconsistencies, and mistakes. For instance, the symbol F in (1.9) is a function. Without warning, it becomes a vector field in (1.11), whose description talks about a particle's motion. While F is supposed to be the set of coefficients of a control u, at the top of page 44 u becomes an argument of F. I hope that this is a typo.

Apart from that, the thesis contains a considerable number of typographical, grammatical, and stylistic errors that negatively affect readability and reduce the overall professional quality.

In Section 1.4, the sentence "where \Phi is the terminal cost and L L is the Lagrangian" contains a duplicated symbol and incorrect spacing. Several figure and theorem references are inconsistent or malformed: references such as "equation(2.4)", "example as in ??", and "by theorem ??" (Section 5.1) appear in the text without proper formatting or resolution.

References lack a unified style. Bibliographic entries are not consistently formatted with respect to author names, journal titles, capitalisation, publication details, and punctuation. Some citations appear as numerical references, while others are referred to only generically in the text. Placeholder citations such as "[?, ?]" remain in the manuscript and should have been resolved before submission.

What worries me most is a strong feeling that the whole text is mostly AI-generated, including the code samples. Expressions such as "The Lie Bracket contains the solution", "gold standard", "alphabet of movements", "surgical tools", and similar metaphorical descriptions are atypical for a mathematical thesis. Some paragraphs use highly promotional formulations ("crucial finding", "precise foundation", "automatic ensemble controllability") that are not sufficiently supported by rigorous argumentation. The MATLAB code included in the appendix appears to have been generated or heavily assisted by an AI tool. This cannot be concluded with certainty from the text alone, but the highly template-like structure, overly explanatory comments, repeated code blocks with minor systematic changes, and inconsistencies between comments, algorithms, and the surrounding thesis text strongly suggest non-original or mechanically generated code.

Finally, the conclusions of the thesis are rather strong in contrast with the number of experiments provided. The sentence "our numerical studies verified that they encounter severe architectural constraints when resolving localized high-curvature distortions" should be supported by more arguments. Furthermore, the sentences "The 8-parameter GH model may autonomously bend certain, narrow pockets of space without interfering with nearby grid configurations because it uses localized kernel curves instead of global power steps. Both training and testing tracking errors are driven past 10^−2, almost an order of magnitude below the baseline, thanks to the optimization’s algebraic flexibility, which breaks through the polynomial performance floor." seem to be fully AI-generated and have no meaning.

My overall feeling is that the student must prove that he understands the text and that he is able to support his conclusions to defend the thesis. I suggest grade E.

Evaluation criteria	Grade
Fulfilment of the assignment's requirements and goals	C
Approach and scope of the solution, adequacy of the methods used	D
Own contribution and originality	E
Ability to interpret the results achieved and draw conclusions from them	E
Applicability of the results in practice or theory	B
Logical structure of the thesis and formal requirements	F
Graphic and stylistic presentation, and spelling	F
Use of literature, including citations	F

Topics for thesis defence:

Explain Algorithm 1, lines 2-21.
Explain and justify your conclusion: "Moreover, the testing error profiles closely resemble the empirical training loss in all configurations. Our theoretical generalization estimate curves are validated by this consistent tracking, which also demonstrates that the 𝐿^2 control regularization penalty \beta effectively prevents overfitting by preserving a steady, smooth continuous-time flow."

Grade proposed by reviewer: E

Responsibility: Mgr. et Mgr. Hana Odstrčilová
