Publication result detail

Search and Explore: Symbiotic Policy Synthesis in POMDPs

ANDRIUSHCHENKO, R.; BORK, A.; ČEŠKA, M.; JUNGES, S.; KATOEN, J.; MACÁK, F.

Type

Paper in proceedings (conference paper)

Original Abstract

This paper marries two state-of-the-art controller synthesis methods for partially observable Markov decision processes (POMDPs), a prominent model in sequential decision making under uncertainty. A central issue is to find a POMDP controller that decides solely on the basis of the observations seen so far and achieves a total expected reward objective. As finding optimal controllers is undecidable, we concentrate on synthesising good finite-state controllers (FSCs). We do so by tightly integrating two modern, orthogonal methods for POMDP controller synthesis: a belief-based approach and an inductive approach. The former obtains an FSC from a finite fragment of the so-called belief MDP, an MDP that keeps track of the probabilities of equally observable POMDP states. The latter is an inductive search technique over a set of FSCs, e.g., controllers with a fixed memory size. The key result of this paper is a symbiotic anytime algorithm that tightly integrates both approaches such that each profits from the controllers constructed by the other. Experimental results indicate a substantial improvement in the value of the controllers while significantly reducing the synthesis time and memory footprint.
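
As an illustration of the belief MDP mentioned in the abstract, the following is a minimal sketch of a single belief update, assuming a POMDP in which every state has exactly one observation. The toy model and the names `belief_update`, `TRANSITIONS` and `OBS_OF` are hypothetical and are not taken from the paper or its tooling.

```python
from collections import defaultdict

# Hypothetical toy POMDP: transition probabilities per state and action.
TRANSITIONS = {
    "s0": {"go": {"s1": 0.25, "s2": 0.25, "s3": 0.5}},
    "s1": {"go": {"s3": 1.0}},
    "s2": {"go": {"s2": 1.0}},
}

# Each state emits exactly one observation (an assumption of this sketch).
OBS_OF = {"s0": "start", "s1": "hall", "s2": "hall", "s3": "goal"}


def belief_update(belief, action, observation, transitions, obs_of):
    """Return the successor belief after taking `action` and seeing `observation`.

    A belief maps states to probabilities; the result only contains
    states that emit `observation`.
    """
    unnormalised = defaultdict(float)
    for state, prob in belief.items():
        for succ, t_prob in transitions[state][action].items():
            if obs_of[succ] == observation:
                unnormalised[succ] += prob * t_prob
    total = sum(unnormalised.values())
    if total == 0.0:
        raise ValueError("observation is impossible under this belief and action")
    # Normalise so that the successor belief is a probability distribution.
    return {s: p / total for s, p in unnormalised.items()}


if __name__ == "__main__":
    b0 = {"s0": 1.0}
    b1 = belief_update(b0, "go", "hall", TRANSITIONS, OBS_OF)
    print(b1)
```

Beliefs reachable by repeated updates of this kind form the states of the belief MDP. The belief-based method described in the abstract works on a finite fragment of that state space.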

Keywords

partially observable Markov decision processes, finite-state controllers, beliefs, inductive synthesis

RIV year

2024

Released

02.08.2023

Publisher

Springer Verlag

Location

Cham

ISBN

978-3-031-37708-2

Book

Computer Aided Verification

Edition

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume

13966

Pages from

113

Pages to

135

Pages count

23

BibTex

@inproceedings{BUT185190,
  author="ANDRIUSHCHENKO, R. and BORK, A. and ČEŠKA, M. and JUNGES, S. and KATOEN, J. and MACÁK, F.",
  title="Search and Explore: Symbiotic Policy Synthesis in POMDPs",
  booktitle="Computer Aided Verification",
  year="2023",
  series="Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
  volume="13966",
  pages="113--135",
  publisher="Springer Verlag",
  address="Cham",
  doi="10.1007/978-3-031-37709-9_6",
  isbn="978-3-031-37708-2"
}