An Observation-based Optimization Algorithm for POMDP and Its Simulation
-
Abstract
The problem of performance optimization for partially observable Markov decision processes (POMDPs) is addressed based on the sensitivity analysis of Markov decision processes (MDPs). The sensitivity analysis formulas are given. Based on these results, two observation-based optimization algorithms, i.e., a policy-gradient algorithm and a policy-iteration algorithm, are developed for POMDPs. To verify these algorithms, a simulation based on an admission control problem is also presented.