The optimal control of partially observable Markov processes

1971 · Munich Personal RePEc Archive (Ludwig Maximilian University of Munich) · 455 citations

Abstract

The report studies the control of a finite-state, discrete-time Markov process characterized by incomplete state observation. The process is viewed through a set of outputs such that the probability of observing a given output depends on the current state of the Markov process. The observed stochastic process consisting of the time sequence of outputs generated by the imbedded Markov process is termed a partially observable Markov process. A finite number of alternative parameter sets for the partially observable process are available. Associated with each alternative is a set of costs for making transitions between the states of the Markov process and for producing the various outputs. At each time period an observer must select a control alternative to minimize the total expected operating costs for the process. The thesis consists of two major sections. In the first section the state of the partially observable Markov process is proved to be the vector of state occupancy probabilities for the Markov process. Using this concept of state, an algorithm is developed to solve for the optimal control as a function of a finite operating time. The algorithm produces an exact solution for the optimal control over the complete state space of a general partially observable Markov process, and is applicable to both discounted and nondiscounted problems. The second section deals with the case of infinite operating time, and is subdivided into the cases of discounted and nondiscounted costs. (Author)
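The abstract's central claim is that the state of the partially observable process is the vector of state occupancy probabilities, which the observer revises after each output via Bayes' rule. A minimal sketch of that belief update, using an assumed two-state example (the transition and output probabilities below are illustrative, not taken from the thesis):

```python
def update_belief(belief, T, O, obs):
    """Bayes update of the occupancy-probability vector.

    belief[s]  : prior probability that the process is in state s
    T[s][s2]   : transition probability s -> s2 under the chosen alternative
    O[s2][obs] : probability of producing output `obs` while in state s2
    """
    n = len(belief)
    # Predict: push the occupancy probabilities through the Markov transitions.
    predicted = [sum(belief[s] * T[s][s2] for s in range(n)) for s2 in range(n)]
    # Correct: weight each state by the likelihood of the observed output.
    unnorm = [predicted[s2] * O[s2][obs] for s2 in range(n)]
    z = sum(unnorm)  # probability of the output; assumed positive here
    return [u / z for u in unnorm]

# Illustrative (assumed) parameters for one control alternative:
T = [[0.9, 0.1],
     [0.2, 0.8]]
O = [[0.8, 0.2],   # state 0 mostly emits output 0
     [0.3, 0.7]]   # state 1 mostly emits output 1
b = update_belief([0.5, 0.5], T, O, obs=1)  # seeing output 1 shifts mass to state 1
```

The updated vector `b` is itself the state on which the thesis's control algorithm operates: the optimal control alternative is chosen as a function of this probability vector rather than of the unobservable underlying state.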

Keywords

Observable, Markov process, Control (management), Computer science, Markov chain, Mathematics, Econometrics, Statistics, Artificial intelligence, Physics

Publication Info

Year: 1971
Type: article
Citations: 455
Access: Closed

Citation Metrics

455 citations (OpenAlex)
Cite This

Edward J. Sondik (1971). The optimal control of partially observable Markov processes. Munich Personal RePEc Archive (Ludwig Maximilian University of Munich).