The optimal control of partially observable Markov processes

1971 · Munich Personal RePEc Archive (Ludwig Maximilian University of Munich) · 455 citations

Abstract

The report studies the control of a finite-state, discrete-time Markov process characterized by incomplete state observation. The process is viewed through a set of outputs such that the probability of observing a given output depends on the current state of the Markov process. The observed stochastic process consisting of the time sequence of outputs generated by the imbedded Markov process is termed a partially observable Markov process. A finite number of alternative parameter sets for the partially observable process are available. Associated with each alternative is a set of costs for making transitions between the states of the Markov process and for producing the various outputs. At each time period an observer must select a control alternative to minimize the total expected operating costs for the process. The thesis consists of two major sections. In the first section the state of the partially observable Markov process is proved to be the vector of state occupancy probabilities for the Markov process. Using this concept of state, an algorithm is developed to solve for the optimal control as a function of a finite operating time. The algorithm produces an exact solution for the optimal control over the complete state space of a general partially observable Markov process, and is applicable to both discounted and nondiscounted problems. The second section deals with the case of infinite operating time, and is subdivided into the cases of discounted and nondiscounted costs. (Author)
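The abstract's central claim is that the state of the partially observable process is the vector of state occupancy probabilities, which the observer revises after each output via Bayes' rule. A minimal sketch of that belief update, using an assumed two-state example (the transition and output probabilities below are illustrative, not taken from the thesis):

```python
def update_belief(belief, T, O, obs):
    """Bayes update of the occupancy-probability vector.

    belief[s]  : prior probability that the process is in state s
    T[s][s2]   : transition probability s -> s2 under the chosen alternative
    O[s2][obs] : probability of producing output `obs` while in state s2
    """
    n = len(belief)
    # Predict: push the occupancy probabilities through the Markov transitions.
    predicted = [sum(belief[s] * T[s][s2] for s in range(n)) for s2 in range(n)]
    # Correct: weight each state by the likelihood of the observed output.
    unnorm = [predicted[s2] * O[s2][obs] for s2 in range(n)]
    z = sum(unnorm)  # probability of the output; assumed positive here
    return [u / z for u in unnorm]

# Illustrative (assumed) parameters for one control alternative:
T = [[0.9, 0.1],
     [0.2, 0.8]]
O = [[0.8, 0.2],   # state 0 mostly emits output 0
     [0.3, 0.7]]   # state 1 mostly emits output 1
b = update_belief([0.5, 0.5], T, O, obs=1)  # seeing output 1 shifts mass to state 1
```

The updated vector `b` is itself the state on which the thesis's control algorithm operates: the optimal control alternative is chosen as a function of this probability vector rather than of the unobservable underlying state.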

Keywords

Observable, Markov process, Control (management), Computer science, Markov chain, Mathematics, Econometrics, Statistics, Artificial intelligence, Physics

Publication Info

Year: 1971
Type: article
Citations: 455
Access: Closed

Citation Metrics

455 citations (OpenAlex)
Cite This

Edward J. Sondik (1971). The optimal control of partially observable Markov processes. Munich Personal RePEc Archive (Ludwig Maximilian University of Munich).