Monotone Optimal Policies for Left-Skip-Free Markov Decision Processes
Part of the International Series in Operations Research & Management Science book series (ISOR, volume 19).
In a previous paper (Stidham and Weber, 1989), we considered a variety of models for optimal control of the service rate in a queueing system, in which the objective is to minimize the limiting average expected cost per unit time. By standard techniques, we showed how to convert such a problem into an equivalent problem in which the objective is to minimize the expected total (undiscounted) cost until the first entrance into state zero. Under weak assumptions on the one-stage (service plus holding) costs and transition probabilities, we showed that an optimal policy is monotonic, that is, a larger service rate is used in larger states. In contrast to previous models in the literature on control of queues, we assumed that the holding cost was nondecreasing, but not necessarily convex, in the state. A common assumption in all the models was that services take place one at a time, so that the state transitions are skip-free to the left: a one-step transition from state i to a state j < i − 1 is impossible. Many queueing models have this property, including all birth-death models, as well as a variety of M/GI/1-type models, including models with batch arrivals, phase-type service times, and LCFS-PR queue discipline.
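The reduction described above can be illustrated numerically. The sketch below is not code from the paper: it builds a small truncated, uniformized single-server queue with a controllable service rate (all parameters, including the truncation level, arrival probability, rate set, and cost functions, are hypothetical choices for demonstration), runs value iteration on the equivalent first-passage problem of minimizing expected total cost until the first entrance into state 0, and checks that the extracted service-rate policy is nondecreasing in the state, as the monotonicity result predicts. Note the skip-free-to-the-left structure: each one-step transition moves down by at most one.

```python
# Illustrative sketch only (not code from the paper). All parameters below
# (N, lam, rates, cost functions) are hypothetical choices for demonstration.

N = 20                      # truncation level; arrivals are blocked in state N
lam = 0.3                   # arrival probability per uniformized time step
rates = [0.1, 0.3, 0.6]     # available service-completion probabilities
srv_cost = {a: 2.0 * a for a in rates}  # service cost per step, increasing in rate
# Nondecreasing but NON-convex holding cost (a step function):
hold = [0.0] + [1.0 if i < 7 else 4.0 for i in range(1, N + 1)]

def q(i, a, V):
    """One-step expected cost-to-go in state i >= 1 under service rate a.

    Transitions are skip-free to the left: from state i the chain can move
    only to i + 1 (arrival), i - 1 (service completion), or stay put.
    """
    up = lam if i < N else 0.0
    return (hold[i] + srv_cost[a]
            + up * V[min(i + 1, N)]      # arrival
            + a * V[i - 1]               # service completion (down by one)
            + (1.0 - up - a) * V[i])     # self-loop from uniformization

# Value iteration for the expected total cost until first entrance into
# state 0; V[0] stays 0 because cost accrual stops there.
V = [0.0] * (N + 1)
for _ in range(100000):
    newV = [0.0] + [min(q(i, a, V) for a in rates) for i in range(1, N + 1)]
    if max(abs(newV[i] - V[i]) for i in range(N + 1)) < 1e-9:
        V = newV
        break
    V = newV

# A minimizing service rate in each state 1..N; the monotonicity result says
# a nondecreasing selection is optimal in this kind of model.
policy = [min(rates, key=lambda a, i=i: q(i, a, V)) for i in range(1, N + 1)]
assert all(policy[k] <= policy[k + 1] for k in range(len(policy) - 1))
```

The self-loop term is the usual uniformization device (cf. Lippman 1975, Serfozo 1979 in the references): continuous-time rates are rescaled into one-step probabilities so that a discrete-time optimality equation applies.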
Keywords: service time, optimal policy, service rate, Markov decision process, optimality equation
References

Bertsekas, D. Dynamic Programming and Optimal Control, Vol. II. Athena Scientific, Belmont, MA, 1995.

Keilson, J. The use of Green’s functions in the study of bounded random walks with applications to queuing theory. J. Math. Phys., 42–52, 1962.

Keilson, J. Green’s Function Methods in Probability Theory. Griffin, London, 1965.

Kulkarni, V. G. Modeling and Analysis of Stochastic Systems. Chapman & Hall, London, 1995.

Lippman, S. A. Applying a new device in the optimization of exponential queuing systems. Oper. Res., 687–710, 1975.

Puterman, M. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York, 1994.

Schäl, M. Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal. Z. Wahrscheinlichkeitstheorie verw. Geb., 179–196, 1975.

Serfozo, R. An equivalence between continuous and discrete-time Markov decision processes. Oper. Res., 616–620, 1979.

Stidham, S. Jr., and Weber, R. Monotonic and insensitive optimal policies for control of queues with undiscounted costs. Oper. Res., 611–625, 1989.

Topkis, D. Minimizing a submodular function on a lattice. Oper. Res., 305–321, 1978.

Wijngaard, J., and Stidham, S. Jr. Forward recursion for Markov decision processes with skip-free-to-the-right transitions, Part I: Theory and algorithms. Math. Oper. Res., 295–308, 1986.
© Springer Science+Business Media New York 1999