Monotone Optimal Policies for Left-Skip-Free Markov Decision Processes

  • Shaler StidhamJr.
  • Richard R. Weber
Part of the International Series in Operations Research & Management Science book series (ISOR, volume 19)


In a previous paper (Stidham and Weber [9]), we considered a variety of models for optimal control of the service rate in a queueing system, in which the objective is to minimize the limiting average expected cost per unit time. By standard techniques, we showed how to convert such a problem into an equivalent problem in which the objective is to minimize the expected total (undiscounted) cost until the first entrance into state zero. Under weak assumptions on the one-stage (service plus holding) costs and transition probabilities, we showed that an optimal policy is monotonic, that is, a larger service rate is used in larger states. In contrast to previous models in the literature on control of queues, we assumed that the holding cost was nondecreasing, but not necessarily convex, in the state. A common assumption in all the models was that services take place one at a time, so that the state transitions are skip-free to the left: a one-step transition from state i to a state j < i − 1 is impossible. Many queueing models have this property, including all birth—death models, as well as a variety of M/GI/1-type models, including models with batch arrivals, phase-type service times, and LCFS-PR queue discipline.


Service Time Optimal Policy Service Rate Markov Decision Process Optimality Equation 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. [1]
    Bertsekas, D. Dynamic Programming and Optimal Control, Vol. II. Athena Scientific, Belmont, MA, 1995.zbMATHGoogle Scholar
  2. [2]
    Keilson, J. The use of Green’s functions in the study of bounded random walks with applications to queuing theory. J. Math. Phys. 41, 42–52, 1962.MathSciNetzbMATHGoogle Scholar
  3. [3]
    Keilson, J. Green’s Function Methods in Probability Theory. Griffin, London, 1965.zbMATHGoogle Scholar
  4. [4]
    Kulkarni, V. G. Modeling and Analysis of Stochastic Systems. Chapman-Hall, London, 1995.zbMATHGoogle Scholar
  5. [5]
    Lippman, S. A. Applying a new device in the optimization of exponential queuing systems. Oper. Res. 23, 687–710, 1975.MathSciNetzbMATHCrossRefGoogle Scholar
  6. [6]
    Puterman, M. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley, New York, 1994.zbMATHGoogle Scholar
  7. [7]
    Schäl, M. Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal. Z. Wahrscheinlichkeitstheorie verw. Gerb. 32, 179–196, 1975.zbMATHCrossRefGoogle Scholar
  8. [8]
    Serfozo, R. An equivalence between continuous and discrete-time markov decision processes. Oper. Res. 27, 616–620, 1979.MathSciNetzbMATHCrossRefGoogle Scholar
  9. [9]
    Stidham, S. Jr., and Weber, R. Monotonic and insensitive optimal policies for control of queues with undiscounted costs. Oper. Res. 87, 611–625, 1989.MathSciNetCrossRefGoogle Scholar
  10. [10]
    Topkis, D. Minimizing a submodular function on a lattice. Oper. Res. 26, 305–321, 1978.MathSciNetzbMATHCrossRefGoogle Scholar
  11. [11]
    Wijngaard, J., and Stidham, S. Jr. Forward recursion for Markov decision processes with skip-free-to-the-right transitions, Part i: Theory and algorithms. Math. Oper. Res. 11, 295–308, 1986.MathSciNetzbMATHCrossRefGoogle Scholar

Copyright information

© Springer Science+Business Media New York 1999

Authors and Affiliations

  • Shaler StidhamJr.
  • Richard R. Weber

There are no affiliations available

Personalised recommendations