Optimum Design in Item Response Theory: Test Assembly and Item Calibration

  • W. J. van der Linden
Part of the Recent Research in Psychology book series (PSYCHOLOGY)


The idea of optimizing experimental design to give estimators maximal efficiency has been around in the statistical literature for several decades, but its applicability to sampling problems in item response theory (IRT) has not been widely noticed. It is the purpose of this paper to show how optimum design principles can be used to improve item and examinee sampling in IRT-based test assembly and item calibration. For both applications a result based on the maximin principle is given. The maximin principle fits these applications naturally, because IRT models are nonlinear and involve criteria of optimality that are dependent on the unknown parameters.
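As a rough illustration of why a maximin criterion suits this setting, the sketch below picks a short test from a tiny, hypothetical item bank so that the smallest test information over a few ability points is as large as possible. It assumes the two-parameter logistic (2PL) model and uses brute-force enumeration; the function names, the item parameters, and the ability grid are all invented for illustration and are not taken from the paper, whose own formulation may differ.

```python
import math
from itertools import combinations


def item_information(theta, a, b):
    """Fisher information of a 2PL item (discrimination a, difficulty b) at ability theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def maximin_test(items, thetas, length):
    """Enumerate all subsets of the given length and return the one whose
    minimum test information over the ability points in thetas is largest."""
    best_subset, best_value = None, -math.inf
    for subset in combinations(range(len(items)), length):
        # Test information is the sum of item informations; take the worst ability point.
        value = min(
            sum(item_information(t, *items[i]) for i in subset) for t in thetas
        )
        if value > best_value:
            best_subset, best_value = subset, value
    return best_subset, best_value


# Hypothetical item bank: (a, b) pairs, assumed for illustration only.
items = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.0)]
# Ability points at which the test should remain informative (also assumed).
thetas = [-1.0, 0.0, 1.0]

subset, value = maximin_test(items, thetas, length=2)
print(subset, round(value, 3))
```

Because the information of each item depends on the unknown ability, no single subset is best at every ability point; the maximin criterion guards against the least favourable one. Enumeration is only workable for very small banks; realistic item banks call for integer programming instead.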


Keywords: Item Response Theory; Item Parameter; Item Response Theory Model; Test Assembly; Ability Parameter





Copyright information

© Springer-Verlag New York, Inc. 1994

Authors and Affiliations

  • W. J. van der Linden, University of Twente, Enschede, The Netherlands
