Using Test Standard-Setting Methods in Educational Program Evaluation: Addressing the Issue of How Good is Good Enough

Paul R. Brandon

Abstract

School districts in the United States and elsewhere commonly use standard setting to assign value to student test and assessment scores. That is, they set standards to show “how good is good enough.” This paper summarizes the empirical findings on the most widely studied test standard-setting method and describes what the conclusions of that summary suggest about the use of test standard setting in educational program evaluations.

How to Cite

Brandon, P. R. (2005). Using Test Standard-Setting Methods in Educational Program Evaluation: Addressing the Issue of How Good is Good Enough. Journal of MultiDisciplinary Evaluation, 2(3), 1–29. https://doi.org/10.56645/jmde.v2i3.99

Section

Research on Evaluation Articles
