Retrospective Pretest and Counterfactual Self-Report: Different or Same?

Tony C.M. Lam
Edgar Valencia
https://orcid.org/0000-0002-0311-9272

Abstract

Purpose: To examine the discriminant validity of treatment participants’ self-reports of the state they would be in had they not received treatment (the counterfactual); specifically, whether self-reports of the counterfactual can be distinguished from self-reports of the preintervention state (the retrospective pretest).


Setting: An education department of a large university in North America.


Intervention: Two methods of self-reporting research self-efficacy: one using counterfactual items and one using retrospective pretest items.


Research design: A randomized comparison-group design with two conditions, each defined by the version of the survey administered. The survey in the counterfactual condition included items about research self-efficacy absent the influence of the respondent’s program of studies. The survey in the retrospective pretest condition contained items about research self-efficacy before entering the program of studies. Both conditions included the same items about research self-efficacy at the current time (posttest).


Data collection & analysis: Participants were graduate students, recruited via email, who completed an online survey about research self-efficacy. Students were randomly assigned to one of the two conditions described above. Responses were analyzed with a 2 × 2 mixed-factorial ANOVA, with self-report method (counterfactual vs. retrospective pretest) as the between-subjects factor and time (pre- and post-intervention) as the within-subjects factor.
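The 2 × 2 mixed-factorial ANOVA described above can be sketched in code. The following is a minimal, self-contained illustration (not the authors’ actual analysis), assuming a balanced design with two equal-sized groups; the function name and the hypothetical score lists are illustrative stand-ins for the survey data.

```python
from statistics import mean

def mixed_anova_2x2(pre, post, group):
    """2 x 2 mixed-factorial ANOVA (balanced design, two equal groups).

    pre, post : per-subject scores at the two levels of the within-subjects
                factor (time). group : 0/1 label for the between-subjects
                factor (self-report method). Returns F statistics for the
                method main effect, the time main effect, and the
                method x time interaction.
    """
    subjects = list(zip(pre, post, group))
    n_subj = len(subjects)
    scores = [s for p, q, _ in subjects for s in (p, q)]
    grand = mean(scores)

    # Between-subjects partition: method effect vs. subjects-within-groups.
    subj_means = [(p + q) / 2 for p, q, _ in subjects]
    ss_between_subj = 2 * sum((m - grand) ** 2 for m in subj_means)
    ss_method = 0.0
    ss_cells = 0.0
    for g in (0, 1):
        rows = [(p, q) for p, q, gg in subjects if gg == g]
        n_g = len(rows)
        g_mean = mean(s for row in rows for s in row)
        ss_method += n_g * 2 * (g_mean - grand) ** 2
        for t in (0, 1):
            cell = mean(row[t] for row in rows)
            ss_cells += n_g * (cell - grand) ** 2
    ss_subj_within = ss_between_subj - ss_method
    df_subj_within = n_subj - 2

    # Within-subjects partition: time effect, interaction, residual error.
    time_means = [mean(p for p, _, _ in subjects),
                  mean(q for _, q, _ in subjects)]
    ss_time = n_subj * sum((t - grand) ** 2 for t in time_means)
    ss_inter = ss_cells - ss_method - ss_time
    ss_total = sum((s - grand) ** 2 for s in scores)
    ss_error = ss_total - ss_between_subj - ss_time - ss_inter
    df_error = n_subj - 2

    # Each effect has 1 numerator df; denominators differ by partition.
    return {
        "F_method": ss_method / (ss_subj_within / df_subj_within),
        "F_time": ss_time / (ss_error / df_error),
        "F_interaction": ss_inter / (ss_error / df_error),
        "df": (1, df_subj_within),
    }
```

In the study’s terms, a near-zero interaction F would indicate that pre-to-post change does not differ between the counterfactual and retrospective pretest conditions, which is the pattern the findings report.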


Findings: Counterfactual and retrospective pretest scores, and the treatment effects computed from them, were virtually identical. This casts doubt on participants’ ability, once they have received treatment, to differentiate between the state of never having received treatment and their state at treatment commencement.

Article Details

How to Cite
Lam, T. C., & Valencia, E. (2019). Retrospective Pretest and Counterfactual Self-Report: Different or Same?. Journal of MultiDisciplinary Evaluation, 15(33), 37–53. https://doi.org/10.56645/jmde.v15i33.575
Section
Research on Evaluation Articles
