Propensity Scores: A Practical Introduction Using R


Antonio Olmos
https://orcid.org/0009-0005-4914-4210
Priyalatha Govindasamy
https://orcid.org/0000-0003-3059-7330

Abstract

Background: This paper provides an introduction to propensity scores for evaluation practitioners.


Purpose: The purpose of this paper is to provide the reader with a conceptual and practical introduction to propensity scores, to matching on propensity scores, and to their implementation in the R statistical software.


Setting: Not applicable


Intervention: Not applicable


Research Design: Not applicable


Data Collection and Analysis: Not applicable


Findings: In this demonstration paper, we describe the context in which propensity scores are used, including the conditions under which the use of propensity scores is recommended, as well as the basic assumptions needed for a correct implementation of the technique. Next, we describe some of the more common techniques used to conduct propensity score matching. We conclude with a description of the recommended steps associated with the implementation of propensity score matching using several packages developed in R, including syntax and brief interpretations of the output associated with every step.
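
As a rough illustration of the kind of workflow the abstract describes, the sketch below walks through three common steps: estimating propensity scores, matching, and checking covariate balance. It is not the authors' syntax. It uses base R only rather than the packages covered in the paper, and the data are simulated. The variable names (age, income, treat), the greedy 1:1 nearest-neighbor algorithm, and the caliper of 0.2 standard deviations of the logit of the propensity score are all assumptions made for the example.

```r
# Minimal sketch on simulated data, base R only.
# All variable names and parameter values are hypothetical.
set.seed(42)
n <- 500
age <- rnorm(n, mean = 40, sd = 10)
income <- rnorm(n, mean = 50, sd = 15)

# Treatment assignment depends on the covariates, which creates
# the selection bias that matching is meant to reduce.
p_true <- plogis(-4 + 0.05 * age + 0.02 * income)
treat <- rbinom(n, size = 1, prob = p_true)
dat <- data.frame(id = seq_len(n), treat, age, income)

# Step 1: estimate propensity scores with logistic regression.
ps_model <- glm(treat ~ age + income, data = dat, family = binomial())
dat$pscore <- fitted(ps_model)

# Step 2: greedy 1:1 nearest-neighbor matching without replacement,
# on the logit of the propensity score, within an assumed caliper.
logit_ps <- qlogis(dat$pscore)
caliper <- 0.2 * sd(logit_ps)
treated <- which(dat$treat == 1)
available <- which(dat$treat == 0)
pairs <- data.frame(treated = integer(0), control = integer(0))
for (i in treated) {
  if (length(available) == 0) break
  d <- abs(logit_ps[available] - logit_ps[i])
  j <- which.min(d)
  if (d[j] <= caliper) {
    pairs <- rbind(pairs, data.frame(treated = i, control = available[j]))
    available <- available[-j]  # each control is used at most once
  }
}
matched <- dat[c(pairs$treated, pairs$control), ]

# Step 3: check balance with standardized mean differences,
# before and after matching.
smd <- function(x, g) {
  (mean(x[g == 1]) - mean(x[g == 0])) /
    sqrt((var(x[g == 1]) + var(x[g == 0])) / 2)
}
balance <- data.frame(
  covariate = c("age", "income"),
  before = c(smd(dat$age, dat$treat), smd(dat$income, dat$treat)),
  after = c(smd(matched$age, matched$treat),
            smd(matched$income, matched$treat))
)
print(balance)
```

Treated units with no control inside the caliper are left unmatched, so the matched sample can be smaller than the treated group. In practice, a dedicated matching package would typically replace the hand-written loop in Step 2.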



How to Cite
Olmos, A., & Govindasamy, P. (2015). Propensity Scores: A Practical Introduction Using R. Journal of MultiDisciplinary Evaluation, 11(25), 68–88. https://doi.org/10.56645/jmde.v11i25.431
Section
Research on Evaluation Articles
Author Biographies

Antonio Olmos, University of Denver

Research Methods and Statistics

Associate Professor

Priyalatha Govindasamy, University of Denver

Research Methods and Statistics
