Using Principal Components Analysis in Program Evaluation: Some Practical Considerations
Abstract
Principal Components Analysis (PCA) is widely used by behavioral science researchers to assess the dimensional structure of data and for data reduction purposes. Despite the wide array of analytic choices available, many who employ this method continue to rely exclusively on the default options recommended in dominant statistical packages. This paper examines alternative analytic strategies to guide interpretation of PCA results that expand on these default options, including (a) rules for retaining factors or components and (b) rotation strategies. Conventional wisdom related to the interpretation of pattern/structure coefficients also is challenged. Finally, the use of principal component scores in subsequent analyses is explored. A small set of actual data is used to facilitate illustrations and discussion.
Despite the increasing popularity of confirmatory factor analytic (CFA) techniques, principal components analysis (PCA) continues to enjoy widespread use (Kellow, 2004; Thompson, 2004). Researchers who employ PCA are typically interested in (a) assessing the dimensional structure of a dataset (Dunteman, 1989) or (b) reducing a large number of variables into a smaller set of linear combinations (components) for subsequent analyses (e.g., multiple regression). For instance, an evaluator may have occasion to develop a new instrument and wish to ascertain the number and features of the underlying dimensions represented in the data. At other times an existing measure is modified or shortened and the sample data are used to explore the extent to which the structure of the original version has or has not been substantively altered (although CFA is a stronger method for this purpose). The PCA approach also is useful for creating new variables that are linear combinations of a set of highly correlated original variables. These new composite variables may then be used in subsequent analyses. As Stevens (1992) notes, “... if there are 30 variables (whether predictors or items), we are undoubtedly not measuring 30 different constructs, hence, it makes sense to find some variable reduction scheme that will indicate how the variables cluster or hang together” (p. 374). Use of PCA helps to solve at least two problems. First, the presence of multicollinearity (high inter-item or variable correlations) leads to inflated standard errors for the measured variables when conducting statistical analyses, which increases the probability of Type II errors (non-significance when a significant difference exists in the population). Second, when one is using a large set of variables to predict or explain another variable (or set of variables), as opposed to a smaller set of composites, one pays a price in terms of the degrees of freedom used in the analysis.
All other things being equal, the more degrees of freedom expended, the smaller the value of the omnibus test statistic (e.g., F) that results from the analysis (Stevens, 1992).
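The data-reduction use described above can be sketched concretely. The following minimal example (not from the paper; variable names and the simulated data are our own) extracts component scores by eigendecomposition of the correlation matrix, the computation that underlies the default PCA in major statistical packages:

```python
import numpy as np

def pca_scores(X, n_components):
    """Principal component scores from the correlation matrix of X.

    X is an (n observations x p variables) array. Returns the p
    eigenvalues (descending) and the first n_components score columns,
    each a linear combination of the standardized variables.
    """
    # Standardize so the analysis runs on the correlation matrix
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)           # p x p correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # reorder largest-first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Z @ eigvecs[:, :n_components]     # component scores
    return eigvals, scores

# Illustrative data: 6 variables, two of them nearly collinear
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=100)
eigvals, scores = pca_scores(X, 2)
```

The two score columns could then replace the six original variables in, say, a regression model, addressing both the multicollinearity and degrees-of-freedom concerns noted above.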
There are a number of important issues related to the data in hand that need to be addressed (e.g., linearity; absence of outliers) before invoking PCA, and readers are referred to Tabachnick and Fidell (2001) for an excellent overview of these considerations. Once PCA is determined to be appropriate, the analysis proceeds in a series of sequential steps; several options are available to researchers at each step. Too often researchers rely on the default options provided in the major statistical packages and fail to examine other options that may allow for fuller exploitation of the data. The purpose of the present paper is to briefly explore the options available to analysts with respect to (a) rules for retaining principal components and (b) rotation strategies. In addition, conventional wisdom related to the interpretation of pattern/structure coefficients is challenged on substantive grounds. Finally, we briefly explore how PCA may be used to derive component scores for further data analysis.
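One retention rule discussed in this literature, Horn's (1965) parallel analysis, can be sketched briefly: retain only those components whose observed eigenvalues exceed the eigenvalues obtained from random data of the same dimensions. The implementation below is our own illustrative version, not code from the paper:

```python
import numpy as np

def parallel_analysis(X, n_iter=100, percentile=95, seed=0):
    """Horn's (1965) parallel analysis for component retention.

    Compares the observed correlation-matrix eigenvalues against the
    given percentile of eigenvalues from n_iter random-normal datasets
    of the same shape; returns the number of components to retain.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
    rand = np.empty((n_iter, p))
    for i in range(n_iter):
        R = np.corrcoef(rng.normal(size=(n, p)), rowvar=False)
        rand[i] = np.sort(np.linalg.eigvalsh(R))[::-1]
    thresh = np.percentile(rand, percentile, axis=0)
    return int(np.sum(obs > thresh)), obs, thresh

# Simulated data with a clear two-dimensional structure:
# variables 1-3 load on one factor, variables 4-6 on another
rng = np.random.default_rng(1)
F = rng.normal(size=(300, 2))
L = np.zeros((6, 2)); L[:3, 0] = 0.9; L[3:, 1] = 0.9
X = F @ L.T + 0.3 * rng.normal(size=(300, 6))
k, obs, thresh = parallel_analysis(X)
```

Unlike the eigenvalue-greater-than-one default, this criterion adjusts for the sampling variability of eigenvalues, which is why methodologists (e.g., Horn, 1965; Thompson & Daniel, 1996) recommend it over the package defaults.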
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Copyright and Permissions
Authors retain full copyright for articles published in JMDE. JMDE publishes under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY - NC 4.0). Users are allowed to copy, distribute, and transmit the work in any medium or format for noncommercial purposes, provided that the original authors and source are credited accurately and appropriately. Only the original authors may distribute the article for commercial or compensatory purposes. To view a copy of this license, visit creativecommons.org
References
Cattell, R. B. (1966). The meaning and strategic use of factor analysis. In R.B. Cattell (Ed.), Handbook of multivariate experimental psychology (pp. 174-243). Chicago: Rand McNally.
Dunteman, G. H. (1989). Principal components analysis. Newbury Park, CA: Sage. DOI: https://doi.org/10.4135/9781412985475
Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4, 272-299. DOI: https://doi.org/10.1037/1082-989X.4.3.272
Gorsuch, R. L. (1983). Factor analysis. Hillsdale, NJ: Erlbaum.
Hogarty, K. Y., Kromrey, J. D., Ferron, J. M., & Hines, C. V. (in press). Selection of variables in exploratory factor analysis: An empirical comparison of a stepwise and traditional approach. Psychometrika.
Horn, J. L. (1965). A rationale and test for the number of factors in factor analysis. Psychometrika, 30, 179-185. DOI: https://doi.org/10.1007/BF02289447
Kaiser, H. F. (1970). A second generation Little Jiffy. Psychometrika, 35, 401-415. DOI: https://doi.org/10.1007/BF02291817
Katzenmeyer, W. (1994). School culture quality survey. Tampa, FL: Anchin Center.
Kim, J. O., & Mueller, C. W. (1978). Factor analysis: Statistical methods and practical issues. Newbury Park, CA: Sage. DOI: https://doi.org/10.4135/9781412984256
Kellow, J. T. (2004, July). Exploratory factor analysis in two prominent journals: Hegemony by default. Paper presented at the annual meeting of the American Psychological Association, Honolulu, HI.
May, H. (2004). Making statistics more meaningful for policy research and program evaluation. American Journal of Evaluation, 25, 525-540. DOI: https://doi.org/10.1177/109821400402500408
Nasser, F., Benson, J. & Wisenbaker, J. (2002). The performance of regression-based variations of the visual scree for determining the number of common factors. Educational and Psychological Measurement, 62, 397-419. DOI: https://doi.org/10.1177/00164402062003001
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw Hill.
Preacher, K. J., & MacCallum, R. C. (2003). Repairing Tom Swift's electronic factoring machine. Understanding Statistics, 2, 13-43. DOI: https://doi.org/10.1207/S15328031US0201_02
Russell, D. W. (2002). In search of underlying dimensions: The use (and abuse) of factor analysis in Personality and Social Psychology Bulletin. Personality and Social Psychology Bulletin, 28, 1629-1646. DOI: https://doi.org/10.1177/014616702237645
Stevens, J. (1992). Applied multivariate statistics for the social sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). Needham Heights, MA: Allyn & Bacon.
Thompson, B., & Daniel, L. G. (1996). Factor analytic evidence for the construct validity of scores: A historical overview and some guidelines. Educational and Psychological Measurement, 56, 197-208. DOI: https://doi.org/10.1177/0013164496056002001
Thompson, B. (2002). What future quantitative social science research could look like: Confidence intervals for effect sizes. Educational Researcher, 31(3), 24-31. DOI: https://doi.org/10.3102/0013189X031003025
Thompson, B. (2004). Exploratory and confirmatory factor analysis: Understanding concepts and application. Washington, DC: American Psychological Association. DOI: https://doi.org/10.1037/10694-000