Quantitative Methods for Estimating the Reliability of Qualitative Data

Jason W. Davey; P. Cristian Gugiu; Chris L. S. Coryn

doi:10.56645/jmde.v6i13.266

Published: Feb 4, 2010

Keywords: qualitative coding, qualitative methodology, reliability coefficients

Jason W. Davey

Fenwal, Inc.

P. Cristian Gugiu

Western Michigan University

https://orcid.org/0000-0003-0022-287X

Chris L. S. Coryn

Western Michigan University

Abstract

Background: Measurement is an indispensable aspect of conducting both quantitative and qualitative research and evaluation. With respect to qualitative research, measurement typically occurs during the coding process.

Purpose: This paper presents quantitative methods for determining the reliability of conclusions from qualitative data sources. Although some qualitative researchers disagree with such applications, a link between the qualitative and quantitative fields is successfully established through data collection and coding procedures.

Setting: Not applicable.

Intervention: Not applicable.

Research Design: Case study.

Data Collection and Analysis: Narrative data were collected from a random sample of 528 undergraduate students and 28 professors.

Findings: The calculation of the kappa statistic, weighted kappa statistic, ANOVA Binary Intraclass Correlation, and Kuder-Richardson 20 is illustrated through a fictitious example. Formulae are presented so that the researcher can calculate these estimators without the use of sophisticated statistical software.
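As a rough illustration of how two of these estimators can be computed without specialized software, the sketch below implements the standard textbook forms of Cohen's kappa and Kuder-Richardson 20 in plain Python. It is not the authors' worked example: the function names `cohen_kappa` and `kr20`, and the small data sets, are hypothetical and chosen only to keep the arithmetic easy to check by hand.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning nominal codes to the same units."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of units on which the two raters agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal counts.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / n**2
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1 - p_e)


def kr20(scores):
    """Kuder-Richardson 20 for a units x items matrix of 0/1 codes."""
    n, k = len(scores), len(scores[0])
    if k < 2:
        raise ValueError("KR-20 needs at least two items")
    # Sum over items of p * q, the variance of each binary item.
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    # Population variance of the total scores (same divisor as p * q).
    totals = [sum(row) for row in scores]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("KR-20 is undefined when total scores do not vary")
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical data: 1 = theme coded as present, 0 = absent.
coder_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
coder_2 = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]
kappa = cohen_kappa(coder_1, coder_2)

# Hypothetical 4 units x 3 binary items.
binary_codes = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
reliability = kr20(binary_codes)

print(round(kappa, 3))        # 0.6
print(round(reliability, 3))  # 0.75
```

In the first data set the coders agree on 8 of 10 units and each marks the theme present half the time, so observed agreement is 0.8, chance agreement is 0.5, and kappa is 0.6. The KR-20 function uses the population variance of the total scores so that it matches the p × q item variances; a version based on the sample variance would give a slightly different value.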

How to Cite

Davey, J. W., Gugiu, P. C., & Coryn, C. L. S. (2010). Quantitative Methods for Estimating the Reliability of Qualitative Data. Journal of MultiDisciplinary Evaluation, 6(13), 140–162. https://doi.org/10.56645/jmde.v6i13.266

Issue

Vol. 6 No. 13 (2010)

Section

Research on Evaluation Articles

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Copyright and Permissions

Authors retain full copyright for articles published in JMDE. JMDE publishes under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). Users are allowed to copy, distribute, and transmit the work in any medium or format for noncommercial purposes, provided that the original authors and source are credited accurately and appropriately. Only the original authors may distribute the article for commercial or compensatory purposes. To view a copy of this license, visit creativecommons.org.
