REFERENCES

  1. Agresti, A. (1992). Modeling patterns of agreement and disagreement. Statistical Methods in Medical Research 1, 201–218.
  2. Alanen, E. (2010). Everything all right in method comparison studies? Statistical Methods in Medical Research 21, 297–309.
  3. Altman, D. G. and Bland, J. M. (1983). Measurement in medicine: The analysis of method comparison studies. The Statistician 32, 307–317.
  4. Altman, D. G. and Bland, J. M. (1987). Comparing methods of measurement [Letter]. Applied Statistics 36, 224–225.
  5. Altman, D. G. and Bland, J. M. (2002). Commentary on quantifying agreement between two methods of measurement [Letter]. Clinical Chemistry 48, 801–802.
  6. Andrés, A. M. and Marzo, P. F. (2005). Chance-corrected measures of reliability and validity in K × K tables. Statistical Methods in Medical Research 14, 473–492.
  7. Arellano-Valle, R. B., Bolfarine, H. and Lachos, V. H. (2005). Skew-normal linear mixed models. Journal of Data Science 3, 415–438.
  8. Atkinson, G. and Nevill, A. (1997). Comment on the use of concordance correlation to assess the agreement between two variables. Biometrics 53, 775–777.
  9. Bablok, W., Passing, H., Bender, R. and Schneider, B. (1988). A general regression procedure for method transformation. Application of linear regression procedures for method comparison studies in clinical chemistry, Part III. Journal of Clinical Chemistry and Clinical Biochemistry 26, 783–790.
  10. Bangdiwala, S. I. (1985). A graphical test for observer agreement. In International Statistical Institute Centenary Session 1985, pp. 307–308, International Statistical Institute, Amsterdam.
  11. Barlow, W. (1996). Measurement of interrater agreement with adjustment for covariates. Biometrics 52, 695–702.
  12. Barlow, W., Lai, M.-Y. and Azen, S. P. (1991). A comparison of methods for calculating a stratified kappa. Statistics in Medicine 10, 1465–1472.
  13. Barnett, R. N. (1965). A scheme for the comparison of quantitative methods. American Journal of Clinical Pathology 43, 562–569.
  14. Barnett, R. N. and Youden, W. J. (1970). A revised scheme for the comparison of quantitative methods. American Journal of Clinical Pathology 54, 454–462.
  15. Barnhart, H. X. and Williamson, J. M. (2001). Modeling concordance correlation via GEE to evaluate reproducibility. Biometrics 57, 931–940.
  16. Barnhart, H. X., Haber, M. J. and Lin, L. I. (2007a). An overview on assessing agreement with continuous measurement. Journal of Biopharmaceutical Statistics 17, 529–569.
  17. Barnhart, H. X., Haber, M. J. and Song, J. (2002). Overall concordance correlation coefficient for evaluating agreement among multiple observers. Biometrics 58, 1020–1027.
  18. Barnhart, H. X., Kosinski, A. S. and Haber, M. J. (2007b). Assessing individual agreement. Journal of Biopharmaceutical Statistics 17, 697–719.
  19. Barnhart, H. X., Lokhnygina, Y., Kosinski, A. S. and Haber, M. J. (2007c). Comparison of concordance correlation coefficient and coefficient of individual agreement in assessing agreement. Journal of Biopharmaceutical Statistics 17, 721–738.
  20. Barnhart, H. X., Song, J. and Haber, M. J. (2005). Assessing intra, inter and total agreement with replicated readings. Statistics in Medicine 24, 1371–1384.
  21. Bartko, J. J. (1994). Measures of agreement: A single procedure. Statistics in Medicine 13, 737–745.
  22. Bartlett, J. W. and Frost, C. (2008). Reliability, repeatability and reproducibility: Analysis of measurement errors in continuous variables. Ultrasound in Obstetrics and Gynecology 31, 466–475.
  23. Bates, D. and Maechler, M. (2015). Matrix: Sparse and Dense Matrix Classes and Methods. R package version 1.2-3.
  24. Bates, D., Mächler, M., Bolker, B. and Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67, 1–48.
  25. Blackwood, L. G. and Bradley, E. L. (1991). An omnibus test for comparing 2 measuring devices. Journal of Quality Technology 23, 12–16.
  26. Bland, J. M. and Altman, D. G. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet i, 307–310.
  27. Bland, J. M. and Altman, D. G. (1990). A note on the use of the intraclass correlation coefficient in the evaluation of agreement between two methods of measurement. Computers in Biology and Medicine 20, 337–340.
  28. Bland, J. M. and Altman, D. G. (1995a). Comparing two methods of clinical measurement: A personal history. International Journal of Epidemiology 24, S7–S14.
  29. Bland, J. M. and Altman, D. G. (1995b). Comparing methods of measurement: Why plotting difference against standard method is misleading. Lancet 346, 1085–1087.
  30. Bland, J. M. and Altman, D. G. (1999). Measuring agreement in method comparison studies. Statistical Methods in Medical Research 8, 135–160.
  31. Bland, J. M. and Altman, D. G. (2003). Applying the right statistics: Analyses of measurement studies. Ultrasound in Obstetrics and Gynecology 22, 85–93.
  32. Bland, J. M. and Altman, D. G. (2007). Agreement between methods of measurement with multiple observations per individual. Journal of Biopharmaceutical Statistics 17, 571–582.
  33. Bloch, D. A. and Kraemer, H. C. (1989). 2 × 2 kappa coefficients: Measures of agreement or association. Biometrics 45, 269–287.
  34. Bowling, L. S., Sageman, W. S., O’Connor, S. M., Cole, R. and Amundson, D. E. (1993). Lack of agreement between measurement of ejection fraction by impedance cardiography versus radionuclide ventriculography. Critical Care Medicine 21, 1523–1527.
  35. Bradley, E. L. and Blackwood, L. G. (1989). Comparing paired data: A simultaneous test for means and variances. The American Statistician 43, 234–235.
  36. Brockwell, P. J. and Davis, R. A. (2002). Introduction to Time Series and Forecasting, 2nd edn. Springer, New York.
  37. Broemeling, L. D. (2009). Bayesian Methods for Measures of Agreement. Chapman & Hall/CRC, Boca Raton, FL.
  38. Brulez, K., Choudhary, P. K., Maurer, G., Portugal, S. J., Boulton, R. L., Webber, S. L. and Cassey, P. (2014). Visual scoring of eggshell patterns has poor repeatability. Journal of Ornithology 155, 701–706.
  39. Byrt, T., Bishop, J. and Carlin, J. B. (1993). Bias, prevalence and kappa. Journal of Clinical Epidemiology 46, 423–429.
  40. Carrasco, J. L. and Jover, L. (2003). Estimating the generalized concordance correlation coefficient through variance components. Biometrics 59, 849–858.
  41. Carrasco, J. L., Caceres, A., Escaramis, G. and Jover, L. (2014). Distinguishability and agreement with continuous data. Statistics in Medicine 33, 117–128.
  42. Carrasco, J. L., Jover, L., King, T. S. and Chinchilli, V. M. (2007). Comparison of concordance correlation coefficient estimating approaches with skewed data. Journal of Biopharmaceutical Statistics 17, 673–684.
  43. Carrasco, J. L., King, T. S. and Chinchilli, V. M. (2009). The concordance correlation coefficient for repeated measures estimated by variance components. Journal of Biopharmaceutical Statistics 19, 90–105.
  44. Carroll, R. J. and Ruppert, D. (1988). Transformation and Weighting in Regression. Chapman & Hall, New York.
  45. Carroll, R. J. and Ruppert, D. (1996). The use and misuse of orthogonal regression in linear errors-in-variables models. The American Statistician 50, 1–6.
  46. Carstensen, B. (2010). Comparing Clinical Measurement Methods: A Practical Guide. John Wiley, Chichester, UK.
  47. Carstensen, B., Gurrin, L., Ekstrom, C. and Figurski, M. (2015). MethComp: Functions for analysis of agreement in method comparison studies. R package version 1.22.2.
  48. Carstensen, B., Simpson, J. and Gurrin, L. C. (2008). Statistical models for assessing agreement in method comparison studies with replicate measurements. The International Journal of Biostatistics 4, article 16.
  49. Casella, G. and Berger, R. (2001). Statistical Inference, 2nd edn. Duxbury Press, Pacific Grove, CA.
  50. Chen, C.-C. and Barnhart, H. X. (2008). Comparison of ICC and CCC for assessing agreement for data without and with replications. Computational Statistics and Data Analysis 53, 554–564.
  51. Chen, G., Faris, P., Hemmelgarn, B., Walker, R. L. and Quan, H. (2009). Measuring agreement of administrative data with chart data using prevalence unadjusted and adjusted kappa. BMC Medical Research Methodology 9, article 5.
  52. Cheng, C.-L. and Van Ness, J. W. (1999). Statistical Regression with Measurement Error. John Wiley, Chichester, UK.
  53. Chinchilli, V. M., Martel, J. K., Kumanyika, S. and Lloyd, T. (1996). A weighted concordance correlation coefficient for repeated measurement designs. Biometrics 52, 341–353.
  54. Choudhary, P. K. (2007). Semiparametric regression for assessing agreement using tolerance bands. Computational Statistics and Data Analysis 51, 6229–6241.
  55. Choudhary, P. K. (2008). A tolerance interval approach for assessment of agreement in method comparison studies with repeated measurements. Journal of Statistical Planning and Inference 138, 1102–1115.
  56. Choudhary, P. K. (2009). Interrater agreement. In Methods and Applications of Statistics in the Life and Health Sciences, pp. 461–480, Balakrishnan, N. (Editor), John Wiley, Hoboken, NJ.
  57. Choudhary, P. K. (2010). A unified approach for nonparametric evaluation of agreement in method comparison studies. The International Journal of Biostatistics 6, article 19.
  58. Choudhary, P. K. and Nagaraja, H. N. (2005a). Assessment of agreement using intersection-union principle. Biometrical Journal 47, 674–681.
  59. Choudhary, P. K. and Nagaraja, H. N. (2005b). Selecting the instrument closest to a gold standard. Journal of Statistical Planning and Inference 129, 229–237.
  60. Choudhary, P. K. and Nagaraja, H. N. (2005c). A two-stage procedure for selection and assessment of agreement of the best instrument with a gold standard. Sequential Analysis 24, 237–257.
  61. Choudhary, P. K. and Nagaraja, H. N. (2007). Tests for assessment of agreement using probability criteria. Journal of Statistical Planning and Inference 137, 279–290.
  62. Choudhary, P. K. and Ng, H. K. T. (2006). A tolerance interval approach for assessment of agreement using regression models for mean and variance. Biometrics 62, 288–296.
  63. Choudhary, P. K. and Yin, K. (2010). Bayesian and frequentist methodologies for analyzing method comparison studies with multiple methods. Statistics in Biopharmaceutical Research 2, 122–132.
  64. Choudhary, P. K., Sengupta, D. and Cassey, P. (2014). A general skew-t mixed model that allows different degrees of freedom for random effects and error distributions. Journal of Statistical Planning and Inference 147, 235–247.
  65. Chow, S.-C. and Liu, J.-P. (2008). Design and Analysis of Bioavailability and Bioequivalence Studies, 3rd edn. Chapman & Hall/CRC, Boca Raton, FL.
  66. Cochran, W. G. (1950). The comparison of percentages in matched samples. Biometrika 37, 256–266.
  67. Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20, 37–46.
  68. Cohen, J. (1968). Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin 70, 213–220.
  69. Cornbleet, P. J. and Gochman, N. (1979). Incorrect least-squares regression coefficients in method-comparison analysis. Clinical Chemistry 25, 432–438.
  70. Cotes, P. M., Doré, C. J., Yin, J. A., Lewis, S. M., Messinezy, M., Pearson, T. C. and Reid, C. (1986). Determination of serum immunoreactive erythropoietin in the investigation of erythrocytosis. New England Journal of Medicine 315, 283–287.
  71. Cressie, N. A. C. (1993). Statistics for Spatial Data. John Wiley, New York.
  72. Dahl, D. B. (2015). xtable: Export Tables to LaTeX or HTML. R package version 1.8-0.
  73. Davidian, M. and Giltinan, D. M. (1995). Nonlinear Models for Repeated Measurement Data. Chapman & Hall/CRC, Boca Raton, FL.
  74. Davison, A. C. and Hinkley, D. V. (1997). Bootstrap Methods and Their Application. Cambridge University Press, New York.
  75. Deming, W. E. (1943). Statistical Adjustment of Data. John Wiley, New York.
  76. Dewitte, K., Fierens, C., Stöckl, D. and Thienpont, L. M. (2002). Application of the Bland-Altman plot for interpretation of method-comparison studies: A critical investigation of its practice [Letter]. Clinical Chemistry 48, 799–801.
  77. Diggle, P. J., Heagerty, P., Liang, K.-Y. and Zeger, S. L. (2002). Analysis of Longitudinal Data, 2nd edn. Oxford University Press, Oxford, UK.
  78. Donner, A., Eliasziw, M. and Klar, N. (1996). Testing the homogeneity of kappa statistics. Biometrics 52, 176–183.
  79. Donner, A., Shoukri, M. M., Klar, N. and Bartfay, E. (2000). Testing the equality of two dependent kappa statistics. Statistics in Medicine 19, 373–387.
  80. Dunn, G. (2004). Statistical Evaluation of Measurement Errors, 2nd edn. John Wiley, Chichester, UK.
  81. Dunn, G. (2007). Regression models for method comparison data. Journal of Biopharmaceutical Statistics 17, 739–756.
  82. Dunn, G. and Roberts, C. (1999). Modelling method comparison data. Statistical Methods in Medical Research 8, 161–179.
  83. Edland, S. D. (1996). Bias in slope estimates for the linear errors in variables model by the variance ratio method. Biometrics 52, 243–248.
  84. Efron, B. and Tibshirani, R. J. (1993). An Introduction to the Bootstrap. Chapman & Hall, New York.
  85. Eksborg, S. (1981). Evaluation of method-comparison data [Letter]. Clinical Chemistry 27, 1311–1312.
  86. Eliasziw, M., Young, S. L., Woodbury, M. G. and Fryday-Field, K. (1994). Statistical methodology for the concurrent assessment of interrater and intrarater reliability: Using goniometric measurements as an example. Physical Therapy 74, 777–788.
  87. Escaramis, G., Ascaso, C. and Carrasco, J. L. (2010). The total deviation index estimated by tolerance intervals to evaluate the concordance of measurement devices. BMC Medical Research Methodology 10, article 31.
  88. Fay, M. P. (2005). Random marginal agreement coefficients: Rethinking the adjustment for chance when measuring agreement. Biostatistics 6, 171–180.
  89. Fernholz, L. T. (1983). von Mises Calculus for Statistical Functionals. Springer, New York.
  90. Feuerman, M. and Miller, A. R. (2008). Relationships between statistical measures of agreement: Sensitivity, specificity and kappa. Journal of Evaluation in Clinical Practice 14, 930–933.
  91. Finney, D. J. (1996). A note on the history of regression. Journal of Applied Statistics 23, 555–557.
  92. Fitzmaurice, G. M., Laird, N. M. and Ware, J. H. (2011). Applied Longitudinal Analysis, 2nd edn. John Wiley, Hoboken, NJ.
  93. Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin 76, 378–382.
  94. Fleiss, J. L. (1986). The Design and Analysis of Clinical Experiments. John Wiley, New York.
  95. Fleiss, J. L. and Cohen, J. (1973). The equivalence of weighted kappa and the intraclass correlation as measures of reliability. Educational and Psychological Measurement 33, 613–619.
  96. Fleiss, J. L. and Shrout, P. E. (1978). Approximate interval estimation for a certain intraclass correlation coefficient. Psychometrika 43, 259–262.
  97. Fleiss, J. L., Cohen, J. and Everitt, B. S. (1969). Large sample standard errors of kappa and weighted kappa. Psychological Bulletin 72, 323–327.
  98. Fleiss, J. L., Levin, B. and Paik, M. C. (2003). Statistical Methods for Rates and Proportions, 3rd edn. John Wiley, Hoboken, NJ.
  99. Gamer, M., Lemon, J., Fellows, I. and Singh, P. (2012). irr: Various Coefficients of Interrater Reliability and Agreement. R package version 0.84.
  100. Geistanger, A., Berding, C., Vorberg, E. and Herlan, M. (2008). Local regression: A new approach for measurement system comparison analysis. Clinical Chemistry and Laboratory Medicine 46, 1211–1219.
  101. Gelman, A. and Hill, J. (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, New York.
  102. Genz, A. (1992). Numerical computation of multivariate normal probabilities. Journal of Computational and Graphical Statistics 1, 141–149.
  103. Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F. and Hothorn, T. (2015). mvtnorm: Multivariate Normal and t Distributions. R package version 1.0-3.
  104. Gilbert, P. and Varadhan, R. (2015). numDeriv: Accurate Numerical Derivatives. R package version 2014.2-1.
  105. Giraudeau, B. and Mary, J. Y. (2001). Planning a reproducibility study: How many subjects and how many replicates per subject for an expected width of the 95 per cent confidence interval of the intraclass correlation coefficient. Statistics in Medicine 20, 3205–3214.
  106. Graybill, F. A. (2001). Matrices with Applications in Statistics, 2nd edn. Cengage Learning, Belmont, CA.
  107. Grubbs, F. E. (1948). On estimating precision of measuring instruments and product variability. Journal of the American Statistical Association 43, 243–264.
  108. Guo, Y. and Manatunga, A. K. (2007). Nonparametric estimation of the concordance correlation coefficient under univariate censoring. Biometrics 63, 164–172.
  109. Guttman, I. (1988). Statistical tolerance regions. In Encyclopedia of Statistical Sciences, 9, pp. 272–287, Kotz, S., Johnson, N. L. and Read, C. B. (Editors), John Wiley, New York.
  110. Haber, M. J. and Barnhart, H. X. (2006). Coefficients of agreement for fixed observers. Statistical Methods in Medical Research 15, 255–271.
  111. Haber, M. J. and Barnhart, H. X. (2008). A general approach to evaluating agreement between two observers or methods of measurement from quantitative data with replicated measurements. Statistical Methods in Medical Research 17, 151–169.
  112. Haber, M. J., Barnhart, H. X., Song, J. and Gruden, J. (2005). Observer variability: A new approach in evaluating interobserver agreement. Journal of Data Science 3, 69–83.
  113. Hardin, J. W. and Hilbe, J. M. (2012). Generalized Estimating Equations, 2nd edn. Chapman & Hall/CRC, Boca Raton, FL.
  114. Harris, I. R., Burch, B. D. and St. Laurent, R. T. (2001). A blended estimator for measure of agreement with a gold standard. Journal of Agricultural, Biological, and Environmental Statistics 6, 326–339.
  115. Hawkins, D. M. (2002). Diagnostics for conformity of paired quantitative measurements. Statistics in Medicine 21, 1913–1935.
  116. Hedayat, A. S., Lou, C. and Sinha, B. K. (2009). A statistical approach to assessment of agreement involving multiple raters. Communications in Statistics - Theory and Methods 38, 2899–2922.
  117. Hiriote, S. and Chinchilli, V. M. (2011). Matrix-based concordance correlation coefficient for repeated measures. Biometrics 67, 1007–1016.
  118. Ho, H. J. and Lin, T. I. (2010). Robust linear mixed models using the skew t distribution with application to schizophrenia data. Biometrical Journal 52, 449–469.
  119. Hollis, S. (1996a). Analysis of method comparison studies [Guest editorial]. Annals of Clinical Biochemistry 33, 1–4.
  120. Hollis, S. (1996b). Author’s reply to Stöckl, D. (1996). Annals of Clinical Biochemistry 33, 577.
  121. Hothorn, T., Bretz, F. and Westfall, P. (2008). Simultaneous inference in general parametric models. Biometrical Journal 50, 346–363.
  122. Hsu, J. C. (1996). Multiple Comparisons: Theory and Methods. Chapman & Hall/CRC, Boca Raton, FL.
  123. Hutson, A. D. (2010). A multi-rater nonparametric test of agreement and corresponding agreement plot. Computational Statistics and Data Analysis 54, 109–119.
  124. Hutson, A. D., Wilson, D. C. and Geiser, E. A. (1998). Measuring relative agreement: Echocardiographer versus computer. Journal of Agricultural, Biological, and Environmental Statistics 3, 163–174.
  125. Igic, B., Hauber, M. E., Galbraith, J. A., Grim, T., Dearborn, D. C., Brennan, P. L. R., Moskat, C., Choudhary, P. K. and Cassey, P. (2010). Comparison of micrometer- and scanning electron microscope-based measurements of avian eggshell thickness. Journal of Field Ornithology 81, 402–410.
  126. Jaech, J. L. (1971). Further tests of significance for Grubbs’s estimators. Biometrics 27, 1097–1101.
  127. Johnson, R. A. and Wichern, D. W. (2002). Applied Multivariate Statistical Analysis, 5th edn. Prentice Hall, Upper Saddle River, NJ.
  128. Kelly, G. E. (1985). Use of structural equations model in assessing the reliability of a new measurement technique. Applied Statistics 34, 258–263.
  129. Kelly, G. E. (1987). Author’s reply to Altman and Bland (1987). Applied Statistics 36, 225–227.
  130. King, T. S. and Chinchilli, V. M. (2001a). A generalized concordance correlation coefficient for continuous and categorical data. Statistics in Medicine 20, 2131–2147.
  131. King, T. S. and Chinchilli, V. M. (2001b). Robust estimators of the concordance correlation coefficient. Journal of Biopharmaceutical Statistics 11, 83–105.
  132. King, T. S., Chinchilli, V. M. and Carrasco, J. L. (2007a). A repeated measures concordance correlation coefficient. Statistics in Medicine 26, 3095–3113.
  133. King, T. S., Chinchilli, V. M., Wang, K.-L. and Carrasco, J. L. (2007b). A class of repeated measures concordance correlation coefficients. Journal of Biopharmaceutical Statistics 17, 653–672.
  134. Kraemer, H. C., Periyakoil, V. S. and Noda, A. (2002). Kappa coefficients in medical research. Statistics in Medicine 21, 2109–2129.
  135. Krippendorff, K. (1970). Bivariate agreement coefficients for reliability of data. Sociological Methodology 2, 139–150.
  136. Krishnamoorthy, K. and Mathew, T. (2009). Statistical Tolerance Regions: Theory, Applications, and Computation. John Wiley, Hoboken, NJ.
  137. Krouwer, J. S. (2008). Why Bland-Altman plots should use X, not (Y+X)/2 when X is a reference method [Letter]. Statistics in Medicine 27, 778–780.
  138. Krummenauer, F. (1999). Intraindividual scale comparison in clinical diagnostic methods: A review of elementary methods. Biometrical Journal 41, 917–929.
  139. Krummenauer, F., Genevriere, I. and Nixdorff, U. (2000). The biometrical comparison of cardiac imaging methods. Computer Methods and Programs in Biomedicine 62, 21–34.
  140. Kummell, C. H. (1879). Reduction of observation equations which contain more than one observed quantity. The Analyst 6, 97–105.
  141. Kutner, M., Nachtsheim, C., Neter, J. and Li, W. (2004). Applied Linear Statistical Models, 5th edn. McGraw-Hill/Irwin, Chicago.
  142. Lai, D. and Shiao, S.-Y. (2005). Comparing two clinical measurements: A linear mixed model approach. Journal of Applied Statistics 32, 855–860.
  143. Lakshminarayanan, M. Y. and Gunst, R. F. (1984). Estimation of parameters in linear structural relationships: Sensitivity to the choice of the ratio of error variances. Biometrika 71, 569–573.
  144. Landis, J. R. and Koch, G. (1977a). The measurement of observer agreement for categorical data. Biometrics 33, 159–174.
  145. Landis, J. R. and Koch, G. (1977b). A one-way components of variance model for categorical data. Biometrics 33, 671–679.
  146. Landis, J. R., King, T. S., Choi, J. W., Chinchilli, V. M. and Koch, G. G. (2011). Measures of agreement and concordance with clinical research applications. Statistics in Biopharmaceutical Research 3, 185–209.
  147. Lange, K. (2010). Numerical Analysis for Statisticians, 2nd edn. Springer, New York.
  148. Lee, J. J. and Tu, Z. N. (1994). A better confidence interval for kappa (κ) on measuring agreement between two raters with binary outcomes. Journal of Computational and Graphical Statistics 3, 301–321.
  149. Lehmann, E. L. (1998). Elements of Large-Sample Theory. Springer, New York.
  150. LeLorier, J., Grégoire, G., Benhaddad, A., Lapierre, J. and Derderian, F. (1997). Discrepancies between meta-analyses and subsequent large randomized, controlled trials. New England Journal of Medicine 337, 536–542.
  151. Lewis, P. A., Jones, P. W., Polak, J. W. and Tillotson, H. T. (1991). The problem of conversion in method comparison studies. Applied Statistics 40, 105–112.
  152. Liao, J. (2009). Sample size calculation for an agreement study. Pharmaceutical Statistics 9, 125–132.
  153. Lin, L. I. (1989). A concordance correlation coefficient to evaluate reproducibility. Biometrics 45, 255–268. Corrections: 2000, 56, 324–325.
  154. Lin, L. I. (1992). Assay validation using the concordance correlation coefficient. Biometrics 48, 599–604.
  155. Lin, L. I. (2000). Total deviation index for measuring individual agreement with applications in laboratory performance and bioequivalence. Statistics in Medicine 19, 255–270.
  156. Lin, L. I. (2008). Overview of agreement statistics for medical devices. Journal of Biopharmaceutical Statistics 18, 126–144.
  157. Lin, L. I. and Chinchilli, V. M. (1997). Rejoinder to the letter to the editor from Atkinson and Nevill. Biometrics 53, 777–778.
  158. Lin, L. I., Hedayat, A. S. and Wu, W. (2007). A unified approach for assessing agreement for continuous and categorical data. Journal of Biopharmaceutical Statistics 17, 629–652.
  159. Lin, L. I., Hedayat, A. S. and Wu, W. (2011). Statistical Tools for Measuring Agreement. Springer, New York.
  160. Lin, L. I., Hedayat, A. S., Sinha, B. and Yang, M. (2002). Statistical methods in assessing agreement: Models, issues, and tools. Journal of the American Statistical Association 97, 257–270.
  161. Lin, S. C., Whipple, D. M. and Ho, C. S. (1998). Evaluation of statistical equivalence using limits of agreement and associated sample size calculation. Communications in Statistics - Theory and Methods 27, 1419–1432.
  162. Linnet, K. (1990). Estimation of the linear relationship between the measurements of two methods with proportional errors. Statistics in Medicine 9, 1463–1473.
  163. Linnet, K. (1993). Evaluation of regression procedures for method comparison studies. Clinical Chemistry 39, 424–432.
  164. Linnet, K. (1998). Performance of Deming regression analysis in case of misspecified analytical error ratio in method comparison studies. Clinical Chemistry 44, 1024–1031.
  165. Linnet, K. (1999). Limitations of the paired t-test for evaluation of method comparison data [Letter]. Clinical Chemistry 45, 314–315.
  166. Liu, J.-P. and Chow, S.-C. (1997). A two one-sided tests procedure for assessment of individual bioequivalence. Journal of Biopharmaceutical Statistics 7, 49–61.
  167. Liu, Q. and Pierce, D. A. (1994). A note on Gauss-Hermite quadrature. Biometrika 81, 624–629.
  168. Ludbrook, J. (2010). Confidence in Altman-Bland plots: A critical review of the method of differences. Clinical and Experimental Pharmacology and Physiology 37, 143–149.
  169. Luiz, R. R., Costa, A. J. L., Kale, P. L. and Werneck, G. L. (2003). Assessment of agreement of a quantitative variable: A new graphical approach. Journal of Clinical Epidemiology 56, 963–967.
  170. Maloney, C. J. and Rastogi, S. C. (1970). Significance test for Grubbs’s estimators. Biometrics 26, 671–676.
  171. Mandel, J. (1978). Accuracy and precision: Evaluation and interpretation of analytical results. In Treatise on Analytical Chemistry, Part I, Theory and Practice, 2nd edition, volume 1, pp. 243–298, Kolthoff, I. M. and Elving, P. J. (Editors), John Wiley, New York.
  172. Mandel, J. and Stiehler, R. D. (1954). Sensitivity – a criterion for the comparison of methods of test. Journal of Research of the National Bureau of Standards 53, 155–159.
  173. Marshall, G. N., Hays, R. D. and Nicholas, R. (1994). Evaluating agreement between clinical assessment methods. International Journal of Methods in Psychiatric Research 4, 249–257.
  174. Martin, R. F. (2000). General Deming regression for estimating systematic bias and its confidence interval in method-comparison studies. Clinical Chemistry 46, 100–104.
  175. McCulloch, C. E., Searle, S. R. and Neuhaus, J. M. (2008). Generalized, Linear, and Mixed Models, 2nd edn. John Wiley, Hoboken, NJ.
  176. McGraw, K. O. and Wong, S. P. (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods 1, 30–46.
  177. Meeker, W. Q., Hahn, G. J. and Escobar, L. A. (2017). Statistical Intervals: A Guide for Practitioners and Researchers, 2nd edn. John Wiley, Hoboken, NJ.
  178. Meyer, D., Zeileis, A. and Hornik, K. (2015). vcd: Visualizing Categorical Data. R package version 1.4-1.
  179. Morgan, W. A. (1939). A test for the significance of the difference between the two variances in a sample from a normal bivariate population. Biometrika 31, 13–19.
  180. Müller, R. and Büttner, P. (1994). A critical discussion of intraclass correlation coefficients. Statistics in Medicine 13, 2465–2476.
  181. Nawarathna, L. S. and Choudhary, P. K. (2013). Measuring agreement in method comparison studies with heteroscedastic measurements. Statistics in Medicine 32, 5156–5171.
  182. Nelson, K. P. and Edwards, D. (2008). On population-based measures of agreement for binary classifications. Canadian Journal of Statistics 36, 411–426.
  183. Nickerson, C. A. (1997). Comment on “A concordance correlation coefficient to evaluate reproducibility”. Biometrics 53, 1503–1507.
  184. Nix, A. B. J. and Dunston, F. D. J. (1991). Maximum likelihood techniques applied to method comparison studies. Statistics in Medicine 10, 981–988.
  185. Olsson, J. and Rootzén, H. (1996). Quantile estimation from repeated measurements. Journal of the American Statistical Association 91, 1560–1565.
  186. Osborne, C. (1991). Statistical calibration: A review. International Statistical Review 59, 309–336.
  187. Pan, Y., Haber, M., Gao, J. and Barnhart, H. X. (2012). A new permutation-based method for assessing agreement between two observers making replicated quantitative readings. Statistics in Medicine 31, 2249–2261.
  188. Passing, H. and Bablok, W. (1983). A new biometrical procedure for testing the equality of measurements from two different analytical methods. Application of linear regression procedures for method comparison studies in clinical chemistry, Part I. Journal of Clinical Chemistry and Clinical Biochemistry 21, 709–720.
  189. Passing, H. and Bablok, W. (1984). Comparison of several regression procedures for method comparison studies and determination of sample sizes. Application of linear regression procedures for method comparison studies in clinical chemistry, Part II. Journal of Clinical Chemistry and Clinical Biochemistry 22, 431–445.
  190. Perez-Jaume, S. and Carrasco, J. L. (2015). A non-parametric approach to estimate the total deviation index for non-normal data. Statistics in Medicine 34, 3318–3335.
  191. Pinheiro, J. C. and Bates, D. M. (2000). Mixed-Effects Models in S and S-PLUS. Springer, New York.
  192. Pinheiro, J. C., Bates, D., DebRoy, S., Sarkar, D. and R Core Team (2015). nlme: Linear and Nonlinear Mixed Effects Models. R package version 3.1-122.
  193. Pinheiro, J. C., Liu, C. and Wu, Y. N. (2001). Efficient algorithms for robust estimation in linear mixed-effects models using the multivariate t distribution. Journal of Computational and Graphical Statistics 10, 249–276.
  194. Pitman, E. J. G. (1939). A note on normal correlation. Biometrika 31, 9–12.
  195. Pollock, M. A., Jefferson, S. G., Kane, J. W., Lomax, K., MacKinnon, G. and Winnard, C. B. (1992). Method comparison—A different approach. Annals of Clinical Biochemistry 29, 556–560.
  196. Quiroz, J. (2005). Assessment of equivalence using a concordance correlation coefficient in a repeated measurements design. Journal of Biopharmaceutical Statistics 15, 913–928.
  197. Quiroz, J. and Burdick, R. K. (2009). Assessment of individual agreements with repeated measurements based on generalized confidence intervals. Journal of Biopharmaceutical Statistics 19, 345–359.
  198. R Core Team (2015). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria.
  199. Ranchet, M., Akinwuntan, A. E., Tant, M., Neal, E. and Devos, H. (2015). Agreement between physician’s recommendation and fitness-to-drive decision in multiple sclerosis. Archives of Physical Medicine and Rehabilitation 96, 1840–1844.
  200. Revelle, W. (2016). psych: Procedures for Psychological, Psychometric, and Personality Research. R package version 1.6.4.
  201. Rifkin, R. D. (1995). Effects of correlated and uncorrelated measurement error on linear regression and correlation in medical method comparison studies. Statistics in Medicine 14, 789–798.
  202. Rocke, D. M. and Lorenzato, S. (1995). A two-component model for measurement error in analytical chemistry. Technometrics 37, 176–184.
  203. Roy, A. (2009). An application of linear mixed effects model to assess the agreement between two methods with replicated observations. Journal of Biopharmaceutical Statistics 19, 150–173.
  204. Rubin, D. B. (1983). Iteratively reweighted least squares. In Encyclopedia of Statistical Sciences, 4, pp. 272-275, Kotz, S., Johnson, N. L. and Read, C. B. (Editors), John Wiley, New York.
  205. Ryan, T. P. and Woodall, W. H. (2005). The most-cited statistical papers. Journal of Applied Statistics 32, 461–474.
  206. Sarkar, D. (2008). Lattice: Multivariate Data Visualization with R. Springer, New York.
  207. Sarkar, D. and Andrews, F. (2013). latticeExtra: Extra Graphical Utilities Based on Lattice. R package version 0.6-26.
  208. Schluter, P. J. (2009). A multivariate hierarchical Bayesian approach to measuring agreement in repeated measurement method comparison studies. BMC Medical Research Methodology 9, article 6.
  209. Scott, W. (1955). Reliability of content analysis: The case of nominal scale coding. Public Opinion Quarterly 19, 321–325.
  210. Searle, S. R., Casella, G. and McCulloch, C. E. (1992). Variance Components. John Wiley, New York.
  211. Sengupta, D., Choudhary, P. K. and Cassey, P. (2015). Modeling and analysis of method comparison data with skewness and heavy tails. In Ordered Data Analysis, Modeling and Health Research Methods, pp. 169-187, Choudhary, P. K., Nagaraja, C. H. and Ng, H. K. T. (Editors), Springer, New York.
  212. Sharpsteen, C. and Bracken, C. (2015). tikzDevice: R Graphics Output in LaTeX Format. R package version 0.9.
  213. Shoukri, M. M. (2010). Measures of Interobserver Agreement and Reliability, 2nd edn. Chapman & Hall/CRC, Boca Raton, FL.
  214. Shyr, J. Y. and Gleser, L. J. (1986). Inference about comparative precision in linear structural relationships. Journal of Statistical Planning and Inference 14, 339–358.
  215. St. Laurent, R. T. (1998). Evaluating agreement with a gold standard in method comparison studies. Biometrics 54, 537–545.
  216. Stöckl, D. (1996). Beyond the myths of difference plots [Letter]. Annals of Clinical Biochemistry 33, 575–576.
  217. Stöckl, D., Cabaleiro, D. R., Uytfanghe, K. V. and Thienpont, L. M. (2004). Interpreting method comparison studies by use of the Bland-Altman plot: Reflecting the importance of sample size by incorporating confidence limits and predefined error limits in the graphic [Letter]. Clinical Chemistry 50, 2216–2218.
  218. Stöckl, D., Dewitte, K. and Thienpont, L. M. (1998). Validity of linear regression in method comparison studies: Is it limited by the statistical model or the quality of the analytical input data? Clinical Chemistry 44, 2340–2346.
  219. Stroup, W. W. (2012). Generalized Linear Mixed Models: Modern Concepts, Methods and Applications. Chapman & Hall/CRC, Boca Raton, FL.
  220. Tan, C. Y. and Iglewicz, B. (1999). Measurement-methods comparisons and linear statistical relationship. Technometrics 41, 192–201.
  221. Tanner, M. A. and Young, M. A. (1985). Modeling agreement among raters. Journal of the American Statistical Association 80, 175–180.
  222. Thompson, W. D. and Walter, S. D. (1988). Kappa and the concept of independent errors. Journal of Clinical Epidemiology 41, 969–970.
  223. Tsai, M.-Y. (2015). Comparison of concordance correlation coefficient via variance components, generalized estimating equations and weighted approaches with model selection. Computational Statistics and Data Analysis 82, 47–58.
  224. Twomey, P. J. (2006). How to use difference plots in quantitative method comparison studies. Annals of Clinical Biochemistry 43, 124–129.
  225. van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge University Press, New York.
  226. Vardeman, S. B. (1992). What about the other intervals? The American Statistician 46, 193–197.
  227. Verbeke, G. and Lesaffre, E. (1996). A linear mixed-effects model with heterogeneity in the random-effects population. Journal of the American Statistical Association 91, 217–221.
  228. von Eye, A. and Mun, E. Y. (2004). Analyzing Rater Agreement: Manifest Variable Methods. Lawrence Erlbaum Associates, Mahwah, NJ.
  229. Vonesh, E. F. and Chinchilli, V. M. (1997). Linear and Nonlinear Models for the Analysis of Repeated Measures. Marcel Dekker, New York.
  230. Wang, W. (1999). On equivalence of two variances of a bivariate normal vector. Journal of Statistical Planning and Inference 81, 279–292.
  231. Wang, W. and Hwang, J. T. G. (2001). A nearly unbiased test for individual bioequivalence problems using probability criteria. Journal of Statistical Planning and Inference 99, 41–58.
  232. Weingart, S. N., Davis, R. B., Palmer, R. H., Cahalane, M., Hamel, M. B., Mukamal, K., Phillips, R. S., Davies, D. T. J. and Iezzoni, L. I. (2002). Discrepancies between explicit and implicit review: Physician and nurse assessments of complications and quality. Health Services Research 37, 483–498.
  233. Wellek, S. (2010). Testing Statistical Hypotheses of Equivalence and Noninferiority, 2nd edn. Chapman & Hall/CRC, Boca Raton, FL.
  234. Westgard, J. O. and Hunt, M. R. (1973). Use and interpretation of common statistical tests in method-comparison studies. Clinical Chemistry 19, 49–57.
  235. Westlund, K. B. and Kurland, L. T. (1953). Studies on multiple sclerosis in Winnipeg, Manitoba, and New Orleans, Louisiana. I. Prevalence; comparison between the patient groups in Winnipeg and New Orleans. American Journal of Hygiene 57, 380–396.
  236. Woodman, R. J. (2010). Bland-Altman beyond the basics: Creating confidence with badly behaved data [Editorial]. Clinical and Experimental Pharmacology and Physiology 37, 141–142.
  237. Yin, K., Choudhary, P. K., Varghese, D. and Goodman, S. R. (2008). A Bayesian approach for sample size determination in method comparison studies. Statistics in Medicine 27, 2273–2289.
  238. Young, D. S. (2010). An R package for estimating tolerance intervals. Journal of Statistical Software 36, 1–39.
  239. Zhang, D. and Davidian, M. (2001). Linear mixed models with flexible distributions of random effects for longitudinal data. Biometrics 57, 795–802.