If the two groups of raters (or the same group of raters observed on two occasions) must rate the exact same group of subjects, then any agreement coefficient used (e.g. Fleiss' generalized kappa, Gwet's AC_{1}, Conger's generalized kappa, the Brennan-Prediger coefficient, or Krippendorff's alpha) will produce two correlated coefficients. This correlation makes the variance of their difference very difficult to calculate. Gwet (2016) proposed the linearization method to resolve this problem. This approach consists of using a linear approximation to the agreement coefficient to develop the equivalent of a paired t-test. R users may use the **R functions** that I developed to implement the linearization method for testing the difference between two agreement coefficients for statistical significance.
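To make the paired t-test idea concrete, here is a minimal Python sketch. It is not the R functions mentioned above, and the function names (`subject_bp_values`, `paired_difference_test`) are hypothetical. It assumes no missing ratings and uses the Brennan-Prediger coefficient, because its chance-agreement term is fixed at 1/q (q being the number of categories), so the coefficient is exactly an average of subject-level values. The other coefficients need additional linearization terms, which are described in Gwet (2016) and are not reproduced here.

```python
import math
from collections import Counter


def subject_bp_values(ratings, n_categories):
    """Subject-level Brennan-Prediger values for one rating occasion.

    ratings: one list per subject, holding one rating per rater
             (assumption: no missing ratings, at least 2 raters).
    """
    chance = 1.0 / n_categories  # fixed chance agreement for Brennan-Prediger
    values = []
    for row in ratings:
        r = len(row)
        counts = Counter(row)
        # Proportion of rater pairs that agree on this subject.
        pa_i = sum(c * (c - 1) for c in counts.values()) / (r * (r - 1))
        values.append((pa_i - chance) / (1.0 - chance))
    return values


def paired_difference_test(before, after, n_categories):
    """Paired t-statistic for the difference (after - before).

    Both occasions must rate the same subjects in the same order.
    """
    b = subject_bp_values(before, n_categories)
    a = subject_bp_values(after, n_categories)
    if len(a) != len(b):
        raise ValueError("both occasions must cover the same subjects")
    n = len(a)
    # Subject-level differences carry the correlation between occasions.
    d = [x - y for x, y in zip(a, b)]
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / (n - 1)
    se = math.sqrt(var_d / n)
    t = mean_d / se if se > 0 else float("nan")
    return {"difference": mean_d, "std_error": se, "t": t, "df": n - 1}


# Toy data: 3 raters, 6 subjects, 2 categories, rated on two occasions.
before = [["a", "a", "b"], ["a", "b", "b"], ["a", "a", "a"],
          ["b", "a", "b"], ["a", "b", "a"], ["b", "b", "b"]]
after = [["a", "a", "a"], ["b", "b", "b"], ["a", "a", "a"],
         ["b", "b", "b"], ["a", "b", "a"], ["b", "b", "b"]]

result = paired_difference_test(before, after, n_categories=2)
print(result)
```

The p-value would then come from a t-distribution with `df` degrees of freedom. Because the test works on one difference per subject, it yields a single global comparison however many raters there are.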

See more details on Kudos.

Bibliography:

*Gwet, K. L. (2016). Testing the Difference of Correlated Agreement Coefficients for Statistical Significance. Educational and Psychological Measurement, 76(4), 609-637.*

Hi,

I am comparing the inter-rater agreement among 5 raters before and after an intervention on the same 40 subjects. Do I compare rater 1's ratings before the intervention to rater 1's ratings after the intervention, and so on for each rater? How do I then combine the 5 comparisons (1 for each rater) to decide whether the intervention improves inter-rater agreement? How do I generate a final p-value for the 5 comparisons?

Thanks

Hythem

Hi,

No, you don't do the pairwise analysis. You should perform a global comparison. Check the agreetest app using the following link: https://agreestat.net/agreetest/. The use of test datasets will show you how this analysis should be done.

Thanks a lot. This was very helpful.
