# Intraclass correlation

A typical use of the intraclass correlation coefficient (ICC) is to quantify rater reliability, i.e. the level of agreement between several ‘raters’ measuring the same objects. It is a standard tool for assessing measurement error. ICC=1 would indicate perfect reliability. Raters (or ‘judges’) go in columns, while the objects measured go in rows.

Past follows the standard reference, Shrout and Fleiss (1979), which provides a number of different coefficients, referred to as ICC(*m,k*), where *m* is the model type. If *k*=1, the coefficient evaluates individual measurements (by a single rater); otherwise it evaluates the average measurement across raters. The model types are:

- Model 1: each object is rated by a different set of raters, randomly sampled from a larger population of raters.
- Model 2: the same raters rate all objects, and the raters are a random sample from a larger population of raters.
- Model 3: the same raters rate all objects, with no assumptions about a larger population of raters.

The most commonly used ICC is ICC(2,1), which is therefore marked in red in Past.

The analysis is based on a two-way ANOVA without replication, as described elsewhere in this manual. Confidence intervals are parametric, following the equations of Shrout and Fleiss (1979).
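For readers who want to reproduce the computation outside Past, the coefficient can be sketched from the mean squares of that two-way ANOVA. The sketch below implements the ICC(2,1) formula of Shrout and Fleiss (1979); the function name `icc_2_1` is illustrative, not part of Past.

```python
import numpy as np

def icc_2_1(x):
    """ICC(2,1) from a two-way ANOVA without replication.

    x: 2-D array with objects in rows and raters in columns,
    matching the data layout used by Past.
    """
    x = np.asarray(x, dtype=float)
    n, k = x.shape                      # n objects, k raters
    grand = x.mean()

    # Mean squares of the two-way ANOVA without replication
    ms_rows = k * np.sum((x.mean(axis=1) - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((x.mean(axis=0) - grand) ** 2) / (k - 1)
    ss_err = np.sum((x
                     - x.mean(axis=1, keepdims=True)
                     - x.mean(axis=0, keepdims=True)
                     + grand) ** 2)
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Shrout & Fleiss (1979), ICC(2,1)
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)
```

Applied to the six-object, four-judge example data of Shrout and Fleiss (1979), this returns approximately 0.29, matching the value reported in their paper.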

#### Reference

Shrout, P.E., Fleiss, J.L. 1979. Intraclass correlations: Uses in assessing rater reliability. *Psychological Bulletin* 86:420-428.