Intraclass Correlations
Measurement Error and Reliability in Behavioral Sciences
Measurements in the behavioral sciences are often subject to error, particularly when they are based on human judgments. Such measurement error can substantially affect statistical analysis and interpretation, so it is critical to quantify it through reliability indices. Many reliability indices are derived from the intraclass correlation coefficient (ICC), expressed as the ratio of the variance of interest to the total variance (variance of interest plus error) (Bartko, 1966; Ebel, 1951; Haggard, 1958). However, several forms of the ICC exist, and they can yield different results for the same data; which form is appropriate depends on the experimental design and the study objectives. Unfortunately, researchers often overlook these distinctions or fail to specify which ICC form they used, which can compromise the validity of their findings.
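To make the variance-ratio idea concrete, here is a minimal sketch of one ICC form: the one-way random-effects ICC for single measurements, computed from ANOVA mean squares as (MSB - MSW) / (MSB + (k - 1) * MSW). The function name `icc_oneway` and the example ratings are illustrative assumptions, not taken from the text above. Other ICC forms use different mean squares and will generally give different values for the same data.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC for single measurements (illustrative sketch).

    `ratings` is a list of rows, one row per subject, each row holding
    k measurements of that subject (for example, one per rater).
    """
    n = len(ratings)          # number of subjects
    k = len(ratings[0])       # measurements per subject
    rows = [[float(v) for v in row] for row in ratings]

    grand_mean = sum(v for row in rows for v in row) / (n * k)
    row_means = [sum(row) / k for row in rows]

    # Between-subject sum of squares: the variance of interest.
    ss_between = k * sum((m - grand_mean) ** 2 for m in row_means)
    # Within-subject sum of squares: treated as error in the one-way model.
    ss_within = sum((v - m) ** 2 for row, m in zip(rows, row_means) for v in row)

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical data: 4 subjects, each scored by 3 raters.
example = [[7, 8, 7], [4, 5, 5], [9, 9, 8], [2, 3, 2]]
print(round(icc_oneway(example), 3))
```

When the measurements of each subject agree perfectly, the within-subject mean square is zero and the function returns 1; as within-subject disagreement grows relative to between-subject differences, the value drops.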
Intraclass versus Interclass Measurements
To assess relationships between variables from different measurement classes (e.g., LDL cholesterol and systolic blood pressure, which differ in metric and variance), the Pearson correlation coefficient (Pearson r) is the standard interclass correlation measure. In contrast, intraclass correlation coefficients (ICCs) are used for variables within the same measurement class, which share both metric and variance. ICCs measure homogeneity across pairs or larger sets of measurements, making them well suited to evaluating reliability (e.g., test-retest consistency) or stability (e.g., performance of a medical device over time). For example, ICCs can assess how consistently three medical devices score the same patient, providing insight into device reliability in the population. As a proportion of variance, the ICC theoretically ranges from 0 to 1, with higher values indicating greater reliability, although sample estimates can occasionally fall below 0. ICC values are commonly interpreted as follows:
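The difference between interclass and intraclass correlation can be illustrated with a small sketch. It compares Pearson r, which centers and scales each variable separately, with the classical pairwise intraclass correlation, which uses a single pooled mean and variance for both sets of measurements. The function names and the device readings are hypothetical, and the pairwise formula is only one of several ICC definitions.

```python
from math import sqrt


def pearson_r(x, y):
    """Interclass correlation: each variable gets its own mean and variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_icc(x, y):
    """Pairwise intraclass correlation: one pooled mean and variance."""
    n = len(x)
    pooled_mean = (sum(x) + sum(y)) / (2 * n)
    pooled_var = (
        sum((a - pooled_mean) ** 2 for a in x)
        + sum((b - pooled_mean) ** 2 for b in y)
    ) / (2 * n)
    cross = sum((a - pooled_mean) * (b - pooled_mean) for a, b in zip(x, y))
    return cross / (n * pooled_var)


# Hypothetical readings: device B reads every patient 2 units higher than device A.
device_a = [1.0, 2.0, 3.0, 4.0]
device_b = [3.0, 4.0, 5.0, 6.0]

print(pearson_r(device_a, device_b))     # perfect linear association
print(pairwise_icc(device_a, device_b))  # much lower: the offset counts as disagreement
```

With these made-up readings, Pearson r is 1 because the two devices are perfectly linearly related, while the pairwise intraclass correlation is about 0.11 because the constant offset is treated as disagreement between measurements of the same class.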
- Poor: < 0.50
- Moderate: 0.50 to < 0.75
- Good: 0.75 to < 0.90
- Excellent: 0.90 to 1.00
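The interpretation bands listed above can be encoded in a small helper, for example to label ICC values consistently in a report. This is a minimal sketch; the function name `interpret_icc` is an assumption, and each lower bound is treated as inclusive, following the half-open intervals in the list.

```python
def interpret_icc(value):
    """Map an ICC value to the qualitative label used in the list above."""
    if value > 1.0:
        raise ValueError("an ICC cannot exceed 1")
    if value < 0.50:
        return "poor"       # also covers negative sample estimates
    if value < 0.75:
        return "moderate"
    if value < 0.90:
        return "good"
    return "excellent"


for icc in (0.42, 0.50, 0.80, 0.93):
    print(icc, interpret_icc(icc))
```

Labels like these are a convenience for reporting; the ICC value and the ICC form it was computed with should still be reported alongside them.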
Choosing the Right ICC