The influence of measurement error on calibration, discrimination, and overall estimation of a risk prediction model
1 Public Health Ontario, Toronto, Ontario, Canada
2 Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
3 Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada
4 Department of Health Policy, Management, and Evaluation, University of Toronto, Toronto, Ontario, Canada
5 Institute of Work and Health, Toronto, Ontario, Canada
6 Ottawa Hospital Research Institute, Ottawa, Ontario, Canada
7 Statistics Canada, Ottawa, Ontario, Canada
Population Health Metrics 2012, 10:20. doi:10.1186/1478-7954-10-20. Published: 1 November 2012
Self-reported height and weight are commonly collected at the population level; however, they can be subject to measurement error. The impact of this error on predicted risk, discrimination, and calibration of a model that uses body mass index (BMI) to predict risk of diabetes incidence is not known. The objective of this study is to use simulation to quantify and describe the effect of random and systematic error in self-reported height and weight on the performance of a model for predicting diabetes.
Two general categories of error were examined, random (nondirectional) error and systematic (directional) error, for their effect on an algorithm relating BMI in kg/m² to the probability of developing diabetes. The cohort used to develop the risk algorithm was derived from 23,403 Ontario residents who responded to the 1996/1997 National Population Health Survey linked to a population-based diabetes registry. The data and algorithm were then simulated to estimate the impact of these errors on predicted risk, on calibration (Hosmer-Lemeshow goodness-of-fit χ²), and on discrimination (C-statistic). Simulations were run 500 times with sample sizes of 9,177 for males and 10,618 for females.
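The simulation design described above can be illustrated with a minimal Python sketch. It is not the study's code: the logistic coefficients, the height and weight distributions, the error magnitudes (additive normal error for the random case, proportional under-reporting of weight and over-reporting of height for the systematic case), and the use of 50 rather than 500 replicates are all assumptions chosen for illustration. Only the male sample size of 9,177 is taken from the text.

```python
import numpy as np

def predicted_risk(weight_kg, height_m, intercept=-6.0, slope=0.12):
    """Hypothetical logistic model relating BMI (kg/m^2) to diabetes risk.
    The coefficients are illustrative, not the published algorithm."""
    bmi = weight_kg / height_m ** 2
    return 1.0 / (1.0 + np.exp(-(intercept + slope * bmi)))

def c_statistic(risk, outcome):
    """Probability that a random case has a higher predicted risk than a
    random non-case (Mann-Whitney form; assumes no tied risks)."""
    ranks = np.empty(len(risk))
    ranks[np.argsort(risk)] = np.arange(1, len(risk) + 1)
    n_case = outcome.sum()
    n_noncase = len(outcome) - n_case
    rank_sum = ranks[outcome == 1].sum()
    return (rank_sum - n_case * (n_case + 1) / 2) / (n_case * n_noncase)

def hosmer_lemeshow(risk, outcome, groups=10):
    """Hosmer-Lemeshow chi-square over equal-sized groups of predicted risk."""
    chi2 = 0.0
    for idx in np.array_split(np.argsort(risk), groups):
        expected = risk[idx].sum()
        observed = outcome[idx].sum()
        chi2 += (observed - expected) ** 2 / (expected * (1 - expected / len(idx)))
    return chi2

def simulate(n=9177, reps=50, seed=0):
    """Average mean predicted risk, C-statistic, and H-L chi-square over
    replicates, for error-free, random-error, and systematic-error reports."""
    rng = np.random.default_rng(seed)
    results = {name: [] for name in ("none", "random", "systematic")}
    for _ in range(reps):
        # Assumed true height (m) and weight (kg) distributions.
        height = rng.normal(1.75, 0.07, n)
        weight = rng.normal(82.0, 13.0, n)
        # Outcomes are generated from the error-free measurements.
        outcome = rng.binomial(1, predicted_risk(weight, height))
        reported = {
            "none": (weight, height),
            # Random error: zero-mean noise added to both measurements.
            "random": (weight + rng.normal(0, 8.0, n),
                       height + rng.normal(0, 0.04, n)),
            # Systematic error: weight under-reported, height over-reported.
            "systematic": (weight * 0.97, height * 1.01),
        }
        for name, (w, h) in reported.items():
            risk = predicted_risk(w, h)
            results[name].append((risk.mean(),
                                  c_statistic(risk, outcome),
                                  hosmer_lemeshow(risk, outcome)))
    return {name: np.mean(rows, axis=0) for name, rows in results.items()}

summary = simulate()
for name, (mean_risk, c_stat, hl) in summary.items():
    print(f"{name:>10}: mean risk={mean_risk:.4f}  C={c_stat:.3f}  H-L chi2={hl:.1f}")
```

Because the systematic error in this sketch is proportional, it rescales every BMI by the same factor and leaves the ranking of individuals unchanged, so the C-statistic is unaffected while the predicted risks shift. Other forms of systematic error (for example, a fixed additive shift) would not preserve the ranking exactly.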
The simulated data reproduced the discrimination and calibration observed in the population data. Increasing levels of random error in height and weight reduced both the calibration and the discrimination of the model. Random error biased the predicted risk upwards, whereas systematic error biased the predicted risk in the direction of the error and reduced calibration; however, it did not affect discrimination.
This study demonstrates that random and systematic errors in self-reported health data have the potential to influence the performance of risk algorithms. Further research that quantifies the amount and direction of error can improve model performance by allowing for adjustments in exposure measurements.