Open Access Research

Diabetes prevalence and diagnosis in US states: analysis of health surveys

Goodarz Danaei*, Ari B Friedman, Shefali Oza, Christopher JL Murray and Majid Ezzati

Population Health Metrics 2009, 7:16  doi:10.1186/1478-7954-7-16

Authors' response to reader comment

Jolayne Houtz   (2009-10-30 00:12)  Population Health Metrics

We appreciate Dr Cheng's attention to this detail. The point raised is correct: the high proportion of missing values was indeed due to a skip pattern in the NHANES questionnaire. We repeated the analysis to evaluate the effect on the regression coefficients estimated within NHANES and on predicted diabetes prevalence. Three coefficients (smoking, age 60-69, and age 70+) changed by less than 10%, and the rest remained unchanged. Predicted diabetes prevalence for the different state-sex-age-race-insurance categories changed on average by 1.3%, and at most by 3.5%, of the values reported in the manuscript, and hence was not sensitive to this error.
Goodarz Danaei and Majid Ezzati, on behalf of the authors
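
The sensitivity check described in this response compares revised estimates with the originally reported ones as a relative change. A minimal sketch of that arithmetic follows; the function name and all numbers are invented for illustration and are not values from the manuscript or from NHANES.

```python
def relative_change_pct(original, revised):
    """Absolute change of each revised value, as a percentage of the original."""
    return [abs(r - o) / abs(o) * 100.0 for o, r in zip(original, revised)]

# Hypothetical predicted prevalences for three categories (made-up values).
original = [0.10, 0.20, 0.05]
revised = [0.101, 0.20, 0.052]

changes = relative_change_pct(original, revised)
mean_change = sum(changes) / len(changes)  # average relative change across categories
max_change = max(changes)                  # largest relative change in any category
```

Reporting both the mean and the maximum, as the response does, shows whether a small average hides a large change in any single category.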

Competing interests

No competing interests.


Comments on the missing values of smoking and insurance status

Yiling Cheng   (2009-10-29 15:59)  Centers for Disease Control and Prevention

This article demonstrates a simple and innovative approach to an important question: what is the total diabetes prevalence in each US state? I read it with great interest and noticed that the authors mention that “…50.2% of observations in NHANES were missing either smoking or insurance status…” According to the documentation, this is far too high. For example, in NHANES 2003-2004, persons aged 20 years or older had one missing value on the question “Smoked at least 100 cigarettes in life”, and persons aged 0 years or older had only 133 missing values on the question “Covered by health insurance”. The authors may have overlooked the skip pattern of these two variables. Handling these variables incorrectly may lead to incorrect predictions and incorrect conclusions. I wonder whether the authors could check the documentation and dataset again and rerun the analyses.
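
To illustrate how a skip pattern can inflate apparent missingness, here is a minimal sketch. It assumes a gate question followed by a follow-up that is asked only of respondents who answer "yes" to the gate. The field names and records are hypothetical, not actual NHANES variables or data, and this is not necessarily how the authors coded their variables.

```python
def current_smoker(record):
    """Return True/False for current smoking, or None if genuinely unknown."""
    ever = record.get("ever_smoked_100")  # hypothetical gate question
    now = record.get("smokes_now")        # hypothetical follow-up, asked only if gate is "yes"
    if ever is None:
        return None    # gate question itself unanswered: truly missing
    if ever == "no":
        return False   # follow-up skipped by design, so not a current smoker
    if now is None:
        return None    # follow-up was asked but not answered: truly missing
    return now == "yes"

# Invented records: two never-smokers have a blank follow-up by design.
records = [
    {"ever_smoked_100": "yes", "smokes_now": "yes"},
    {"ever_smoked_100": "yes", "smokes_now": "no"},
    {"ever_smoked_100": "no", "smokes_now": None},
    {"ever_smoked_100": "no", "smokes_now": None},
    {"ever_smoked_100": None, "smokes_now": None},
]

# Naive count treats every blank follow-up as missing.
naive_missing = sum(r["smokes_now"] is None for r in records)

# Skip-aware count resolves blanks that the gate question already answers.
derived = [current_smoker(r) for r in records]
true_missing = sum(v is None for v in derived)
```

In this toy example the naive count treats three of five records as missing, while only one is genuinely unknown once the gate question is taken into account.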

Competing interests

None declared

