Phenotypes for Diabetes in Research Using EHR Data
Diabetes phenotype definitions are built from three components (ICD codes, laboratory tests, and medications), and these components can be combined differently according to frequency and timing. As a result, different phenotypes can identify different cohorts and different numbers of patients with diabetes. For example, the estimated prevalence of diabetes in Durham County, North Carolina, varies from 7% to 13% depending on the EHR-based phenotype definition used. Using clinical chart review as the gold standard, we assessed the sensitivity and specificity of 8 different phenotypes. We will discuss these results, which components may have caused false-positive identification of diabetes, and how to mitigate such errors in future phenotype development. Understanding how patient cohorts are identified is essential for comparing population health, quality improvement, and other research projects involving any disease, not just diabetes.
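
As a rough illustration of the validation step described above, the sketch below computes sensitivity and specificity for one phenotype against chart review. The function name, the boolean-flag representation, and the eight-patient sample are hypothetical and made up for illustration; they are not the study's data or code.

```python
def sensitivity_specificity(phenotype_flags, chart_review_flags):
    """Return (sensitivity, specificity) of a phenotype versus chart review.

    Both arguments are equal-length sequences of booleans, one entry per
    patient: True means "has diabetes" according to that source.
    Chart review is treated as the gold standard.
    """
    pairs = list(zip(phenotype_flags, chart_review_flags))
    tp = sum(1 for p, c in pairs if p and c)          # true positives
    fn = sum(1 for p, c in pairs if not p and c)      # false negatives
    tn = sum(1 for p, c in pairs if not p and not c)  # true negatives
    fp = sum(1 for p, c in pairs if p and not c)      # false positives
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical sample of eight patients (illustrative values only).
phenotype = [True, True, False, True, False, False, True, False]
chart = [True, True, True, False, False, False, True, False]

sens, spec = sensitivity_specificity(phenotype, chart)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Running the same function once per phenotype definition would give a side-by-side comparison; a phenotype with low specificity is one that produces many false-positive identifications.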