The vast amount of clinical data made available by pervasive electronic health records presents a great opportunity to reuse these data to improve the efficiency and lower the costs of clinical and translational research. While specific studies have demonstrated the benefits of reusing clinical data for research, significant concerns remain about potential hidden biases in those data.
This dissertation research contributes an original understanding of clinical data biases. Using research data carefully collected from a patient community served by our institution as the reference standard, we examined measurement and sampling biases in the clinical data for selected clinical variables. Our results showed that the clinical data and research data had similar summary statistical profiles, but that there were detectable differences in definitions and measurements for variables such as height, diastolic blood pressure, and diabetes status. One implication of these results is that research data can complement clinical data for clinical phenotyping. We further supported this hypothesis using diabetes as an example clinical phenotype, showing that integrating clinical and research data improved the sensitivity and positive predictive value of phenotyping.