Indiana University-Purdue University Indianapolis (IUPUI) / In the age of big data, better use of large, real-world datasets is needed, especially ultra-large databases that leverage health information exchange (HIE) systems to gather data from multiple sources. Promising as this process is, analyzing big data in healthcare has been challenging because of big data attributes, mainly volume, variety, and velocity. Thus, these health data require not only computational approaches but also context-based controls.
In this research, we systematically examined associations among select neural and dermal conditions in an ultra-large healthcare database derived from an HIE, using computational approaches combined with epidemiological measures. After systematic cleaning, a binary logistic model-based methodology was used to search for associations, controlling for race and gender. Age groups were chosen using an algorithm that finds the highest incidence rates for each condition pair. A binomial test was conducted to check for a significant temporal direction among conditions to infer cause and effect. Gene-disease association data were used to evaluate the associations among the conditions and assess their shared genetic background. The results were adjusted for multiple testing, and network graphs of significant associations were created. Findings from the different methodologies were compared with each other and with prior studies in the literature.
In the results, seemingly distant neural and dermal conditions had an extensive number of associations. Controlling for race and gender tightened these associations, especially for racial factors affecting dermal conditions, like melanoma, and gender differences in conditions like migraine. Temporal and gene associations helped explain some of the results, but not all. Network visualizations summarized results, highlighting central conditions and stronger associations.
Healthcare data are confounded by many factors that hide associations of interest. Triangulating associations with separate analyses helped with the interpretation of results. There are still numerous confounders in these data that bias associations. Beyond what is already known, our approach, even with limited variables, may inform hypothesis generation. Using additional variables with controlled-computational methods that require minimal external input may provide results that can guide healthcare, health policy, and further research.
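The temporal-direction check and the multiple-testing adjustment mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation. It assumes that, for each condition pair, patients are counted by which condition was recorded first and that an exact two-sided binomial test is run against a null proportion of 0.5. The abstract does not name the adjustment method, so the Benjamini-Hochberg procedure is used here as an assumption. The function names and the counts in `pair_counts` are hypothetical and exist only to exercise the code.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: sum of all outcomes no more likely than k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= threshold))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts (not from the dissertation): for each pair,
# (patients with A recorded first, patients with B recorded first).
pair_counts = {
    ("condition A", "condition B"): (70, 30),
    ("condition C", "condition D"): (52, 48),
    ("condition E", "condition F"): (15, 5),
}

raw = [binomial_two_sided_p(a, a + b) for a, b in pair_counts.values()]
adj = benjamini_hochberg(raw)

for (first, second), p_raw, p_adj in zip(pair_counts, raw, adj):
    print(f"{first} -> {second}: raw p = {p_raw:.4g}, adjusted p = {p_adj:.4g}")
```

A small adjusted p-value in this sketch only indicates that one ordering occurs more often than chance would suggest. As the abstract notes, confounders remain, so such a result is better read as a hypothesis than as evidence of cause and effect.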
Identifier | oai:union.ndltd.org:IUPUI/oai:scholarworks.iupui.edu:1805/28472
Date | 03 1900 |
Creators | Kirbiyik, Uzay |
Contributors | Dixon, Brian E., Nan, Hongmei, Grannis, Shaun J., Janga, Sarath Chandra, Zou, Jian |
Source Sets | Indiana University-Purdue University Indianapolis |
Language | en_US |
Detected Language | English |
Type | Dissertation |