In a landmark report, the Secretary of Health and Human Services once characterized the legacy of disparities in healthcare and health outcomes in the United States as “an affront to our ideals and the ongoing genius of American medicine.” Since then, a vast amount of scientific evidence regarding health equity has been generated and important interventions developed. Yet despite substantial concerted efforts, health inequity remains a persistent and pervasive public health concern. A significant challenge is the lack of scalable resources to organize, synthesize, and integrate knowledge gleaned from available scientific evidence and real-world observational data in a comprehensive, systematic fashion. Recognizing this, the National Institute on Minority Health and Health Disparities recently embarked on a science visioning process and enumerated a set of strategic goals to foster a new generation of research capable of making major strides. Among the strategies proposed, the institute promotes development of new methods and measurements that enable 1) evaluation of health equity research and ensuring its relevance to a diversity of populations, 2) better leveraging and fostering linkage between existing and emerging data sources to enhance analytic capacity, and 3) analysis of health determinants contributing to health disparities among subpopulations and small groups.
In alignment with this vision, the studies presented in this thesis seek to advance health equity science by developing, applying, and evaluating informatics-based approaches to support evidence synthesis, monitoring, and precision. In particular, systematic, scalable, and sustainable (i.e., reproducible) approaches are emphasized. This dissertation employs robust methods for big data collection, integration, and analysis while drawing from a rich set of existing and emerging data sources including a large corpus of biomedical literature, electronic health records from the largest public health information exchange in the United States, open datasets provided by public agencies, proprietary national insurance claims datasets, and public health reporting data.
The dissertation first aims to facilitate large-scale evidence synthesis to identify major areas of focus in health equity research and elucidate potentially less well studied populations, conditions, and topics. In order to accomplish these tasks, it draws from the informatics toolbox by leveraging machine learning, natural language processing, and symbolic reasoning to assess a vast portfolio of research and compare it to real-world data on condition prevalence. In doing so, it spotlights potential paths for additional scientific inquiry and attention in public health practice. Ultimately, we find that there are indeed potential disparities in disparities research and elucidate less well studied areas of interest.
Building on these insights, this thesis then leverages underutilized real-world data sources (e.g., health information exchanges) and a common data model to buttress the highly fragmented and outdated public health data infrastructure currently used to monitor conditions of interest and elucidate potential disparities among populations. By intention, the illustrative example presented operates at a scale commensurate with current regional public health reporting, increases the number and types of data elements available for analysis, and improves turnaround time for surveillance report generation. We found a substantial lack of alignment between testing practices and population- and neighborhood-level trends in cases of sexually transmitted infections, signaling potential disparities and inefficient resource allocation. Thus, our work meets several hallmarks of infectious disease surveillance in the era of big data: volume, variety, velocity, and value, respectively. Importantly, these findings would have been undetectable, or less likely to be detected, under existing public health reporting practices.
Finally, as existing health equity literature and public health surveillance practices do not often incorporate intersectionality as an integral lens for drawing comparisons, in large part due to the technical intractability of examining all possible demographic intersections, we demonstrate a novel subgroup discovery approach tailored to elucidating intersectional disparities in health outcomes. In doing so, this work aims to better inform efforts to recognize and engage subgroups who might benefit from greater attention and more closely tailored interventions. To our knowledge, this new approach is the first to leverage supervised clustering to operationalize intersectionality for health disparities research.
To ground this work, clinical examples focus especially on demonstrating application of each new approach to generalizable use cases involving HIV and other sexually transmitted infections, highly stigmatized conditions for which there is a long history of inequity.
Identifier | oai:union.ndltd.org:columbia.edu/oai:academiccommons.columbia.edu:10.7916/jrcc-8q66 |
Date | January 2025 |
Creators | Reyes Nieva, Harry |
Source Sets | Columbia University |
Language | English |
Detected Language | English |
Type | Theses |