Social Determinants of Health (SDOH) play a crucial role in healthcare outcomes, yet identifying them from unstructured patient data remains a challenge. This research explores the potential of Large Language Models (LLMs) for automated SDOH identification from patient notes. We propose a simple, general framework for SDOH screening. To address the research gap of limited datasets, we adapt and combine existing SDOH datasets into a more comprehensive benchmark for this task. Using the benchmark and the proposed framework, we then conduct several preliminary experiments that explore and compare promising LLM system implementations. Our findings highlight the potential of LLMs for automated SDOH screening while emphasizing the need for more robust datasets and evaluation frameworks. / Master of Science / Social Determinants of Health (SDOH) have been shown to significantly impact health outcomes and are seen as a major contributor to global health inequities. However, their use within the healthcare industry is still significantly underemphasized, largely due to the difficulty of manually identifying SDOH factors. While previous works have explored automated approaches for SDOH identification, they lack standardization, data transparency, and robustness, and are largely outdated compared to the latest Artificial Intelligence (AI) approaches. Therefore, in this work we propose a holistic framework for automated SDOH identification. We also present a higher-quality SDOH benchmark, built by merging existing publicly available datasets, standardizing them, and cleaning them of errors. With this benchmark, we then conducted experiments to compare the performance of different state-of-the-art AI approaches.
Through this work, we contribute a better way to think about automated SDOH screening systems, the first publicly accessible multi-clinic and multi-annotator benchmark, as well as greater insights into the latest AI approaches for state-of-the-art results.
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/121097 |
Date | 09 September 2024 |
Creators | King III, Kenneth Hale |
Contributors | Computer Science & Applications, Gracanin, Denis, Luther, Kurt, Azab, Mohamed Mahmoud Mahmoud |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Language | English |
Detected Language | English |
Type | Thesis |
Format | ETD, application/pdf |
Rights | Creative Commons Attribution 4.0 International, http://creativecommons.org/licenses/by/4.0/ |