Numerous high-impact applications involve predictive modeling of real-world data, ranging from hospital readmission prediction for enhanced patient care to event detection in power systems for grid stabilization. Developing performant machine learning models typically requires extensive high-quality training data, ample labeled samples, and training and testing datasets drawn from identical distributions. However, these requirements may be impractical to meet in applications where obtaining labeled data is expensive or challenging, where data quality is low, or where covariate or concept shifts are present. We focused on devising transfer learning methods that address these challenges in two distinct applications.
We examined a notably challenging transfer learning application: predicting hospital readmission risk from electronic health record (EHR) data to identify patients who may benefit from extra care. Readmission models based on EHR data can be compromised by quality variations due to manual data input methods. With traditional approaches, using high-quality EHR data from a different hospital system to enhance prediction at a target hospital may introduce bias if the source and target data distributions differ. To address this, we introduce an Early Readmission Risk Temporal Deep Adaptation Network, ERR-TDAN, for cross-domain knowledge transfer. A model developed using target data from an urban academic hospital was enhanced by transferring knowledge from high-quality source data. Given the success of our method in learning from data sourced from multiple hospital systems with different distributions, we further addressed the challenge and infeasibility of developing hospital-specific readmission risk prediction models using data from individual hospital systems. To this end, we extend the previous method and introduce an Early Readmission Risk Domain Generalization Network, ERR-DGN, which generalizes across multiple EHR data sources and adapts to previously unseen test domains.
In another challenging application, we addressed event detection in electrical grids, where dependencies are spatiotemporal and highly non-linear, using high-volume field-recorded data from multiple Phasor Measurement Units (PMUs). Existing historical event logs created manually do not correlate well with the corresponding PMU measurements because labels are scarce and temporally imprecise. Extending event logs to a more complete set of labeled events is very costly and often infeasible. We focused on a transfer learning method tailored for event detection from PMU data to reduce the need for additional manual labeling. To demonstrate feasibility, we tested our approach on large datasets collected from the Western and Eastern Interconnections of the U.S.A., reusing a small number of carefully selected labeled PMU samples from one power system to detect events in another.
Experimental findings suggest that the proposed knowledge transfer methods for healthcare and power system applications have the potential to effectively address the identified challenges and limitations. Evaluation of the proposed readmission models shows that readmission risk predictions can be enhanced by leveraging higher-quality EHR data from a different site, and by training on data from multiple sites and subsequently applying the model to a novel hospital site. Moreover, label scarcity in power systems can be addressed by a transfer learning method used in conjunction with a semi-supervised algorithm capable of detecting events from minimal labeled instances. / Computer and Information Science
Identifier | oai:union.ndltd.org:TEMPLE/oai:scholarshare.temple.edu:20.500.12613/10285 |
Date | 12 1900 |
Creators | Abdel Hai, Ameen, 0000-0001-5173-5291 |
Contributors | Obradovic, Zoran, Dragut, Eduard Constantin, Gao, Hongchang, Rubin, Daniel J. |
Publisher | Temple University. Libraries |
Source Sets | Temple University |
Language | English |
Detected Language | English |
Type | Thesis/Dissertation, Text |
Format | 146 pages |
Rights | IN COPYRIGHT- This Rights Statement can be used for an Item that is in copyright. Using this statement implies that the organization making this Item available has determined that the Item is in copyright and either is the rights-holder, has obtained permission from the rights-holder(s) to make their Work(s) available, or makes the Item available under an exception or limitation to copyright (including Fair Use) that entitles it to make the Item available., http://rightsstatements.org/vocab/InC/1.0/ |
Relation | http://dx.doi.org/10.34944/dspace/10247, Theses and Dissertations |