Combining scientific computing and machine learning techniques to model longitudinal outcomes in clinical trials.

Scientific machine learning (SciML) is a new branch of AI research at the intersection of scientific computing (Sci) and machine learning (ML). It combines data-driven algorithms with scientific computing to discover the dynamics of a time-evolving process. The output of such algorithms is a governing equation or set of equations (e.g., ordinary differential equations, ODEs), which can then be solved for any time point to obtain a rigorous prediction. In this thesis, we present a methodology for applying the SciML approach in the context of clinical trials to predict IPF disease progression in the form of a governing equation. Our proposed methodology also quantifies the uncertainty associated with the model by fitting 95% highest density intervals (HDIs) for the ODE parameters and 95% posterior prediction intervals for posterior predicted samples. We also investigated the possibility of predicting later outcomes from observations collected in the early phase of the study. We successfully combined ML techniques, statistical methodologies and scientific computing tools such as bootstrap sampling, cubic spline interpolation, Bayesian inference and sparse identification of nonlinear dynamics (SINDy) to discover the dynamics behind the efficacy outcome and to quantify the uncertainty of the parameters of the governing equation in the form of 95% HDIs. We compared the resulting model with the existing disease progression model described by the Weibull function. Based on the mean squared error (MSE) between our ODE-approximated values and the population means of the respective datasets, the lowest MSEs we achieved were 0.133, 0.089, 0.213 and 0.057.
Comparing these values with the MSEs obtained using the Weibull function, our ODE model reduced the error relative to the Weibull baseline by 7.5% for the third dataset and by 8.1% for the pooled dataset, whereas for the first and second datasets the Weibull model performed better, by 1.5% and 1.2%, respectively. Overall, in terms of MSE, our proposed model approximates the population means better in all cases except the first and second datasets, where the margin in favour of the Weibull model is small. In terms of interpretation, our dynamical system model also contains mechanistic elements that can explain the decay or acceleration rate of the efficacy endpoint, which the Weibull model lacks. Our approach was, however, limited in its ability to accurately predict final outcomes using models derived from 24, 36 or 48 weeks of observations; the Weibull model, by contrast, has no predictive capability. The extrapolated trend based on 60 weeks of data was nevertheless found to be close to both the population mean and the ODE model built on 72 weeks of data. Finally, we highlight potential questions for future work.
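
As a rough illustration of how cubic spline interpolation and SINDy can be combined to recover a governing equation, here is a minimal Python sketch. It is not the thesis's implementation: the synthetic decay curve, the polynomial candidate library `[1, y, y^2]`, the threshold value and the helper names `stlsq` and `fit_sindy` are all assumptions chosen for illustration, and the bootstrap and Bayesian uncertainty steps are omitted.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def stlsq(theta, dydt, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: sparse xi with theta @ xi ~ dydt."""
    xi = np.linalg.lstsq(theta, dydt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0                      # prune weak library terms
        big = ~small
        if not big.any():
            break
        # Refit using only the terms that survived the threshold.
        xi[big] = np.linalg.lstsq(theta[:, big], dydt, rcond=None)[0]
    return xi


def fit_sindy(t_obs, y_obs, n_grid=200, threshold=0.05):
    """Fit dy/dt = xi0 + xi1*y + xi2*y^2 (an assumed library) to sparse observations."""
    spline = CubicSpline(t_obs, y_obs)       # smooth interpolant of the observations
    t = np.linspace(t_obs[0], t_obs[-1], n_grid)
    y = spline(t)
    dydt = spline(t, 1)                      # first derivative of the spline
    theta = np.column_stack([np.ones_like(y), y, y**2])
    return stlsq(theta, dydt, threshold)


# Synthetic, noise-free stand-in for a mean outcome curve: dy/dt = -0.5 * y, y(0) = 2.
t_obs = np.linspace(0.0, 6.0, 25)
y_obs = 2.0 * np.exp(-0.5 * t_obs)
xi = fit_sindy(t_obs, y_obs)
print("coefficients for [1, y, y^2]:", xi)
```

On this noise-free curve the fit should keep only the linear term, with a coefficient close to the true decay rate. With real trial data, the threshold and the candidate library would need to be tuned, and resampling (for example, bootstrap) would be needed before any interval statements could be made about the coefficients.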

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:liu-176427
Date: January 2021
Creators: Subramanian, Harshavardhan
Publisher: Linköpings universitet, Institutionen för datavetenskap
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
