Return to search

Toward a novel predictive analysis framework for new-generation clinical decision support systems

The idea of developing automated tools able to deal with the complexity of clinical information processing dates back to the late 60s: since then, there has been scope for improving medical care due to the rapid growth of medical knowledge, and the need to explore new ways of delivering this due to the shortage of physicians. Clinical decision support systems (CDSS) are able to aid in the acquisition of patient data and to suggest appropriate decisions on the basis of the data thus acquired. Many improvements are envisaged due to the adoption of such systems including: reduction of costs by faster diagnosis, reduction of unnecessary examinations, reduction of risk of adverse events and medication errors, increase in the available time for direct patient care, improved medications and examination prescriptions, improved patient satisfaction, and better compliance to gold-standard up-to-date clinical pathways and guidelines. Logistic regression is a widely used algorithm which frequently appears in medical literature for building clinical decision support systems: however, published studies frequently have not followed commonly recommended procedures for using logistic regression and substantial shortcomings in the reporting of logistic regression results have been noted. Published literature has often accepted conclusions from studies which have not addressed the appropriateness and accuracy of the statistical analyses and other methodological issues, leading to design flaws in those models and to possible inconsistencies in the novel clinical knowledge based on such results. The main objective of this interdisciplinary work is to design a sound framework for the development of clinical decision support systems. We propose a framework that supports the proper development of such systems, and in particular the underlying predictive models, identifying best practices for each stage of the model’s development. This framework is composed of a number of subsequent stages: 1) dataset preparation insures that appropriate variables are presented to the model in a consistent format, 2) the model construction stage builds the actual regression (or logistic regression) model determining its coefficients and selecting statistically significant variables; this phase is generally preceded by a pre-modelling stage during which model functional forms are hypothesized based on a priori knowledge 3) the further model validation stage investigates whether the model could suffer from overfitting, i.e., the model has a good accuracy on training data but significantly lower accuracy on unseen data, 4) the evaluation stage gives a measure of the predictive power of the model (making use of the ROC curve, which allows to evaluate the predictive power of the model without any assumptions on error costs, and possibly R2 from regressions), 5) misclassification analysis could suggest useful insights into determining where the model could be unreliable, 6) implementation stage. The proposed framework has been applied to three applications on different domains, with a view to improve previous research studies. The first developed model predicts mortality within 28 days of patients suffering from acute alcoholic hepatitis. The aim of this application is to build a new predictive model that can be used in clinical practice to identify patients at greatest risk of mortality in 28 days as they may benefit from aggressive intervention, and to monitor their progress while in hospital. A comparison generated by state of the art tools shows an improved predictive power, demonstrating how an appropriate variables inclusion may result in an overall better accuracy of the model, which increased by 25% following an appropriate variables selection process. The second proposed predictive model is designed to aid the diagnosis of dementia, as clinicians often experience difficulties in the diagnosis of dementia due to the intrinsic complexity of the process and lack of comprehensive diagnostic tools. The aim of this application is to improve on the performance of a recent application of Bayesian belief networks using an alternative approach based on logistic regression. The approach based on statistical variables selection outperformed the model which used variables selected by domain experts in previous studies. Obtained results outperform considered benchmarks by 15%. The third built model predicts the probability of experiencing a certain symptom among common side-effects in patients receiving chemotherapy. The newly developed model includes a pre-modelling stage (which was based on previous research studies) and a subsequent regression. The computed accuracy of results (computed on a daily basis for each cycle of therapy) shows that the newly proposed approach has increased its predictive power by 19% when compared to the previously developed model: this has been obtained by an appropriate usage of available a priori knowledge to pre-model the functional forms. As shown by the proposed applications, different aspects of CDSS development are subject to substantial improvements: the application of the proposed framework to different domains leads to more accurate models than the existing state-of-the-art proposals. The developed framework is capable of helping researchers to identify and overcome possible pitfalls in their ongoing research works, by providing them with best practices for each step of the development process. An impact on the development of future clinical decision support systems is envisaged: the usage of an appropriate procedure in model development will produce more reliable and accurate systems, and will have a positive impact on the newly produced medical knowledge which may eventually be included in standard clinical practice.

Identiferoai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:644863
Date January 2014
CreatorsMazzocco, Thomas
ContributorsHussain, Amir; Cairns, David
PublisherUniversity of Stirling
Source SetsEthos UK
Detected LanguageEnglish
TypeElectronic Thesis or Dissertation
Sourcehttp://hdl.handle.net/1893/21684

Page generated in 0.0021 seconds