The impact of bias on the predictive value of EHR driven machine learning models.

The rapid digitization of the health care sector leads to an increase in data. This routinely collected data, in the form of electronic health records (EHR), is used not only by medical professionals but also for a secondary purpose: health care research. It can be opportune to use this EHR data for predictive modeling in order to support medical professionals in their decisions. However, using routinely collected data (RCD) often comes with subtle biases that might put efficient learning of predictive models at risk. In this thesis the effects of RCD on prediction performance are reviewed. In particular, we thoroughly investigate and reason about whether the performance of particular prediction models is consistent over a range of hand-crafted sub-populations within the data. Evidence is presented that the overall prediction score of the algorithms trained on EHR data differs significantly for some groups of patients in the data. A method is presented to give more insight into why these groups of patients have different scores.

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:hh-39960
Date: January 2019
Creators: Boonen, Dries
Publisher: Högskolan i Halmstad
Source Sets: DiVA Archive at Upsalla University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
