1 |
Conformal survival predictions at a user-controlled time point: The introduction of time point specialized Conformal Random Survival Forests
van Miltenburg, Jelle (January 2018)
The goal of this research is to expand the field of conformal prediction using Random Survival Forests. The standard Conformal Random Survival Forest can predict, with a fixed level of confidence, whether something will survive up to a certain time point. This research is the first to show that the standard Conformal Random Survival Forest algorithm is of little practical use: the confidence guarantees of the conformal prediction framework are violated when the standard algorithm makes predictions for a user-controlled fixed time point. To solve this problem, the thesis proposes two algorithms that specialize in conformal predictions for a fixed point in time: a Fixed Time algorithm and a Hybrid algorithm. Both algorithms transform the survival data used by the split evaluation metric in the Random Survival Forest algorithm. The algorithms are evaluated and compared on six set-prediction evaluation criteria. The Hybrid algorithm outperforms the Fixed Time algorithm in most cases. Furthermore, the Hybrid algorithm is more stable than the Fixed Time algorithm when predictions are required at several time points. The Hybrid Conformal Random Survival Forest should therefore be considered by anyone who wants to make conformal survival predictions at user-controlled time points.
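As an illustration of the kind of set prediction discussed in this abstract, the following is a minimal sketch of a generic inductive conformal classifier for the binary question "does the case survive past the chosen time point?". It is not the thesis's Standard, Fixed Time, or Hybrid algorithm. The function name `conformal_set`, the calibration data, and the assumption that every calibration label is fully observed (no censoring) are hypothetical choices made for the example; the survival probabilities could come from any model, such as a Random Survival Forest.

```python
def conformal_set(cal_probs, cal_labels, test_prob, epsilon):
    """Return the set of labels (0 = event before t, 1 = survives past t)
    that cannot be rejected at significance level epsilon.

    cal_probs[i]  : predicted probability that calibration case i survives past t
    cal_labels[i] : 1 if case i survived past t, else 0 (assumed uncensored)
    """
    # Nonconformity score: one minus the probability assigned to the true label.
    scores = [1 - p if y == 1 else p for p, y in zip(cal_probs, cal_labels)]
    n = len(scores)
    prediction = set()
    for label in (0, 1):
        s = 1 - test_prob if label == 1 else test_prob
        # Conformal p-value: share of calibration scores at least as extreme.
        p_value = (sum(1 for c in scores if c >= s) + 1) / (n + 1)
        if p_value > epsilon:
            prediction.add(label)
    return prediction


# Hypothetical calibration set (illustrative numbers only).
cal_probs = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1, 0.6, 0.4, 0.95]
cal_labels = [1, 1, 1, 0, 0, 0, 1, 0, 1]

confident = conformal_set(cal_probs, cal_labels, test_prob=0.9, epsilon=0.2)
cautious = conformal_set(cal_probs, cal_labels, test_prob=0.5, epsilon=0.05)
print(confident, cautious)
```

A smaller `epsilon` asks for higher confidence and therefore tends to produce larger prediction sets, which is why the second call returns both labels.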
|
2 |
Statistical Modelling of Price Difference Durations Between Limit Order Books: Applications in Smart Order Routing / Statistisk modellering av varaktigheten av prisskillnader mellan orderböcker: Tillämpningar inom smart order routing
Backe, Hannes; Rydberg, David (January 2023)
The modern electronic financial market is composed of a large number of actors. With the surge in algorithmic trading, some of these actors collectively behave in increasingly complex ways. Historically, academic research on financial markets has focused on areas such as asset pricing, portfolio management and financial econometrics. However, the fragmentation of the financial market has given rise to a different set of problems, notably the order allocation problem, and to smart order routers (SORs) as a tool for addressing it. In this thesis we consider price discrepancies between order books trading the same instruments as a proxy for order routing opportunities. A survival analysis framework for these price differences is developed. Specifically, we consider the two widely used Kaplan-Meier and Cox Proportional Hazards (CPH) models, as well as the somewhat less known Random Survival Forest (RSF) model, in order to investigate whether such a framework is effective for predicting the survival times of price differences. The results show that the survival models significantly outperform random models and fixed routing decisions, suggesting that such models could beneficially be incorporated into existing SOR environments. Furthermore, including order book parameters as covariates in the CPH and RSF models improves performance further.
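For readers unfamiliar with the Kaplan-Meier estimator named in this abstract, the product-limit calculation can be sketched as follows. This is a minimal standard-library illustration, not code from the thesis; the function name `kaplan_meier` and the toy durations are hypothetical, and an observation with `observed = False` is treated as right-censored (for example, a price difference still open when recording stopped).

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct time where an event was observed."""
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # events and censorings leaving the risk set
    at_risk = len(durations)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]
    return curve


# Toy data: durations in arbitrary time units; False marks a censored case.
durations = [1, 2, 2, 3, 4]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"S({t}) = {s:.2f}")
```

Censored cases never reduce the survival estimate directly; they only shrink the risk set used at later event times.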
|
3 |
Quantitative image analysis for prognostic prediction in lung SBRT / 肺定位放射線治療における予後予測に向けた定量的画像解析
Kakino, Ryo (23 March 2021)
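Kyoto University / Doctorate by coursework (new system) / Doctor of Human Health Sciences / 甲第23121号 / 人健博第83号 / 新制||人健||6 (University Library) / Kyoto University Graduate School of Medicine, Department of Human Health Sciences / (Chief examiner) Prof. 椎名 毅, Prof. 藤井 康友, Prof. 平井 豊博 / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Human Health Sciences / Kyoto University / DFAM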
|
4 |
An integrated genomic approach for the identification and analysis of single nucleotide polymorphisms that affect cancer in humans
Repapi, Emmanouela (January 2013)
The identification of genetic variants such as single nucleotide polymorphisms (SNPs) that affect cancer progression, survival and response to treatment could help in the design of better prevention and treatment strategies. Genome-wide association studies (GWAS) have provided the first step by identifying SNPs associated with cancer risk. However, identifying the causal SNPs responsible for the associations has proven challenging, and GWAS have not been successful for time-to-event phenotypes such as cancer progression because of the large sample sizes needed. The aim of this thesis is to design and implement strategies that combine the identification of SNPs significantly associated with cancer, focusing on time-to-event phenotypes, with detailed bioinformatics analysis to allow for further experimental validation and modelling, in order to better understand cancer-associated genomic loci and accelerate their incorporation into the clinic. First, a methodology that utilises the Random Survival Forest is developed and combined with a bioinformatics analysis that ranks SNPs according to their potential to result in differential protein levels or activity, in order to identify SNPs that affect the progression of B-cell chronic lymphocytic leukaemia. Next, an analysis is applied that aims to extend our understanding of the role of SNPs in mediating cellular responses to chemotherapeutic agents. SNPs that could be associated with differential cellular growth responses in cancer cell line panels are identified, and their association with the differential survival of cancer patients is explored. Finally, the potential roles of SNPs in affecting the transcriptional regulation of key cancer genes, resulting in differential cancer risk, are assessed: first by focusing on SNPs in an important transcription factor binding motif that has been shown to be extremely sensitive to single base pair changes (the E-box), and next by exploring the possibility that polymorphic transcription factor binding sites could underlie the significant associations noted in cancer GWAS.
|
5 |
Survival Comparison of Open and Endovascular Repair Using Machine Learning / Överlevnadsjämförelse av öppen och endovaskulär kirurgi med maskininlärning
Brunnberg, Aston; Holte, Gustaf (January 2021)
Today there exist two types of preventive surgical treatment for abdominal aortic aneurysm. In order to make an informed choice of treatment, the clinician needs a clear picture of how the choice will affect the patient's chances of survival. In this master's thesis, machine learning techniques are used to predict survival probabilities after each treatment procedure, and their performance is compared to that of the more conventional Kaplan-Meier estimator. Using Danish patient data, different machine learning models for survival prediction were trained and evaluated. The Administrative Brier Score was used as the performance metric because the data was administratively censored. An Ensemble model consisting of one Random Survival Forest and one Neural Multi-Task Logistic Regression model achieved the best performance and significantly outperformed the conventional Kaplan-Meier model. Furthermore, an approach for investigating the predicted effects of the choice of treatment was introduced. It showed that, on average, the Ensemble model predicted the choice of treatment to have less effect on long-term survival than the corresponding prediction from the Kaplan-Meier estimator suggested. This applies to the full patient group as well as to patients aged between 70 and 79 years. In the latter case this prediction was also shown to be more accurate.
|
6 |
Investigating the Impact of Age-Biased Samples on Lifetime Prediction Models of Traffic Signs
Wickramarachchi, Anupa; Jayasinghe, Nuwan (January 2024)
The thesis investigates the impact of age-biased sampling on the accuracy of lifetime prediction models for traffic signs. The bias in question arises from the inspection paradox: longer intervals have a higher probability of being observed than shorter intervals, leading to a skewed representation in the data. The research employs a dual approach. First, it conducts an extensive analysis of real data on traffic sign longevity using a Weibull survival model, based on the data set compiled by Saleh et al. (2023). Second, it sets up a Monte Carlo simulation to systematically explore the effects of varying degrees and patterns of age bias on the sample. The simulation parameters are derived from the Weibull model parameters estimated on the real dataset, which ensures that the simulations closely replicate the actual parameters and estimates. Comparing the true shape, scale, intercept, and covariate coefficients against the simulated estimates indicates a significant bias in the dataset. The study also examines the impact of this bias on the predictive capabilities of several models: Weibull, Cox Proportional Hazards, Kaplan-Meier, and Random Survival Forest. This is done by comparing the true means and medians of the simulated data with the estimates from each model. The findings show that all models deviate substantially from the actual means and medians at varying bias levels in the simulated data. Prediction accuracy is measured using the Brier Score, which also deviates significantly from the prediction accuracy of the original Weibull model applied to the real dataset, especially when the bias levels vary across simulated datasets. Given these findings, the study advises against using the aforementioned methods for lifetime modelling of traffic signs when there is age bias due to the inspection paradox.
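The inspection paradox described in this abstract can be reproduced with a small Monte Carlo sketch. This is an illustration under stated assumptions, not the simulation design used in the thesis: the Weibull shape and scale values are arbitrary, and the age bias is modelled in the simplest possible way, by drawing each lifetime with probability proportional to its length.

```python
import random
from statistics import mean

random.seed(0)

SHAPE, SCALE = 2.0, 10.0  # hypothetical Weibull parameters, in years
N = 20000

# Unbiased population of lifetimes. Note the argument order:
# random.weibullvariate(alpha, beta) takes scale first, then shape.
lifetimes = [random.weibullvariate(SCALE, SHAPE) for _ in range(N)]

# Length-biased sample: a longer lifetime is more likely to be "caught"
# by an inspection, so it is drawn with weight proportional to its length.
biased = random.choices(lifetimes, weights=lifetimes, k=N)

true_mean = mean(lifetimes)
biased_mean = mean(biased)
print(f"mean lifetime, unbiased sample:      {true_mean:.2f}")
print(f"mean lifetime, length-biased sample: {biased_mean:.2f}")
```

Under this weighting the expected value of the biased sample is E[X^2]/E[X], which always exceeds E[X] for a non-degenerate lifetime distribution, so any model fitted naively to the biased sample will tend to overestimate lifetimes.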
|
7 |
Prognostics for Condition Based Maintenance of Electrical Control Units Using On-Board Sensors and Machine Learning
Fredriksson, Gabriel (January 2022)
This thesis studies how operational and workshop data can be used to improve the handling of field quality (FQ) issues for electronic units. This was done by analysing how failure rates can be predicted, how failure mechanisms can be detected and how data-based lifetime models could be developed. The work was done on an electronic control unit (ECU) subject to an FQ issue, for which thermomechanical stress on the solder joints of the BGAs (ball grid arrays) on the PCBAs (printed circuit board assemblies) was determined to be the main cause of failure. The project is divided into two parts. Part one, "PCBA", is a laboratory study of the effects of thermomechanical cycling on the solder joints of different electrical components of the PCBAs. Part two, "ECU", is the main part of the project and investigates data-driven solutions using operational and workshop history data. The results from part one show that the Weibull distribution, commonly used to predict the lifetimes of electrical components, describes the laboratory results well, but also that non-parametric methods such as kernel distributions can give good results. In part two, when the Weibull, Gamma and Normal distributions were tested on the real ECU data, none of them described the data well. However, when random forest is used to develop data-based models, most of the ECU lifetimes in a separate test dataset can be predicted correctly to within half a year. Furthermore, using random survival forest it was possible to produce a model with an out-of-bag (OOB) prediction error of just 0.06. This shows that machine learning methods could potentially be used for condition-based maintenance of ECUs.
|