Global ETD Search

171	Estimating Loss-Given-Default through Survival Analysis : A quantitative study of Nordea's default portfolio consisting of corporate customers Hallström, Richard January 2016 In Sweden, all banks must report their regulatory capital in their reports to the market, and their models for calculating this capital must be approved by the financial supervisory authority, Finansinspektionen. Regulatory capital is the capital that a bank has to hold as security against credit risk; it should serve as a buffer if the bank loses unexpected amounts of money in its lending business. Loss-Given-Default (LGD) is one of the main drivers of regulatory capital, and the minimum required capital is highly sensitive to the reported LGD. Workout LGD is based on the discounted future cash flows obtained from defaulted customers. The main issue with workout LGD is incomplete workouts, which leave a bank with two unattractive options when calculating it: either wait for the workout period to end, which in some cases takes several years, or exclude the incomplete workouts or make rough assumptions about them. In this study, ideas from survival analysis (SA) have been used to address these problems. The most widely used SA model, the Cox proportional hazards model (Cox model), has been applied to investigate the effect of covariates on the length of survival of a monetary unit. The covariates considered are Country of booking, Secured/Unsecured, Collateral code, Loan-To-Value, Industry code, Exposure-At-Default and Multi-collateral. The data sample was first split into an 80 % training sample and a 20 % test sample. The Cox model was fitted on the training sample and then validated on the test sample by interpreting the Kaplan-Meier survival curves for risk groups created from the prognostic index (PI).
The results show that the model correctly ranks the expected LGD for new customers but is not always able to distinguish between risk groups. With the results presented in the study, Nordea can obtain an expected LGD for newly defaulted customers, given the customers' information on the covariates considered in this study. Nordea can also get a clear picture of which factors drive a low or a high LGD. Survival analysis Cox proportional hazards model Loss-Given-Default workout LGD Recovery data
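The validation step described in entry 171 relies on Kaplan-Meier survival curves. As a rough illustration only (this is not code from the thesis, and the toy durations and censoring flags below are hypothetical), a minimal product-limit estimator can be sketched in plain Python:

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed event.

    times  -- observed durations (event time or censoring time)
    events -- 1 if the event was observed, 0 if the observation was censored
    """
    observed = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)          # everyone leaves the risk set at their time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Product-limit step: multiply by the conditional survival at t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy sample: six observations, two of them censored.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 4))
```

In a setting like the one the abstract describes, one such curve would be computed per risk group and the curves compared visually; how the groups are cut from the prognostic index is not specified in the abstract and is not modelled here.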
172	EMPIRICAL LIKELIHOOD AND DIFFERENTIABLE FUNCTIONALS Shen, Zhiyuan 01 January 2016 Empirical likelihood (EL) is a recently developed nonparametric method of statistical inference. It has been shown by Owen (1988, 1990) and many others that the empirical likelihood ratio (ELR) method can be used to produce nice confidence intervals or regions. Owen (1988) shows that -2 log ELR converges to a chi-square distribution with one degree of freedom subject to a linear statistical functional constraint in terms of distribution functions. However, generalizing Owen's result to the right-censored data setting is difficult, since no explicit maximization can be obtained under a constraint in terms of distribution functions. Pan and Zhou (2002) instead study EL with right-censored data using a linear statistical functional constraint in terms of cumulative hazard functions. In this dissertation, we extend the results of Owen (1988) and Pan and Zhou (2002) to non-linear but Hadamard differentiable statistical functional constraints. For this purpose, we study functionals that are differentiable with respect to hazard functions. We also generalize our results to two-sample problems. Stochastic process and martingale theory is applied to prove the theorems. The confidence intervals based on the EL method are compared with those from other available methods. Real data analysis and simulations are used to illustrate the proposed theorem, with an application to Gini's absolute mean difference. Empirical Likelihood Statistical Functional Hadamard Differentiable Applied Statistics Statistical Methodology Statistical Theory Survival Analysis
173	Estimating Companies’ Survival in Financial Crisis : Using the Cox Proportional Hazards Model Andersson, Niklas January 2014 This master's thesis aims to answer the question "What is the contribution of a company's sector to its survival of a financial crisis?", with the sub-question "Can we use survival analysis on financial data to answer this?". Survival analysis, which is seldom used on financial data, is therefore used to answer the main question. This is interesting because the thesis studies how well survival analysis can be applied to financial data while also evaluating whether all companies experience a financial crisis in the same way. The dataset consists of all companies traded on the Swedish stock market during 2008. The results show that the survival method is very suitable for the data used. The sector a company operated in has a significant effect. However, the power is too low to give any indication of specific differences between the sectors. Furthermore, the group of smallest companies is found to have had much better survival than larger companies. Survival Analysis Survival Data Time to Event 2008 Financial Crisis Swedish Stock Market
174	Développement d’un indice de séparabilité adapté aux données de génomique en analyse de survie / Development of a separability index for genomic data in survival analysis Rouam, Sigrid Laure 30 March 2011 In oncogenomics research, one of the main objectives is to identify new genomic markers in order to construct predictive rules that classify patients according to time-to-event outcomes (death or tumor relapse). Most studies dealing with such high-throughput data rely on a selection process to identify, among the candidates, the markers that have a prognostic impact. A common problem for biologists is the choice of the selection rule. In survival analysis, classical procedures rank genetic markers according to either the estimated hazard ratio or quantities derived from a test statistic (p-value, q-value). However, these methods are not suitable for gene selection across multiple genomic datasets with different sample sizes. An index that takes into account the magnitude of the prognostic impact of factors without being highly dependent on the sample size makes it possible to address this issue. In this work, we propose a novel index of predictive ability for selecting genomic markers that have a potential impact on time-to-event outcomes. This index extends the notion of "pseudo-R2" to the framework of survival analysis. It has an original and straightforward interpretation in terms of "separability".
The index is first derived in the framework of the Cox model and then extended to more complex non-proportional hazards models. Simulations show that the index is not substantially affected by the sample size of the study or by censoring. They also show that its separability performance is higher than that of indices from the literature. The usefulness of the index is illustrated in two examples. The first aims to identify genomic markers with common effects across different cancer types. The second shows, in the framework of a lung cancer study, the usefulness of the index for selecting genomic factors with crossing hazard functions, which could be explained by "modulating" effects between markers. The proposed index is a promising tool that can help researchers select a list of features of interest for further biological investigation. Survival Analysis Genomics Oncology Pseudo-R2
175	Statistical Models and Analysis of Growth Processes in Biological Tissue Xia, Jun 15 December 2016 The mechanisms that control growth processes in biological tissues have attracted continuous research interest despite their complexity. With the emergence of big-data experimental approaches, there is an urgent need to develop statistical and computational models that fit the experimental data and can be used to make predictions to guide future research. In this work we apply statistical methods to the growth processes of different biological tissues, focusing on the development of neuron dendrites and tumor cells. We first examine the neuron cell growth process, which has implications for neural tissue regeneration, using a computational model with uniform branching probability and a maximum overall length constraint. One crucial outcome is that we can relate the parameter fits from our model to real data from our experimental collaborators, in order to examine the usefulness of our model under different biological conditions. Our methods can now directly compare the branching probabilities of different experimental conditions and provide confidence intervals for these population-level measures. In addition, we have obtained analytical results showing that the underlying probability distribution for this process increases as a geometric progression at nearby distances and decreases approximately as a geometric series in far-away regions, which can be used to estimate the spatial location of the maximum of the probability distribution. This result is important, since we would expect the maximum number of dendrites in this region; the estimate is related to the probability of success in finding a neural target at that distance during a blind search. We then examined tumor growth processes, which evolve similarly in the sense that an initial rapid growth eventually becomes limited by a resource constraint.
For the evolution of tumor cells, we found that an exponential growth model best describes the experimental data, based on the accuracy and robustness of the models. Furthermore, we incorporated this growth rate model into logistic regression models that predict the growth rate of each patient from biomarkers; this formulation can be very useful for clinical trials. Overall, this study aimed to assess the molecular and clinicopathological determinants of breast cancer (BC) growth rate in vivo. Neuronal Tree Stochastic Models Maximum Length Constraint Tumor Size Growth Rate Survival Analysis Model Selection
176	Factors Affecting Discrete-Time Survival Analysis Parameter Estimation and Model Fit Statistics Denson, Kathleen 05 1900 Discrete-time survival analysis as an educational research technique has focused on analysing and interpreting parameter estimates. The purpose of this study was to examine the effects of certain data characteristics on the hazard estimates and goodness-of-fit statistics. Fifty-four simulated data sets were crossed with four conditions in a 2 (time period) by 3 (distribution of Y = 1) by 3 (distribution of Y = 0) by 3 (sample size) design. Discrete-time survival analysis Model fit Discrete-time systems. Parameter estimation. Goodness-of-fit tests.
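For readers unfamiliar with the hazard estimates that entry 176 refers to, the sketch below shows the standard life-table estimate of the discrete-time hazard: events in a period divided by the number still at risk at the start of that period. It is an illustration under assumed toy data, not the simulation design of the study, and the function name is hypothetical.

```python
def discrete_hazard(periods, events, n_periods):
    """Life-table hazard per period: events in period j / number at risk in j.

    periods -- last period in which each subject was observed (1-based)
    events  -- 1 if the event occurred in that period, 0 if censored there
    """
    hazards = []
    for j in range(1, n_periods + 1):
        at_risk = sum(1 for p in periods if p >= j)
        failed = sum(1 for p, e in zip(periods, events) if p == j and e)
        hazards.append(failed / at_risk if at_risk else float("nan"))
    return hazards


# Hypothetical toy sample: six subjects followed for up to three periods.
periods = [1, 2, 2, 3, 3, 3]
events = [1, 1, 0, 1, 0, 0]
hazards = discrete_hazard(periods, events, 3)
print([round(h, 4) for h in hazards])
```

The same quantities are what a logistic regression on person-period data estimates when it includes one indicator per period and no other covariates.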
177	Choosing the Cut Point for a Restricted Mean in Survival Analysis, a Data Driven Method Sheldon, Emily H 25 April 2013 Survival analysis generally uses the median survival time as a common summary statistic. While the median possesses the desirable characteristic of being unbiased, there are times when it is not the best statistic to describe the data at hand. Royston and Parmar (2011) argue that the restricted mean survival time should be the summary statistic used when the proportional hazards assumption is in doubt. Work on restricted means dates back to 1949, when J.O. Irwin developed a calculation for the standard error of the restricted mean using Greenwood’s formula. Since then the restricted mean has been developed thoroughly in the literature, but its use in practical analyses is still limited. One area that is not well developed in the literature is the choice of the time point to which the mean is restricted. The aim of this dissertation is to develop a data-driven method that allows the user to find a cut point at which to restrict the mean. Three methods are developed. The first is a simple method that locates the time at which the distance between two curves is greatest. The second is adapted from a Renyi-type test, typically used when proportional hazards assumptions are not met: the Renyi statistics are plotted and a piecewise regression model is fit. The join point of the two pieces is where the mean will be restricted. The third applies a nonlinear model fit to the hazard estimates at each event time; the model allows the hazards of the two groups to differ up to a certain time, after which the groups' hazards are the same. The time point at which the two groups’ hazards become the same is the time to which the mean is restricted. The methods are evaluated using MSE and bias calculations, with bootstrap techniques to estimate the variance.
Survival Analysis Mean survival time Restricted mean Proportional hazards Biostatistics Physical Sciences and Mathematics Statistics and Probability
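The restricted mean discussed in entry 177 is the area under the survival curve up to a chosen cut point. As a minimal sketch, assuming the Kaplan-Meier estimator is used for the curve and using hypothetical toy data and a hypothetical cut point `tau`, the quantity can be computed as follows; the data-driven choice of `tau` that the dissertation develops is not reproduced here.

```python
def restricted_mean(times, events, tau):
    """Area under the Kaplan-Meier step function on [0, tau]."""
    pairs = sorted(zip(times, events))
    at_risk = len(pairs)
    surv, last_t, area = 1.0, 0.0, 0.0
    i = 0
    while i < len(pairs) and pairs[i][0] <= tau:
        t = pairs[i][0]
        d = n = 0
        # Gather all observations tied at time t.
        while i < len(pairs) and pairs[i][0] == t:
            d += pairs[i][1]
            n += 1
            i += 1
        area += surv * (t - last_t)   # rectangle under the current step
        last_t = t
        if d:
            surv *= 1.0 - d / at_risk
        at_risk -= n
    area += surv * (tau - last_t)     # final piece up to the cut point
    return area


# Hypothetical toy sample and cut point.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
rmst = restricted_mean(times, events, tau=4)
print(round(rmst, 4))
```

With no censoring and a cut point at or beyond the largest observed time, this reduces to the ordinary sample mean, which is a convenient sanity check.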
178	The Impact of Service-Learning among Other Predictors for Persistence and Degree Completion of Undergraduate Students Lockeman, Kelly 01 January 2012 College completion is an issue of great concern in the United States, where only 50% of students who start college as freshmen complete a bachelor's degree at that institution within six years. Researchers have studied a variety of factors to understand their relationship to student persistence. Not surprisingly, student characteristics, particularly academic background prior to entering college, have a tremendous influence on college success. Colleges and universities have little control over student characteristics unless they screen out less-qualified students during the admissions process, but selectivity is contrary to the push for increased accessibility for under-served groups. As a result, institutions need to better understand the factors that they can control. High-impact educational practices have been shown to improve retention and persistence through increased student engagement. Service-learning, a pedagogical approach that blends meaningful community service and reflection with course content, is a practice that is increasing in popularity, and it has proven beneficial in increasing student learning and engagement. The purpose of this study was to investigate whether participation in service-learning has any influence on the likelihood of degree completion or time to degree and, secondarily, to compare different methods of analysis to determine whether the use of more complex models provides better information or more accurate prediction. The population for this study was a large public urban research institution in the mid-Atlantic region, and the sample was the cohort of students who started as first-time, full-time, bachelor's degree-seeking undergraduates in the fall of 2005.
Data included demographic and academic characteristics upon matriculation, as well as financial need and aid, academic major, and progress indicators for each of the first six years of enrollment. Cumulative data were analyzed using logistic regression, and year-to-year data were analyzed using discrete-time survival analysis in a structural equation modeling (SEM) framework. Parameter estimates and odds ratios for the predictors in each model were compared. Some similarities were found in the variables that predict degree completion, but there were also some striking differences. The strongest predictors for degree completion were pre-college academic characteristics and strength of academic progress while in college (credits earned and GPA). When analyzed using logistic regression and cross-sectional data, service-learning participation was not a significant predictor for completion, but it did have an effect on completion time for those students who earned a degree within six years. When analyzed longitudinally using discrete-time survival analysis, however, service-learning participation is strongly predictive of degree completion, particularly when credits are earned in the third, fourth, and sixth years of enrollment. In the survival analysis model, service-learning credits earned were also more significant for predicting degree completion than other credits earned. In terms of data analysis, logistic regression was effective at predicting completion, but survival analysis seems to provide a more robust method for studying specific variables that may vary by time. degree completion college students service-learning survival analysis logistic regression Education
179	Modely pro přežití s možností vyléčení / Cure-rate models Drabinová, Adéla January 2016 In this work we deal with survival models in which, with positive probability, some patients never relapse because they are cured. We focus on the two-component mixture model and a model with biological motivation. For each model, we derive maximum likelihood estimates of the probability of cure and of the survival function of the time to relapse for uncured patients. We further allow both the probability of cure and the survival time to depend on regressors. The models are then compared in a simulation study.
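For orientation, the two-component mixture model mentioned in entry 179 is usually written as below. The notation is an assumption made here for illustration: π denotes the probability of cure, S_u the survival function of uncured patients, and the logistic link for π(x) is one common choice rather than something stated in the abstract. The second display assumes that the "model with biological motivation" is the promotion-time (bounded cumulative hazard) model, which the abstract does not confirm.

```latex
% Two-component mixture cure model
S_{\mathrm{pop}}(t) = \pi + (1 - \pi)\, S_u(t),
\qquad \lim_{t \to \infty} S_{\mathrm{pop}}(t) = \pi .

% Cure probability depending on regressors x (assumed logistic link)
\pi(x) = \frac{\exp(x^{\top}\gamma)}{1 + \exp(x^{\top}\gamma)} .

% Promotion-time model (assumed reading of "biological motivation"),
% with F a proper distribution function and \theta > 0
S_{\mathrm{pop}}(t) = \exp\{-\theta F(t)\},
\qquad \lim_{t \to \infty} S_{\mathrm{pop}}(t) = e^{-\theta} .
```

In both formulations the population survival curve levels off at the cure probability instead of falling to zero.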
180	A study of the robustness of Cox's proportional hazards model used in testing for covariate effects Fei, Mingwei January 1900 Master of Arts / Department of Statistics / Paul Nelson / There are two important statistical models for multivariate survival analysis: the proportional hazards (PH) model and the accelerated failure time (AFT) model. PH analysis is the most commonly used multivariate approach for analysing survival time data. For example, in clinical investigations where several known quantities or covariates potentially affect patient prognosis, it is often desirable to investigate the effect of one factor while adjusting for the impact of the others. This report offers guidance on choosing an appropriate model for testing covariate effects in different situations. In real life, we are very likely to have limited sample sizes and censoring (people dropping out), which cause difficulty in statistical analysis. In this report, data sets are randomly generated 1000 times from each of three distributions (Weibull, lognormal and log-logistic) for each combination of sample size and censoring rate. Both models are then evaluated by testing hypotheses about the covariate effect on the simulated data, using the derived statistics power, Type I error rate and convergence rate for each situation. We would recommend the PH method when the sample size is small (n<20) and the censoring rate is high (p>0.8). In this case, neither PH nor AFT analysis may be suitable for hypothesis testing, but PH analysis is more robust and consistent than AFT analysis. When the sample size is 20 or above and the censoring rate is 0.8 or below, AFT analysis has a slightly higher convergence rate and power than PH, but offers little improvement in Type I error rate when the sample size is large (n>50) and the censoring rate is low (p<0.3).
Considering that PH analysis has the advantage of not requiring knowledge of the underlying distribution, we conclude that PH analysis is robust in hypothesis testing for covariate effects using data generated from an AFT model. Survival analysis Proportional hazards (PH) model Accelerated failure time (AFT) model Covariate effect test Statistics (0463)
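The two model families compared in entry 180 can be stated compactly as follows. The symbols (baseline hazard h_0, coefficients β and γ, scale σ, error term ε) are standard textbook notation assumed here, not taken from the report.

```latex
% Proportional hazards (PH): covariates scale the hazard
h(t \mid x) = h_0(t)\, \exp(x^{\top}\beta)

% Accelerated failure time (AFT): covariates scale time itself
\log T = \mu + x^{\top}\gamma + \sigma\,\varepsilon
\quad\Longleftrightarrow\quad
S(t \mid x) = S_0\!\left(t\, e^{-x^{\top}\gamma}\right)

% Weibull case: the model is both PH and AFT, with
\beta = -\gamma / \sigma
```

Of the three simulation distributions named in the abstract, only the Weibull satisfies both forms at once; lognormal and log-logistic data follow an AFT model but not a PH model, which is why they are a natural test of how robust PH-based tests are.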
