171

EMPIRICAL LIKELIHOOD AND DIFFERENTIABLE FUNCTIONALS

Shen, Zhiyuan 01 January 2016 (has links)
Empirical likelihood (EL) is a recently developed nonparametric method of statistical inference. Owen (1988, 1990) and many others have shown that the empirical likelihood ratio (ELR) method can be used to produce well-behaved confidence intervals or regions. Owen (1988) shows that -2 log ELR converges to a chi-square distribution with one degree of freedom subject to a linear statistical functional constraint in terms of distribution functions. However, generalizing Owen's result to the right-censored data setting is difficult, since no explicit maximization can be obtained under a constraint in terms of distribution functions. Pan and Zhou (2002) instead study EL with right-censored data using a linear statistical functional constraint in terms of cumulative hazard functions. In this dissertation, we extend the results of Owen (1988) and Pan and Zhou (2002) to non-linear but Hadamard differentiable statistical functional constraints. To this end, we study functionals that are differentiable with respect to hazard functions. We also generalize our results to two-sample problems. Stochastic process and martingale theories are applied to prove the theorems. Confidence intervals based on the EL method are compared with those of other available methods. Real data analysis and simulations illustrate the proposed theorems, with an application to Gini's absolute mean difference.
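For readers new to the framework, a minimal sketch of the one-sample result cited above (standard Owen-style notation for a linear functional of the distribution; this is the classical starting point, not the dissertation's extension):

% Empirical likelihood ratio for a linear functional theta = \int g \, dF,
% maximized over probability weights p_i placed on the observations X_i.
\[
R(\theta) = \max\Big\{ \prod_{i=1}^{n} n p_i \;:\; p_i \ge 0,\ \sum_{i=1}^{n} p_i = 1,\ \sum_{i=1}^{n} p_i\, g(X_i) = \theta \Big\}
\]
% Owen (1988): at the true value theta_0, a Wilks-type limit holds,
\[
-2 \log R(\theta_0) \xrightarrow{\;d\;} \chi^2_{1}.
\]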
172

Estimating Companies’ Survival in Financial Crisis: Using the Cox Proportional Hazards Model

Andersson, Niklas January 2014 (has links)
This master thesis is aimed at answering the question "What is the contribution of a company's sector to its survival of a financial crisis?", with the sub-question "Can we use survival analysis on financial data to answer this?". Survival analysis, which is seldom applied to financial data, is thus used to answer the main question. This is interesting because it both studies how well survival analysis works on financial data and evaluates whether all companies experience a financial crisis in the same way. The dataset consists of all companies traded on the Swedish stock market during 2008. The results show that the survival method is very suitable for the data used. The sector a company operates in has a significant effect; however, the power is too low to give any indication of specific differences between the sectors. It is further found that the group of smallest companies had much better survival than larger companies.
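As background on the model named in the title, the standard Cox proportional hazards specification (conventional notation; the covariates, e.g. sector indicators, are those described above):

% Cox model: the baseline hazard h_0(t) is left unspecified, and covariates x
% act multiplicatively on the hazard.
\[
h(t \mid x) = h_0(t) \exp(\beta^{\top} x)
\]
% Consequently the hazard ratio between two covariate profiles is constant in t:
\[
\frac{h(t \mid x_1)}{h(t \mid x_2)} = \exp\big(\beta^{\top}(x_1 - x_2)\big)
\]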
173

Development of a separability index for genomic data in survival analysis

Rouam, Sigrid Laure 30 March 2011 (has links)
In oncogenomics research, one of the main objectives is to identify new genomic markers so as to construct predictive rules for classifying patients according to time-to-event outcomes (death or tumor relapse). Most studies dealing with such high-throughput data rely on a selection process to identify, among the candidates, the markers having a prognostic impact. A common problem for biologists is the choice of the selection rule. In survival analysis, classical procedures consist in ranking genetic markers according to either the estimated hazard ratio or quantities derived from a test statistic (p-value, q-value). However, these methods are not suitable for gene selection across multiple genomic datasets with different sample sizes. Using an index that accounts for the magnitude of the prognostic impact of factors without being highly dependent on the sample size addresses this issue. In this work, we propose a novel index of predictive ability for selecting genomic markers having a potential impact on time-to-event outcomes. This index extends the notion of "pseudo-R2" to the framework of survival analysis. It possesses an original and straightforward interpretation in terms of "separability". The index is first derived in the framework of the Cox model and then extended to more complex non-proportional hazards models. Simulations show that the index is not substantially affected by the sample size of the study or by censoring, and that its separability performance is higher than that of indices from the literature. The interest of the index is illustrated with two examples. The first aims at identifying genomic markers with common effects across different cancer types. The second, in the framework of a lung cancer study, shows the interest of the index for selecting genomic factors with crossing hazard functions, which could be explained by "modulating" effects between markers. The proposed index is a promising tool that can help researchers identify lists of genes meriting further biological investigation.
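For orientation, one classical likelihood-based pseudo-R2 for the Cox model (the notion the abstract says the new index extends; this is not the author's index, which is not reproduced in the abstract):

% Likelihood-ratio pseudo-R^2 for a fitted Cox model, where l(beta-hat) and
% l(0) are the partial log-likelihoods of the fitted and null models and n is
% the sample size; its known sensitivity to sample size and censoring is the
% kind of drawback the proposed index is designed to avoid.
\[
R^2_{LR} = 1 - \exp\!\Big( -\tfrac{2}{n}\big[\ell(\hat\beta) - \ell(0)\big] \Big)
\]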
174

Statistical Models and Analysis of Growth Processes in Biological Tissue

Xia, Jun 15 December 2016 (has links)
The mechanisms that control growth processes in biological tissues have attracted continuing research interest despite their complexity. With the emergence of big-data experimental approaches, there is an urgent need to develop statistical and computational models that fit the experimental data and can be used to make predictions to guide future research. In this work we apply statistical methods to growth processes in different biological tissues, focusing on the development of neuron dendrites and tumor cells. We first examine the neuron growth process, which has implications for neural tissue regeneration, using a computational model with uniform branching probability and a maximum overall length constraint. One crucial outcome is that we can relate the parameter fits from our model to real data from our experimental collaborators, in order to examine the usefulness of our model under different biological conditions. Our methods can now directly compare branching probabilities across experimental conditions and provide confidence intervals for these population-level measures. In addition, we have obtained analytical results showing that the underlying probability distribution for this process increases as a geometric progression at nearby distances and decreases as an approximately geometric series in far-away regions, which can be used to estimate the spatial location of the maximum of the probability distribution. This result is important, since we would expect the maximum number of dendrites in this region; the estimate is related to the probability of success in finding a neural target at that distance during a blind search. We then examined tumor growth processes, which show similar dynamics in the sense that an initial rapid growth eventually becomes limited by resource constraints. For the evolution of tumor cells, we found that an exponential growth model best describes the experimental data, based on the accuracy and robustness of the candidate models. Furthermore, we incorporated this growth-rate model into logistic regression models that predict the growth rate of each patient from biomarkers; this formulation can be very useful for clinical trials. Overall, this study aimed to assess the molecular and clinicopathological determinants of breast cancer (BC) growth rate in vivo.
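As a sketch of the growth law selected for the tumor data (the exponential model named above; the symbols are illustrative, not the thesis's notation):

% Exponential growth: V(t) is tumor burden at time t, V_0 the initial burden,
% k the per-unit-time growth rate; on the log scale the model is linear,
\[
V(t) = V_0\, e^{k t}, \qquad \log V(t) = \log V_0 + k t,
\]
% so k can be estimated by regressing log-volume on time for each patient.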
175

Factors Affecting Discrete-Time Survival Analysis Parameter Estimation and Model Fit Statistics

Denson, Kathleen 05 1900 (has links)
Discrete-time survival analysis as an educational research technique has focused on analyzing and interpreting parameter estimates. The purpose of this study was to examine the effects of certain data characteristics on the hazard estimates and goodness-of-fit statistics. Fifty-four simulated data sets were generated by crossing four factors in a 2 (time period) × 3 (distribution of Y = 1) × 3 (distribution of Y = 0) × 3 (sample size) design.
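For context, the standard discrete-time hazard formulation underlying such analyses (conventional notation; the abstract does not state the particular link function used in the study):

% Discrete-time hazard: the probability the event occurs in period j given
% survival to that period, modeled with a logit link on covariates x; the
% alpha_j are period-specific baseline terms.
\[
h_j(x) = \Pr(T = j \mid T \ge j,\ x), \qquad \operatorname{logit} h_j(x) = \alpha_j + \beta^{\top} x
\]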
176

Choosing the Cut Point for a Restricted Mean in Survival Analysis, a Data Driven Method

Sheldon, Emily H 25 April 2013 (has links)
Survival analysis generally uses the median survival time as a common summary statistic. While the median possesses the desirable characteristic of being unbiased, there are times when it is not the best statistic to describe the data at hand. Royston and Parmar (2011) argue that the restricted mean survival time should be the summary statistic used when the proportional hazards assumption is in doubt. Work on restricted means dates back to 1949, when J. O. Irwin developed a calculation for the standard error of the restricted mean using Greenwood's formula. Since then the restricted mean has been thoroughly developed in the literature, but its use in practical analyses is still limited. One area that is not well developed in the literature is the choice of the time point to which the mean is restricted. The aim of this dissertation is to develop a data-driven method that allows the user to find a cut point at which to restrict the mean. Three methods are developed. The first is a simple method that locates the time at which the maximum distance between two survival curves occurs. The second is a method adapted from a Renyi-type test, typically used when the proportional hazards assumption is not met: the Renyi statistics are plotted, a piecewise regression model is fit, and the join point of the two pieces is where the mean is restricted. The third fits a nonlinear model to the hazard estimates at each event time; the model allows the hazards of the two groups to differ up until a certain time, after which they are the same. The time point at which the two groups' hazards become the same is the time to which the mean is restricted. The methods are evaluated using MSE and bias calculations, with bootstrap techniques to estimate the variance.
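The restricted mean itself has a simple closed form (the standard definition, consistent with the Irwin/Greenwood work cited above):

% Restricted mean survival time up to cut point tau: the area under the
% survival curve S(t) on [0, tau]; in practice S is replaced by the
% Kaplan-Meier estimate, and tau is the cut point the dissertation selects.
\[
\mu(\tau) = \mathbb{E}\big[\min(T, \tau)\big] = \int_{0}^{\tau} S(t)\, dt
\]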
177

The Impact of Service-Learning among Other Predictors for Persistence and Degree Completion of Undergraduate Students

Lockeman, Kelly 01 January 2012 (has links)
College completion is an issue of great concern in the United States, where only 50% of students who start college as freshmen complete a bachelor's degree at that institution within six years. Researchers have studied a variety of factors to understand their relationship to student persistence. Not surprisingly, student characteristics, particularly academic background prior to entering college, have a tremendous influence on college success. Colleges and universities have little control over student characteristics unless they screen out less qualified students during the admissions process, but selectivity is contrary to the push for increased accessibility for under-served groups. As a result, institutions need to better understand the factors they can control. High-impact educational practices have been shown to improve retention and persistence through increased student engagement. Service-learning, a pedagogical approach that blends meaningful community service and reflection with course content, is a practice that is increasing in popularity, and it has proven beneficial in increasing student learning and engagement. The purpose of this study was to investigate whether participation in service-learning has any influence on the likelihood of degree completion or time to degree and, secondarily, to compare different methods of analysis to determine whether more complex models provide better information or more accurate prediction. The setting for this study was a large public urban research institution in the mid-Atlantic region, and the sample was the cohort of students who started as first-time, full-time, bachelor's degree-seeking undergraduates in the fall of 2005. Data included demographic and academic characteristics upon matriculation, as well as financial need and aid, academic major, and progress indicators for each of the first six years of enrollment. Cumulative data were analyzed using logistic regression, and year-to-year data were analyzed using discrete-time survival analysis in a structural equation modeling (SEM) framework. Parameter estimates and odds ratios for the predictors in each model were compared. Some similarities were found in the variables that predict degree completion, but there were also some striking differences. The strongest predictors of degree completion were pre-college academic characteristics and strength of academic progress while in college (credits earned and GPA). When analyzed using logistic regression and cross-sectional data, service-learning participation was not a significant predictor of completion, but it did have an effect on completion time for those students who earned a degree within six years. When analyzed longitudinally using discrete-time survival analysis, however, service-learning participation was strongly predictive of degree completion, particularly when credits were earned in the third, fourth, and sixth years of enrollment. In the survival analysis model, service-learning credits earned were also more significant for predicting degree completion than other credits earned. In terms of data analysis, logistic regression was effective at predicting completion, but survival analysis seems to provide a more robust method for studying specific variables whose effects may vary over time.
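A minimal sketch in Python of the discrete-time approach contrasted with cross-sectional logistic regression above; the file name and column names (year, completed, service_credits, other_credits, hs_gpa) are hypothetical, not taken from the study:

    # Discrete-time survival analysis on person-period data: one row per
    # student per enrollment year, with `completed` = 1 in the year the degree
    # is earned and 0 otherwise (rows stop after the event or at year six).
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    pp = pd.read_csv("person_period.csv")  # hypothetical person-period file

    # Year dummies give each period its own baseline hazard; covariates shift
    # the log-odds of completing the degree in a given year.
    model = smf.logit(
        "completed ~ C(year) + service_credits + other_credits + hs_gpa",
        data=pp,
    ).fit()
    print(model.summary())

    # exp(coefficient) is a per-period odds ratio, e.g. for service-learning:
    print(np.exp(model.params["service_credits"]))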
178

Cure-rate models

Drabinová, Adéla January 2016 (has links)
In this work we deal with survival models in which, with positive probability, some patients never relapse because they are cured. We focus on a two-component mixture model and on a model with a biological motivation. For each model, we derive a maximum likelihood estimate of the probability of cure and of the survival function of the time to relapse of uncured patients. Further, we consider that both the probability of cure and the survival time may depend on regressors. The models are then compared through a simulation study.
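The two-component mixture model mentioned above has a standard form (conventional notation; the biologically motivated model is not reproduced in the abstract):

% Mixture cure model: with probability pi a patient is cured and never
% relapses; S_u is the survival function of time to relapse among the uncured.
\[
S(t) = \pi + (1 - \pi)\, S_u(t)
\]
% Regressors z can enter through, e.g., a logistic model for the cure
% probability, with S_u modeled parametrically or via a Cox-type model:
\[
\operatorname{logit} \pi(z) = \gamma^{\top} z
\]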
179

A study of the robustness of Cox's proportional hazards model used in testing for covariate effects

Fei, Mingwei January 1900 (has links)
Master of Arts / Department of Statistics / Paul Nelson / There are two important classes of statistical models for multivariate survival analysis: proportional hazards (PH) models and accelerated failure time (AFT) models. PH analysis is the most commonly used multivariate approach for analyzing survival time data. For example, in clinical investigations where several (known) quantities or covariates potentially affect patient prognosis, it is often desirable to investigate the effect of one factor while adjusting for the impact of others. This report offers guidance on choosing the appropriate model for testing covariate effects under different situations. In practice, we are very likely to have only a limited sample size and some censoring (people dropping out), which causes difficulty in statistical analysis. In this report, data sets are randomly generated 1000 times from each of three distributions (Weibull, lognormal, and log-logistic) under combinations of sample sizes and censoring rates. Both models are then evaluated by hypothesis tests of the covariate effect on the simulated data, using power, type I error rate, and convergence rate for each situation. We recommend the PH method when the sample size is small (n < 20) and the censoring rate is high (p > 0.8); in this case neither PH nor AFT analysis may be suitable for hypothesis testing, but PH analysis is more robust and consistent than AFT analysis. When the sample size is 20 or above and the censoring rate is 0.8 or below, AFT analysis has a slightly higher convergence rate and power than PH, but not much improvement in type I error rate when the sample size is big (n > 50) and the censoring rate is low (p < 0.3). Considering that PH analysis does not require knowledge of the underlying distribution, we conclude that PH analysis is robust for hypothesis testing of covariate effects using data generated from an AFT model.
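For reference, the two model classes compared (standard forms; notation assumed rather than taken from the report):

% Proportional hazards: covariates scale an unspecified baseline hazard.
\[
h(t \mid x) = h_0(t) \exp(\beta^{\top} x)
\]
% Accelerated failure time: covariates rescale time on the log scale; taking
% epsilon to be extreme-value, normal, or logistic yields the Weibull,
% lognormal, and log-logistic models simulated above.
\[
\log T = \beta^{\top} x + \sigma\, \varepsilon
\]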
180

Spatial dynamics and survival analysis in the Brazilian sugar and ethanol industry: 2001 - 2016

Bastos, André da Cunha 17 April 2019 (has links)
Brazil's sugar and ethanol industry faced a decade of expanding production, followed by a period of contraction. During this period, 80 mills and distilleries ceased to operate due to corporate bankruptcy or operational decisions of the controlling groups. Sugarcane has biological characteristics that require industrial processing within a few hours of harvest, on pain of a great loss of sucrose content. This technological constraint means that the spatial organization of sugar and ethanol production is preferably carried out within a small geographic radius of the sugarcane fields. The objective of this thesis is to analyze whether the recent shutdowns and bankruptcies in the Brazilian sugarcane industry were influenced by the operational conditions of the regional sugarcane markets, and to identify the main determinants of the shutdown events of mills and distilleries between 2001 and 2016. Indicators of market structure and spatial concentration are developed and discussed based on the concept of the antitrust market used by the major antitrust agencies. The description of the market structure and the intensity of shutdown and bankruptcy events in each region are evaluated taking into account the existence of multiple processing units owned by the same company in each region. The results indicate that concentration, and competition for sugarcane as an input among plants located close to one another, are spatially associated with bankruptcy and shutdown events. However, the survival analysis indicates that the increase in the number of competitors demanding this input and the expansion of sugarcane processing within the viable harvest radius were not factors that contributed in isolation to bankruptcies and shutdowns. These events were less frequent among industrial units whose parent company has foreign investors in its capital structure and where a higher share of inputs is obtained through a vertically integrated structure.
