Global ETD Search

61	Estimation of wood fibre length distributions from censored mixture data Svensson, Ingrid January 2007 The motivating forestry background for this thesis is the need for fast, non-destructive, and cost-efficient methods to estimate fibre length distributions in standing trees in order to evaluate the effect of silvicultural methods and breeding programs on fibre length. Taking increment cores is a commonly used non-destructive sampling method in forestry. An increment core is a cylindrical wood sample taken with a special borer, and the methods proposed in this thesis are developed especially for data from increment cores. Nevertheless, the methods can be used for data from other sampling frames as well, for example sticks shaped like an elongated rectangular box. This thesis proposes methods to estimate fibre length distributions based on censored mixture data from wood samples. Due to the sampling procedures, wood samples contain cut (censored) and uncut observations. Moreover, the samples consist not only of the fibres of interest but of other cells (fines) as well. When the cell lengths are determined by an automatic optical fibre analyser, there is no practical way to distinguish between cut and uncut cells or between fines and fibres. Thus the resulting data come from a censored version of a mixture of the fine and fibre length distributions in the tree. The methods proposed in this thesis can handle this lack of information. Two parametric methods are proposed to estimate the fine and fibre length distributions in a tree. The first method is based on grouped data. The probabilities that the length of a cell from the sample falls into different length classes are derived, taking into account the censoring caused by the sampling frame. These probabilities are functions of the unknown parameters, and ML estimates are found from the corresponding multinomial model.
The second method is a stochastic version of the EM algorithm based on the individual length measurements. The method is developed for the case where the distributions of the true lengths of the cells at least partially appearing in the sample belong to exponential families. The cell length distribution in the sample, and the conditional distribution of the true length of a cell at least partially appearing in the sample given its length in the sample, are derived. Both these distributions are necessary in order to use the stochastic EM algorithm. Consistency and asymptotic normality of the stochastic EM estimates are proved. The methods are applied to real data from increment cores taken from Scots pine trees (Pinus sylvestris L.) in Northern Sweden and further evaluated through simulation studies. Both methods work well for sample sizes commonly obtained in practice. censoring fibre length distribution identifiability increment core length bias mixture stochastic EM algorithm Mathematical statistics
62	Robust Methods for Interval-Censored Life History Data Tolusso, David January 2008 Interval censoring arises frequently in life history data, as individuals are often only observed at a sequence of assessment times. This leads to a situation where we do not know when an event of interest occurs, only that it occurred somewhere between two assessment times. Here, the focus will be on methods of estimation for recurrent event data, current status data, and multistate data, subject to interval censoring. With recurrent event data, the focus is often on estimating the rate and mean functions. Nonparametric estimates are readily available, but are not smooth. Methods based on local likelihood and the assumption of a Poisson process are developed to obtain smooth estimates of the rate and mean functions without specifying a parametric form. Covariates and extra-Poisson variation are accommodated by using a pseudo-profile local likelihood. The methods are assessed by simulations and applied to a number of datasets, including data from a psoriatic arthritis clinic. Current status data are an extreme form of interval-censored data that arise when each individual is observed at only one assessment time. If current status data arise in clusters, this must be taken into account in order to obtain valid conclusions. Copulas offer a convenient framework for modelling the association separately from the margins. Estimating equations are developed for estimating marginal parameters as well as association parameters. Efficiency and robustness to the choice of copula are examined for first and second order estimating equations. The methods are applied to data from an orthopedic surgery study as well as data on joint damage in psoriatic arthritis. Multistate models can be used to characterize the progression of a disease as individuals move through different states.
Considerable attention is given to a three-state model to characterize the development of a back condition known as spondylitis in psoriatic arthritis, along with the associated risk of mortality. Robust estimates of the state occupancy probabilities are derived based on a difference in distribution functions of the entry times. A five-state model which differentiates between left-side and right-side spondylitis is also considered, which allows us to characterize what effect spondylitis on one side of the body has on the development of spondylitis on the other side. Covariate effects are considered through multiplicative time-homogeneous Markov models. The robust state occupancy probabilities are also applied to data on CMV infection in patients with HIV. multistate interval censoring robust estimation local likelihood recurrent events current status data generalized estimating equations piecewise constant Statistics (Biostatistics)
64	Statistical Inference From Complete And Incomplete Data Can Mutan, Oya 01 January 2010 Let X and Y be two random variables such that Y depends on X=x. This is a very common situation in many real-life applications. The problem is to estimate the location and scale parameters in the marginal distributions of X and Y and in the conditional distribution of Y given X=x. We are also interested in estimating the regression coefficient and the correlation coefficient. We have a cost constraint for observing X=x: the larger x is, the more expensive it becomes. The allowable sample size n is governed by a pre-determined total cost. This can lead to a situation where some of the largest X=x observations cannot be observed (Type II censoring). Two general methods of estimation are available, the method of least squares and the method of maximum likelihood. For most non-normal distributions, however, the latter is analytically and computationally problematic. Instead, we use the method of modified maximum likelihood estimation, which is known to be essentially as efficient as maximum likelihood estimation. The method has a distinct advantage: it yields estimators which are explicit functions of sample observations and are, therefore, analytically and computationally straightforward. In this thesis specifically, the problem is to evaluate the effect of the largest order statistics x(i) (i > n-r) in a random sample of size n (i) on the mean E(X) and variance V(X) of X, (ii) on the cost of observing the x-observations, (iii) on the conditional mean E(Y|X=x) and variance V(Y|X=x) and (iv) on the regression coefficient. It is shown that unduly large x-observations have a detrimental effect on the allowable sample size and the estimators, both least squares and modified maximum likelihood. The advantages of not observing a few of the largest observations are evaluated. The distributions considered are Weibull, Generalized Logistic and the scaled Student's t.
QA Probabilities 273-274.76
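To illustrate what Type II censoring does to an estimator, the following minimal sketch uses the exponential distribution. This is an assumption made for illustration only: the exponential is not one of the distributions studied in the thesis, and the estimator shown is the standard closed-form ML estimate of the exponential mean, not the modified maximum likelihood method of the abstract. The function name and the toy data are hypothetical.

```python
def exponential_mle_type2(sample, r):
    """ML estimate of the exponential mean when the r largest values are censored.

    Illustrative assumption: exponential lifetimes, Type II censoring on the right.
    """
    ordered = sorted(sample)
    n = len(ordered)
    m = n - r  # number of order statistics actually observed
    if m <= 0:
        raise ValueError("at least one observation must remain uncensored")
    observed = ordered[:m]
    # Each of the r unobserved values is only known to exceed the largest observed one,
    # so it contributes that largest observed value to the total.
    total = sum(observed) + r * observed[-1]
    return total / m


# Hypothetical toy data: the single large value 10.0 is not observed when r = 1.
estimate_censored = exponential_mle_type2([1.0, 2.0, 3.0, 10.0], r=1)
estimate_complete = exponential_mle_type2([1.0, 2.0, 3.0, 10.0], r=0)
```

With r = 0 the estimate reduces to the ordinary sample mean; with r = 1 the large value 10.0 no longer enters the estimate directly.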
65	Jackknife Empirical Likelihood for the Accelerated Failure Time Model with Censored Data Bouadoumou, Maxime K 15 July 2011 Kendall and Gehan estimating functions are used to estimate the regression parameter in the accelerated failure time (AFT) model with censored observations. The accelerated failure time model is the preferred survival analysis method because it maintains a consistent association between the covariate and the survival time. The jackknife empirical likelihood method is used because it overcomes computational difficulty by circumventing the construction of the nonlinear constraint. Jackknife empirical likelihood turns the statistic of interest into a sample mean based on jackknife pseudo-values. A U-statistic approach is used to construct the confidence intervals for the regression parameter. We conduct a simulation study to compare the Wald-type procedure, the empirical likelihood, and the jackknife empirical likelihood in terms of coverage probability and average length of confidence intervals. The jackknife empirical likelihood method performs better and overcomes the under-coverage problem of the Wald-type method. A real data set is also used to illustrate the proposed methods. Confidence interval Coverage probability Jackknife empirical likelihood Right-censoring U-statistic Kendall’s estimating equation Gehan Logrank Mathematics
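The statement that the jackknife turns a statistic into a sample mean of pseudo-values can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the unbiased sample variance (a simple U-statistic) on uncensored simulated data, not the Kendall or Gehan estimating functions of the thesis, and the function names are hypothetical.

```python
import numpy as np


def jackknife_pseudo_values(x, statistic):
    """Return the n jackknife pseudo-values of `statistic` on the sample `x`."""
    x = np.asarray(x, dtype=float)
    n = x.size
    full = statistic(x)
    # Leave-one-out estimates: recompute the statistic with observation i removed.
    loo = np.array([statistic(np.delete(x, i)) for i in range(n)])
    return n * full - (n - 1) * loo


def sample_variance(x):
    # The unbiased sample variance is a U-statistic with kernel (a - b)^2 / 2.
    return np.var(x, ddof=1)


rng = np.random.default_rng(0)
sample = rng.exponential(scale=2.0, size=30)  # simulated data, for illustration only
pseudo = jackknife_pseudo_values(sample, sample_variance)
```

For a U-statistic such as this one, the mean of the pseudo-values reproduces the statistic itself, which is why an empirical likelihood for a mean can be applied to the pseudo-values.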
66	Modelos semiparamétricos de fração de cura para dados com censura intervalar / Semiparametric cure rate models for interval censored data Julio Cezar Brettas da Costa 18 February 2016 Cure rate models form a vast sub-area of survival analysis, with great applicability in medical studies. This type of model is suitable in situations where the researcher recognizes the existence of a part of the population that is not susceptible to the event of interest, and therefore considers the probability that the event does not occur. Although the theory is well established for right censoring, the literature on cure rate models lacks studies of interval censoring, which motivates the studies presented in this work. Three semiparametric cure rate models for this type of censoring are considered here for real data analysis and then studied by means of simulations.
The first model, presented by Liu and Shen (2009), is a promotion time model whose estimation is based on a variation of the EM algorithm and uses convex optimization techniques in the maximization process. The model proposed by Lam et al. (2013) considers a semiparametric Cox model, modelling the population cure fraction through a frailty with a compound Poisson distribution, and uses data augmentation methods together with maximum likelihood estimators. In Xiang et al. (2011), a standard mixture cure rate model is proposed, adopting a logistic model to explain incidence and a proportional hazards structure for the effects on the time to event. The last two models have extensions for clustered data, which are used in the applications in this work. One of the main motivations of this dissertation is a study conducted by researchers at Fundação Pró-Sangue, in São Paulo - SP, who are interested in evaluating the time until the occurrence of anaemia in repeat donors through periodic evaluations of the hematocrit, measured at each visit to the blood center. The existence of a non-susceptible portion of donors makes the cure rate models convenient. The second dataset analysed consists of periodic observations of white-tailed deer equipped with radio collars. The goal here is to evaluate the migratory behaviour of the animals in winter under specific weather and geographic conditions, allowing for the possibility that the deer do not migrate. A comparative study of the proposed models is carried out using simulations, in order to assess robustness under given specifications of scenario and cure fraction. As far as we know, no work comparing the different cure mechanisms in the presence of interval censoring has been carried out so far.
Anaemia Cure rate Deer migration Interval censoring Simulations Survival analysis
67	Estimação e comparação de curvas de sobrevivência sob censura informativa. / Estimation and comparison of survival curves with informative censoring. Raony Cassab Castro Cesar 10 July 2013 The motivation for this research is a study undertaken at the Cancer Institute of the State of São Paulo (ICESP), which followed eight hundred and eight patients with advanced cancer.
Each patient was followed from the first cancer-related admission to an intensive care unit (ICU) for a period of up to two years. The main objective of the study is to evaluate the survival time and quality of life of these patients through the quality-adjusted lifetime (QAL). According to Gelber et al. (1989), combining these two pieces of information into the QAL induces informative censoring; therefore, traditional methods for censored data, such as the Kaplan-Meier estimator (Kaplan and Meier, 1958) and the log-rank test (Peto and Peto, 1972), become inappropriate. To remedy this, Zhao and Tsiatis (1997) and Zhao and Tsiatis (1999) proposed new estimators for the survival function, and Zhao and Tsiatis (2001) developed a test analogous to the log-rank test to compare two survival functions. All the methods considered take informative censoring into account. In this dissertation we critically evaluate these methods and apply them to estimate and test survival curves associated with the QAL in the ICESP study. Finally, we use an empirical method, based on bootstrap resampling, to propose a generalization of the Zhao and Tsiatis test to more than two groups. estimation informative censoring nonparametric statistics quality of life Survival analysis
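For reference, the Kaplan-Meier estimator that the abstract names as the traditional method can be sketched as below. This is a minimal product-limit implementation that assumes non-informative censoring, which is exactly the assumption the abstract says fails for quality-adjusted lifetime; it is not the Zhao and Tsiatis estimator. The function name and toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if the time is censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects tied at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the conditional survival at t.
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


# Hypothetical toy data: the subject at time 2 is censored.
curve = kaplan_meier([1, 2, 3], [1, 0, 1])
```

The censored subject leaves the risk set without producing a step in the curve, which is only justified when censoring carries no information about the remaining lifetime.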
68	Modélisation de l'effet de facteurs de risque sur la probabilité de devenir dément et d'autres indicateurs de santé / Modelling of the effect of risk factors on the probability of becoming demented and others health indicators Sabathé, Camille 15 November 2019 Epidemiological indicators of dementia, such as dementia-free life expectancy at a given age or the absolute risk, are useful quantities for public health. In cohort studies dementia is observed in discrete time, which leads to interval censoring of the time to onset. Moreover, some subjects can develop dementia and die between two follow-up visits. An illness-death model for interval-censored data is a solution for modelling dementia risk and death risk simultaneously and for avoiding under-estimation of dementia incidence. These indicators depend on both the dementia and death risks, unlike the dementia transition intensity. Available regression models do not take interval censoring into account or are not suitable for these indicators. The aim of this work is to propose regression models that quantify the impact of risk factors on these indicators. The first part of the thesis extends the pseudo-value approach to interval-censored data. Pseudo-values are computed from parametric estimators or from maximum penalized likelihood estimators. They are then used as the outcome in generalized linear models, or in generalized additive models to allow non-linear effects of quantitative covariates.
The second part develops a model based on linearization of the epidemiological indicators. The idea is to compute the indicator conditionally on the covariate values from the transition intensities of an illness-death model with interval censoring of the time of disease onset. These two approaches are applied to the French cohort PAQUID to study, for example, the effect of a psychometric score (the MMS) on epidemiological indicators of dementia.
Pseudo values Interval censoring Dementia Absolute risk Life expectancy
69	Uncertainty in Estimation of Field-scale Variability of Soil Saturated Hydraulic Conductivity Abhishek Abhishek (7036820) 19 July 2022 Saturated hydraulic conductivity (K_s) is among the most important soil properties that influence the partitioning of rainfall into surface and subsurface waters and is needed for understanding and modeling hydrologic processes at the field scale. Field-scale variability of K_s is often represented as a lognormal random field, and its parameters are assessed either by making local- or point-scale measurements using instruments such as permeameters and infiltrometers or by calibrating probabilistic models with field-scale infiltration experiments under natural or artificial rainfall conditions. This research quantifies the uncertainty in the K_s random field when using observations from the above techniques and provides recommendations as to what constitutes a good experiment to assess the field-scale variability of K_s. Infiltration experiments with instruments sampling larger areas (or volumes) are typically expected to be more representative of field conditions than those sampling smaller ones; hence, the uncertainty arising from the field-scale natural rainfall-runoff experiments was evaluated first. A field-averaged infiltration model and Monte Carlo simulations were employed in a Bayesian framework to obtain the possible K_s random fields that would describe experimental observations over a field for a rainfall event. Results suggested the existence of numerous parameter combinations that could satisfy the experimental observations over a single rainfall event, and high variability of these combinations among different events, thereby providing insights regarding the identifiable space of K_s distributions from individual rainfall experiments.
The non-unique parameter combinations from multiple rainfall events were subsequently consolidated using an information-theoretic measure, which provided a realistic estimate of our ability to quantify the spatial variability of K_s in natural fields using rainfall-runoff experiments. With the resolving ability of rainfall-runoff experiments constrained by experimental limitations, the K_s estimates from in-situ point infiltration devices could provide additional information in conjunction with the rainfall-runoff experiments. With this hypothesis, the role of three in-situ point infiltration devices (the double-ring infiltrometer, the CSIRO version of the tension permeameter, and the Guelph constant-head permeameter) was then evaluated in characterizing the field-scale variability of K_s. Results suggested that K_s estimates from none of the instruments could individually represent the field conditions, due to the presence of measurement and structural errors besides any sampling biases; hence any naive efforts at assimilating their data (e.g., data pooling, instrument-specific transforms, etc.) and augmenting with field-scale rainfall-runoff observations as informative prior distributions would not be fruitful. In the absence of benchmarks establishing the true K_s field, it is also impossible to quantify these errors; therefore, a posterior coarsening method was used to alleviate their impact when estimating the field-scale variability of K_s. Finally, the impact of censored moments on the maximum likelihood (ML) estimates of the K_s distribution parameters was studied.
Results highlighted that a rainfall event can resolve only a fraction of the K_s field, and that the time and duration of peak rainfall intensity, besides the peak intensity itself, play a role in resolving the K_s field. The reliability of the ML estimates is a function of the fraction of the K_s field resolved by the rainfall event, up to a limit beyond which the estimates start to overfit the calibration data. Rainfall-runoff experiments for which the ML estimates resolve 30-80% of the K_s distribution are likely to be good calibration events. Groundwater hydrology saturated hydraulic conductivity rainfall-runoff experiments point infiltration Bayesian reliability censoring spatial variability areal infiltration
70	Zero-Inflated Censored Regression Models: An Application with Episode of Care Data Prasad, Jonathan P. 07 July 2009 The objective of this project is to fit a sequence of increasingly complex zero-inflated censored regression models to a known data set. It is quite common to find censored count data in statistical analyses of health-related data. Modeling such data while ignoring the censoring, zero-inflation, and overdispersion often results in biased parameter estimates. This project develops various regression models that can be used to predict a count response variable that is affected by various predictor variables. The regression parameters are estimated with Bayesian analysis using a Markov chain Monte Carlo (MCMC) algorithm. Tests for model adequacy are discussed and the models are applied to an observed data set. zero-inflation over-dispersion censoring Poisson generalized Poisson negative binomial Bayesian MCMC Proc MCMC health care Statistics and Probability
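The way zero-inflation and censoring enter a count likelihood can be sketched as follows. This is a minimal illustration under stated assumptions: a zero-inflated Poisson with a single rate and no covariates, right-censored counts, and plain likelihood evaluation rather than the Bayesian MCMC fitting described in the abstract. The function names and parameter names (lam, pi) are hypothetical.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson: extra-zero probability pi, Poisson rate lam."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi * (k == 0) + (1.0 - pi) * poisson


def zip_sf(c, lam, pi):
    """P(Y >= c): the likelihood contribution of a count right-censored at c."""
    return 1.0 - sum(zip_pmf(k, lam, pi) for k in range(c))


def log_likelihood(counts, censored, lam, pi):
    """Sum of log-contributions; censored[i] is True when counts[i] is only a lower bound."""
    total = 0.0
    for y, is_censored in zip(counts, censored):
        contribution = zip_sf(y, lam, pi) if is_censored else zip_pmf(y, lam, pi)
        total += math.log(contribution)
    return total
```

An exactly observed count contributes its probability mass, while a censored count contributes the probability of being at least as large as the recorded value; treating the latter as exact is one source of the bias the abstract mentions.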
