Global ETD Search

91	Metody analýzy longitudinálních dat / Methods of longitudinal data analysis Jindrová, Linda January 2015 The thesis deals with longitudinal data, that is, measurements taken repeatedly on the same subjects. It describes various types of models suitable for their analysis. It proceeds from the simplest linear models with fixed or random effects, through linear and nonlinear mixed-effects models, to generalized linear models and generalized estimating equations (GEE). For each model, the form of the model and the method of parameter estimation are given. The individual models are also compared with one another. The theoretical findings are complemented by applications to real data. Linear models are used to analyze data on production in the USA, nonlinear models are used to explain the dependence of drug concentration in blood on time, and GEE is applied to data on respiratory problems in children.
92	Statistical methods for genetic association studies: multi-cohort and rare genetic variants approaches Chen, Han 23 September 2015 Genetic association studies have successfully identified many genetic markers associated with complex human diseases and related quantitative traits. However, for most complex diseases and quantitative traits, all associated genetic markers identified to date explain only a small proportion of heritability. Thus, exploring the unexplained heritability in these traits will help us discover novel genetic determinants for these traits and better understand disease etiology and pathophysiology. Due to limited sample size, a single cohort study may not have sufficient power to identify novel genetic associations with small effect sizes, and meta-analysis approaches have been proposed and applied to combine results from multiple cohorts in large consortia, increasing the sample size and statistical power. Rare genetic variants and gene by environment interaction may both play a role in genetic association studies. In this dissertation, we develop statistical methods in meta-analysis, rare genetic variants analysis and gene by environment interaction analysis, conduct extensive simulation studies, and apply these methods in real data examples. First, we develop a method of moments estimator for the between-study covariance matrix in random effects model multivariate meta-analysis. Our estimator is the first such estimator in matrix form, and it is invariant to linear transformations. It performs similarly to existing methods in simulation studies and real data analysis. Next, we extend the Sequence Kernel Association Test (SKAT), a rare genetic variants analysis approach for unrelated individuals, to be applicable in family samples for quantitative traits. 
The extension is necessary, as the original test has inflated type I error when directly applied to related individuals, and selecting an unrelated subset from family samples reduces the sample size and power. Finally, we derive methods for rare genetic variants analysis in detecting gene by environment interaction on quantitative traits, in the context of univariate test on the interaction term parameter. We develop statistical tests in the settings of both burden test and SKAT, for both unrelated and related individuals. Our methods are relevant to genetic association studies, and we hope that they can facilitate research in this field and beyond. Biostatistics Correlated data analysis Gene-environment interaction Meta-analysis Method of moments Random effects model Rare genetic variants
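As a rough illustration of the method-of-moments idea mentioned in entry 92, the sketch below computes the classical univariate DerSimonian-Laird estimate of the between-study variance. This is a well-known scalar analogue, not the matrix-form multivariate estimator developed in the dissertation; the function name `dl_tau2` and the toy inputs are hypothetical.

```python
def dl_tau2(effects, variances):
    """Univariate DerSimonian-Laird method-of-moments estimate of the
    between-study variance (illustrative analogue only; the dissertation's
    estimator is multivariate and in matrix form)."""
    k = len(effects)
    weights = [1.0 / v for v in variances]          # inverse-variance weights
    sum_w = sum(weights)
    # Fixed-effect pooled estimate
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q statistic
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    # Scaling constant that makes E[Q] = (k - 1) + c * tau^2
    c = sum_w - sum(w * w for w in weights) / sum_w
    # Truncate at zero because a variance cannot be negative
    return max(0.0, (q - (k - 1)) / c)


# Toy data (hypothetical): two studies with unit within-study variance
print(dl_tau2([0.0, 2.0], [1.0, 1.0]))
print(dl_tau2([0.0, 1.0], [1.0, 1.0]))
```

The truncation at zero is the usual convention when the observed heterogeneity is smaller than expected by chance.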
93	Uncertainty Quantification and Propagation in Materials Modeling Using a Bayesian Inferential Framework Ricciardi, Denielle E. 13 November 2020 No description available. Materials Science Statistics Uncertainty Quantification Bayesian Inference Random Effects Crystal Plasticity VPSC CALPHAD Model Discrepancy, Gaussian Process
94	Contributions to statistical methods for meta-analysis of diagnostic test accuracy studies / Methods for meta-analysis of diagnostic test accuracy studies Negeri, Zelalem January 2019 Meta-analysis is a popular statistical method that synthesizes evidence from multiple studies. Conventionally, both the hierarchical and bivariate models for meta-analysis of diagnostic test accuracy (DTA) studies assume that the random effects follow the bivariate normal distribution. However, this assumption is restrictive, and inferences could be misleading when it is violated. On the other hand, subjective methods such as inspection of forest plots are used to identify outlying studies in a meta-analysis of DTA studies. Moreover, inferences made using the well-established bivariate random-effects models, when outlying or influential studies are present, may lead to misleading conclusions. Thus, the aim of this thesis is to address these issues by introducing alternative and robust statistical methods. First, we extend the current bivariate linear mixed model (LMM) by assuming a flexible bivariate skew-normal distribution for the random effects. The marginal distribution of the proposed model is analytically derived so that parameter estimation can be performed using standard likelihood methods. Overall, the proposed model performs better than the traditional bivariate LMM in terms of the confidence interval width of the overall sensitivity and specificity, and with regard to the bias and root mean squared error of the between-study (co)variances. Second, we propose objective methods based on solid statistical reasoning for identifying outlying and/or influential studies in a meta-analysis of DTA studies. The performance of the proposed methods is evaluated using a simulation study. The proposed methods outperform and avoid the subjectivity of the currently used ad hoc approaches. 
Finally, we develop a new robust bivariate random-effects model which accommodates outlying and influential observations and leads to robust statistical inference by down-weighting the effect of outlying and influential studies. The proposed model produces robust point estimates of sensitivity and specificity compared to the standard models, and also generates point and interval estimates of sensitivity and specificity similar to those of the standard models in the absence of outlying or influential studies. / Thesis / Doctor of Philosophy (PhD) / Diagnostic tests vary from the noninvasive rapid strep test used to identify whether a patient has a bacterial sore throat to the much more complex and invasive biopsy test used to examine the presence, cause, and extent of a severe condition, say cancer. Meta-analysis is a widely used statistical method that synthesizes evidence from several studies. In this thesis, we develop novel statistical methods extending the traditional methods for meta-analysis of diagnostic test accuracy studies. Our proposed methods address the issues of modelling asymmetrical data, identifying outlier studies, and optimally accommodating these outlying studies in a meta-analysis of diagnostic test accuracy studies. Using both real-life and simulated datasets, we show that our proposed methods perform better than conventional methods in a wide range of scenarios. Therefore, we believe that our proposed methods are essential for methodologists, clinicians and health policy professionals in judging which diagnostic test is appropriate for diagnosing patients. Statistical methods meta-analysis diagnostic test accuracy studies outlying studies influential studies skewness random-effects model robust models sensitivity specificity
95	Model-based Tests for Standards Evaluation and Biological Assessments Li, Zhengrong 27 September 2007 Implementation of the Clean Water Act requires agencies to monitor aquatic sites on a regular basis and evaluate the quality of these sites. Sites are evaluated individually even though there may be numerous sites within a watershed. In some cases, sampling frequency is inadequate and the evaluation of site quality may have low reliability. This dissertation evaluates testing procedures for determination of site quality based on model-based procedures that allow other sites to contribute information to the data from the test site. Test procedures are described for situations that involve multiple measurements from sites within a region and single measurements when stressor information is available or when covariates are used to account for individual site differences. Tests based on analysis of variance methods are described for fixed effects and random effects models. The proposed model-based tests compare limits (tolerance limits or prediction limits) for the data with the known standard. When the sample size for the test site is small, using model-based tests improves the detection of impaired sites. The effects of sample size, heterogeneity of variance, and similarity between sites are discussed. Reference-based standards and corresponding evaluation of site quality are also considered. Regression-based tests provide methods for incorporating information from other sites when there is information on stressors or covariates. Extension of some of the methods to multivariate biological observations and stressors is also discussed. Redundancy analysis is used as a graphical method for describing the relationship between biological metrics and stressors. A clustering method for finding stressor-response relationships is presented and illustrated using data from the Mid-Atlantic Highlands. 
Multivariate elliptical and univariate regions for assessment of site quality are discussed. / Ph. D. model-based tests water quality assessment random effects models regression-based test redundancy analysis reduced-rank analysis fixed effects models
96	Statistical Methods for Multi-type Recurrent Event Data Based on Monte Carlo EM Algorithms and Copula Frailties Bedair, Khaled Farag Emam 01 October 2014 In this dissertation, we study processes which generate events repeatedly over the follow-up time of a given subject. Such processes are called recurrent event processes and the data they provide are referred to as recurrent event data. Examples include cancer recurrences, recurrent infections or disease episodes, hospital readmissions, the filing of warranty claims, and insurance claims for policy holders. In particular, we focus on multi-type recurrent event times, which arise when two or more different kinds of events may occur repeatedly over a period of observation. Our main objectives are to describe features of each marginal process simultaneously and to study the dependence among different types of events. We present applications to a real dataset collected from the Nutritional Prevention of Cancer Trial. The objective of the clinical trial was to evaluate the efficacy of selenium in preventing the recurrence of several types of skin cancer among 1312 residents of the Eastern United States. This dissertation comprises four chapters. Chapter 1 gives a brief background to the statistical techniques used to develop the proposed methodology. We cover some concepts and useful functions related to survival data analysis and present a short introduction to frailty distributions. The Monte Carlo expectation maximization (MCEM) algorithm and copula functions for multivariate variables are also presented in this chapter. Chapter 2 develops a multi-type recurrent events model with multivariate Gaussian random effects (frailties) for the intensity functions. In this chapter, we present nonparametric baseline intensity functions and a multivariate Gaussian distribution for the multivariate correlated random effects. 
An MCEM algorithm with MCMC routines in the E-step is adopted for the partial likelihood to estimate model parameters. Equations for the variances of the estimates are derived, and the variances are computed by Louis' formula. Predictions of the individual random effects are obtained because in some applications the magnitude of the random effects is of interest for a better understanding and interpretation of the variability in the data. The performance of the proposed methodology is evaluated by simulation studies, and the developed model is applied to the skin cancer dataset. Chapter 3 presents copula-based semiparametric multivariate frailty models for multi-type recurrent event data with applications to the skin cancer data. In this chapter, we generalize the multivariate Gaussian assumption of the frailty terms and allow the frailty distributions to have more features than the symmetric, unimodal properties of the Gaussian density. More flexible approaches to modeling the correlated frailty, based on copula functions, are introduced. Copula functions provide considerable flexibility, especially in allowing a variety of choices for the marginal distributions and correlation structures. Semiparametric intensity models for multi-type recurrent events based on a combination of the MCEM with MCMC sampling methods and copula functions are introduced. The combination of the MCEM approach and copula functions is a flexible and generally applicable approach for obtaining inferences on the unknown parameters of high-dimensional frailty models. Estimation procedures for fixed effects, nonparametric baseline intensity functions, copula parameters, and predictions for the subject-specific multivariate frailties and random effects are obtained. Louis' formula for the variance estimates is derived and applied. 
We investigate the impact of the specification of the frailty and random effect models on the inference of covariate effects, cumulative baseline intensity functions, prediction of random effects and frailties, and the estimation of the variance-covariance components. Performances of proposed models are evaluated by simulation studies. Applications are illustrated through the dataset collected from the clinical trial of patients with skin cancer. Conclusions and some remarks for future work are presented in Chapter 4. / Ph. D. MCEM algorithm cancer studies multi-type recurrent events multivariate frailty semiparametric model random effects copula survival analysis.
97	Modelos lineares mistos para dados longitudinais em ensaio fatorial com tratamento adicional / Mixed linear models for longitudinal data in a factorial experiment with additional treatment Rocha, Gilson Silvério da 09 October 2015 Assays that study crops through multiple measurements taken on the same sample unit over time, space, depth, and so on are common in agronomic experiments. 
This type of measurement yields what are called longitudinal data, for which statistical procedures capable of identifying possible patterns of variation and correlation among measurements are of great importance. The possibility of including random effects and modeling covariance structures makes the methodology of mixed linear models one of the most appropriate tools for this type of analysis. However, despite all the theoretical and computational development, the use of this methodology in more complex designs involving longitudinal data and additional treatments, such as those used in forage crops, still needs to be studied. The present work covered the use of the Hasse diagram and the top-down strategy in building mixed linear models for the study of successive forage cuts from an experiment involving boron fertilization in alfalfa (Medicago sativa L.) carried out in the experimental field of Embrapa Southeast Livestock. First, we considered a qualitative approach for all study factors and, owing to the complexity of the experimental design, chose to build the Hasse diagram. The inclusion of random effects and the selection of covariance structures for the residuals were based on the likelihood ratio test, calculated from parameters estimated by the restricted maximum likelihood method, and on the Akaike information criterion (AIC), the corrected Akaike information criterion (AICc) and the Bayesian information criterion (BIC). The fixed effects were tested with the Wald-F test and, because the variation sources associated with the longitudinal factor had significant effects, we performed a regression study. Building the Hasse diagram was essential for understanding and symbolically displaying the relationships among all factors present in the study, allowing the variation sources and their degrees of freedom to be decomposed and ensuring that all tests were correctly performed. 
The inclusion of a random effect associated with the experimental unit was essential for modeling the behavior of each unit. Furthermore, the heterogeneous variance components structure, incorporated into the residuals, efficiently modeled the heterogeneity of variances present in the different cuts of the alfalfa crop. The fit was checked with residual diagnostic plots. The regression study allowed us to evaluate the productivity of shoot dry matter (kg ha⁻¹) over successive cuts of the alfalfa crop, comparing fertilization with different boron sources and doses. We observed the best productivity for the combination of the source ulexite with the doses 3, 6 and 9 kg ha⁻¹ of boron. Ácido bórico Alfafa Alfalfa Boric acid Diagrama de Hasse Dry matter Efeitos aleatórios Hasse diagram Likelihood ratio test Matéria seca Random effects Regressão Regression Teste da razão de verossimilhanças Ulexita Ulexite
98	Modeling Patterns of Small Scale Spatial Variation in Soil Huang, Fang 11 January 2006 The microbial communities found in soils are inherently heterogeneous and often exhibit spatial variations on a small scale. Becker et al. (2006) investigate this phenomenon and present statistical analyses to support their findings. In this project, alternative statistical methods and models are considered and employed in a re-analysis of the data from Becker. First, parametric nested random effects models are considered as an alternative to the nonparametric semivariogram models and kriging methods employed by Becker to analyze patterns of spatial variation. Second, multiple logistic regression models are employed to investigate factors influencing microbial community structure as an alternative to the simple logistic models used by Becker. Additionally, the microbial community profile data of Becker were unobservable at several points in the spatial grid. The Becker analysis assumes that the data are missing completely at random and as such have relatively little impact on inference. In this re-analysis, this assumption is investigated and it is shown that the pattern of missingness is correlated with both metabolic potential and spatial coordinates and thus provides useful information that was previously ignored by Becker. Multiple imputation methods are employed to incorporate the information present in the missing data pattern and results are compared with those of Becker. spatial variations nested random effects models semivariogram models kriging methods multiple logistic regression models missing multiple imputation Soil microbiology Mathematical models Spatial analysis (Statistics)
99	A Simulation Study On The Comparison Of Methods For The Analysis Of Longitudinal Count Data Inan, Gul 01 July 2009 The longitudinal nature of the measurements and the counting process of the responses require regression models for longitudinal count data (LCD) to account for phenomena such as within-subject association and overdispersion. One common problem in longitudinal studies is missing data, which adds further difficulties to the analysis. The missingness can be handled with missing data techniques. However, the amount of missingness in the data and the missingness mechanism affect the performance of missing data techniques. In this thesis, among the regression models for LCD, we focus on the Log-Log-Gamma marginalized multilevel model (Log-Log-Gamma MMM) and the random-intercept model. The performance of the models is compared via a simulation study under three missing data mechanisms (missing completely at random, missing at random conditional on observed data, and missing not at random), two missingness percentages (10% and 20%), and four missing data techniques (complete case analysis, subject, occasion and conditional mean imputation). The simulation study shows that while the mean absolute error and mean square error values of the Log-Log-Gamma MMM are larger than those of the random-intercept model, both regression models yield parallel results. The simulation results confirm that the amount of missingness in the data and the missingness mechanism strongly influence the performance of missing data techniques under both regression models. Furthermore, while occasion mean imputation generally displays the worst performance, conditional mean imputation outperforms occasion and subject mean imputation and gives results parallel to those of complete case analysis. HA Statistics 36161
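To make two of the missing data techniques named in entry 99 concrete, the sketch below shows subject mean and occasion mean imputation on a small subjects-by-occasions table. It is a minimal illustration under the usual textbook definitions, not the thesis's simulation code; the function names and the toy counts are hypothetical, and missing values are represented by `None`.

```python
from statistics import mean


def subject_mean_impute(data):
    """Fill each missing value with the mean of that subject's observed values."""
    out = []
    for row in data:
        observed = [v for v in row if v is not None]
        fill = mean(observed) if observed else None  # nothing to average: leave missing
        out.append([fill if v is None else v for v in row])
    return out


def occasion_mean_impute(data):
    """Fill each missing value with the mean over subjects observed at that occasion."""
    fills = []
    for j in range(len(data[0])):
        observed = [row[j] for row in data if row[j] is not None]
        fills.append(mean(observed) if observed else None)
    return [[fills[j] if v is None else v for j, v in enumerate(row)] for row in data]


# Toy longitudinal counts (hypothetical): rows are subjects, columns are occasions
counts = [
    [2, 4, None],
    [1, None, 3],
    [3, 2, 6],
]
print(subject_mean_impute(counts))
print(occasion_mean_impute(counts))
```

Complete case analysis would instead drop the first two subjects entirely, which is why the techniques can behave differently as the missingness percentage grows.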
100	Assessing the Regularity and Predictability of the Age-Trajectories of Healthcare Utilization Turnbull, Margaret 20 August 2012 This research examines the viability of a need-based approach that models the age-trajectories of healthcare utilization. We propose a fundamentally different way of treating age in modeling healthcare use. Rather than treating age as a need indicator, we refocus modeling efforts to predicting the age-trajectories of healthcare use. Using inpatient hospital utilization data from the Discharge Abstract Database, first, we model the age-trajectories of the rate of hospital use employing a common functional form. Second, we assess variation in these age-trajectories using growth curve modeling. Third, we explain variation in these age-trajectories using census variables. Our analysis shows that the regional variation in the age-trajectories of the rate of inpatient hospital use is sufficient to justify this method, and could be partially explained using census variables. This indicates that modeling age-trajectories of healthcare use is advantageous, and the current need-based approach may benefit from this new modeling strategy. healthcare resource allocation healthcare utilization modeling need-based modeling need-based approaches age-patterns of morbidity modeling age-patterns of healthcare use Growth curve modeling Random effects modeling
