Global ETD Search

21	A Hierarchical Spherical Radial Quadrature Algorithm for Multilevel GLMMS, GSMMS, and Gene Pathway Analysis Gagnon, Jacob A. 01 September 2010 (has links) The first part of my thesis is concerned with estimation for longitudinal data using generalized semi-parametric mixed models and multilevel generalized linear mixed models for a binary response. Likelihood based inferences are hindered by the lack of a closed form representation. Consequently, various integration approaches have been proposed. We propose a spherical radial integration based approach that takes advantage of the hierarchical structure of the data, which we call the 2 SR method. Compared to Pinheiro and Chao's multilevel Adaptive Gaussian quadrature, our proposed method has an improved time complexity with the number of functional evaluations scaling linearly in the number of subjects and in the dimension of random effects per level. Simulation studies show that our approach has similar to better accuracy compared to Gauss Hermite Quadrature (GHQ) and has better accuracy compared to PQL especially in the variance components. The second part of my thesis is concerned with identifying differentially expressed gene pathways/gene sets. We propose a logistic kernel machine to model the gene pathway effect with a binary response. Kernel machines were chosen since they account for gene interactions and clinical covariates. Furthermore, we established a connection between our logistic kernel machine with GLMMs allowing us to use ideas from the GLMM literature. For estimation and testing, we adopted Clarkson's spherical radial approach to perform the high dimensional integrations. For estimation, our performance in simulation studies is comparable to better than Bayesian approaches at a much lower computational cost. As for testing of the genetic pathway effect, our REML likelihood ratio test has increased power compared to a score test for simulated non-linear pathways. 
Additionally, our approach has three main advantages over previous methodologies: 1) our testing approach is self-contained rather than competitive, 2) our kernel machine approach can model complex pathway effects and gene-gene interactions, and 3) we test for the pathway effect adjusting for clinical covariates. Motivation for our work is the analysis of an Acute Lymphocytic Leukemia data set where we test for the genetic pathway effect and provide confidence intervals for the fixed effects. Gauss Hermite Quadrature Generalized Linear Mixed Model Generalized Semiparametric Mixed Model multilevel Spherical Radial splines Mathematics Statistics and Probability
22	Bayesian Hierarchical Latent Model for Gene Set Analysis Chao, Yi 13 May 2009 (has links) A pathway is a predefined set of genes that serves a particular cellular or physiological function. Ranking pathways by their relevance to a particular phenotype can help researchers focus on a few sets of genes. In this thesis, a Bayesian hierarchical latent model was proposed using a generalized linear random effects model. The advantage of the approach is that it can easily incorporate prior knowledge when the sample size is small and the number of genes is large. For the covariance matrix of a set of random variables, two Gaussian random processes were considered to construct the dependencies among genes in a pathway. One was based on the polynomial kernel and the other on the Gaussian kernel. These two kernels were then compared with a constant covariance matrix of the random effect using a ratio based on the joint posterior distribution under each model. For mixture models, log-likelihood values were computed at different values of the mixture proportion and compared among mixtures of the selected kernels and a point-mass density (or constant covariance matrix). The approach was applied to a data set (Mootha et al., 2003) containing expression profiles for type II diabetes, where the motivation was to identify pathways that can discriminate between normal patients and patients with type II diabetes. / Master of Science Pathway based analysis Point-mass density Probit regression model Bayesian hierarchical model Latent variable Generalized linear mixed model
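23	The Effect of Productive Vocabulary Knowledge on Second Language Comprehension: Behavioral and Neurocognitive Studies / 産出語彙知識が第二言語理解に与える影響：行動及び神経認知研究 Allalsumoto, Kenzatakara 25 March 2024 (has links) Kyoto University / New-system doctorate by coursework / Doctor of Informatics / Kō No. 25426 / Jōhaku No. 864 / 新制\|\|情\|\|145 (Main Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief examiner) Associate Professor Hiroaki Mizuhara, Professor Shin'ya Nishida, Professor Takatsune Kumada / Pursuant to Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM Second language comprehension Vocabulary Knowledge EEG Decoding Multivariate Pattern Analysis (MVPA) Generalized linear mixed model (GLMM)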
24	Structure and restoration of natural secondary forests in the Central Highlands, Vietnam Bui, Manh Hung 15 December 2016 (has links) (PDF) Introduction and objectives In Vietnam, the forest resources have been declining and degrading severely in recent years. The degradation has decreased the natural forest area, changed the forest structure seriously and reduced timber volume and biodiversity. From 1999 to 2005, the rich forest area has decreased 10.2%, whereas the poor secondary forest has increased dramatically by 20.7%. Forest structure plays an important role in forestry research. Understanding forest structure will unlock an understanding of the history, function and future of a forest ecosystem (Spies, 1998). The forest structure is an excellent basis for restoration measures. Therefore, this research is necessary to contribute to improving forest area and quality, reducing difficulties in forest management. The study also enhances the grasp of forest structure, structure changes after harvesting and fills serious gaps in knowledge. In addition, the research results will contribute to improving and rescuing the poor secondary forest and restoring it, approaching the old-growth forest in Vietnam. Material and methods The study was conducted in Kon Ka Kinh national park. The park is located in the Northeastern region of Gia Lai province, 50 km from Pleiku city center to the Northeast. The park is distributed over seven different communes in three districts: K’Bang, Mang Yang and Đăk Đoa. Data were collected from 10 plots of secondary forests (Type IIb) and 10 plots of primeval forests (Type IV). Stratified random sampling was applied to select plot locations. 1 ha plots were used to investigate gaps. 2000 m2 plots were used to measure overstorey trees such as diameter at breast height, total height, crown width and species names. 500 m2 subplots were used to record tree positions. 
For regeneration, 25 systematic 4 m2 subplots were established inside the 1 ha plots. After the data were collected in the field, they were analyzed using R and Excel. First, stand information such as density and volume was calculated, and then descriptive statistics were computed for the diameter and height variables. Linear mixed effect models were applied to analyze the differences in diameter and height between the two forest types and to check the effect of the random factor. Diameter and height frequency distributions were also generated and compared using permutational analysis of variance (PERMANOVA). Non-linear regression models were fitted for the diameter and height variables. Similar analyses were implemented for gaps. For the spatial point patterns of overstorey trees, replicated point pattern analysis techniques were applied. For biodiversity, richness and biodiversity indices were calculated, the indices were compared using linear mixed models, and biodiversity differences between the two forest types were tested again by permutational analysis of variance. For regeneration, the analyses included height frequency distribution generation, frequency difference testing, biodiversity indices for the regeneration and spatial distribution checking using a nonrandomness index. Results and discussion After analyzing the data, the following essential findings were obtained: Hypothesis H1 “The overstorey structure of secondary forests is more homogeneous and uniform than old-growth forests” is accepted. The secondary forest density is about 1.8 times higher than that of the old-growth forest. However, the volume is only 0.56 times as large. The average diameter and height of the secondary forest are smaller than those of the old-growth forest by 5.71 cm and 3.73 m, respectively. 
Linear mixed effect model results indicate that this difference is statistically significant and that the effect of the random factor (Section) is not important. Type IIb has many small trees and its diameter frequency distribution is quite homogeneous. The old-growth forest has more big trees. For both forest stages, the height frequency distribution is positively skewed. PERMANOVA results show that the frequency distribution differs significantly between the two forest types. Regression functions are also more variable in the old-growth forest, because all standard deviations of the parameters are greater there. Gap analysis results indicate that the number of gaps in the young forest is slightly higher, while the average gap size is much smaller. The gap frequency distribution differs significantly between the two types. In terms of the spatial point pattern of overstorey trees, the G-test and the pair correlation function results show that trees are distributed randomly in the secondary forest. In contrast, the spatial point patterns of trees are more regular and diverse in the old-growth forest. The spatial point pattern difference is not significant, as shown by a permutational t-test for the pair correlation function (pcf). Envelope function results indicate that the variation of the pcf in young forests is much lower than in the primary forests. Hypothesis H2 “The overstorey species biodiversity of the secondary forest is less than in the old-growth forest” is rejected. Results show that the number of species in the secondary forest is greater than in the old-growth forest: the richness of the secondary forest is 1.16 times higher. The Simpson and Shannon indices are slightly smaller in the secondary forest. The average Simpson index is 0.898 for the secondary forest and 0.920 for the old-growth forest. However, the difference is not significant. 
Species accumulation curves become relatively flatter on the right, meaning a reasonable number of plots have been observed. Estimated number of species from accumulation curves in two forest types are 105 and 95/ha. PERMANOVA results show that number of species and proportion of individuals in each species are significantly different between forest types. Hypothesis H3 “The number regenerating species of the secondary forest is less and they distribute more regularly, compared to the old-growth forest” is rejected. There are both similarities and differences between the two types. The regeneration density of the stage IIb is 22,930 seedlings/ha, greater than the old forest by 9,030 seedlings. The height frequency distribution shows a decreasing trend. Similar to overstorey, the richness of the secondary forest is 141 species, higher than the old-growth forest by 9 species. Biodiversity indices are not statistically different between two types. PERMANOVA results indicate that the number of species and the proportion of individuals for each species are also not significantly different from observed forest types. Nonrandomness index results show that the regeneration distributes regularly. Up to 95% of the plots reflect this distribution trend. Hypothesis H4 “Restoration measures (with and without human intervention) could be implemented in the regenerating forest” is accepted. The investigated results show that the secondary forest still has mother trees, and it has enough seedlings to restore. Therefore, restoration solutions with and without human intervention can be implemented. Firstly, forest protection should be applied. This measure is relevant to national park regulations in Vietnam. Rangers and other related organizations will be responsible for carrying out protection activities. These activities will protect forest resources from illegal logging, grazing and tourist activities. 
Environmental education and awareness-raising activities for indigenous people is also important. Another measure is additional and enrichment planting. It should focus on exclusive species of the overstorey in Type IIb or exclusive species of the primary forest. Selection of these species will lead to species biodiversity increase in the future. This also meets the purpose of the maximum biodiversity solution. Conclusion Forest resources play a very important role in human life as well as maintaining the sustainability of ecosystems. However, at present, they are under serious threat, particularly in Vietnam. Central Highland, Vietnam, where forest resources are still relatively good, is also threatened by illegal logging, lack of knowledge of people and so on. Therefore, it needs the hands of the people, especially foresters and researchers. Through research, scientists can provide the knowledge and understanding of the forest, including the structure and forest restoration. This study has obtained important findings. The secondary forest is more homogeneous and uniform, while the old-growth forest is very diverse. Biodiversity of the overstorey in the secondary forest is more than the primary. The number of regenerating species in the secondary forest is higher, but other indices are not statistically different between two types. The regeneration distribute regularly on the ground. The secondary forest still has mother trees and sufficient regeneration, so some restoration measures can be applied here. Findings of the study contribute to improve people’s understanding of the structure and the structural changes after harvesting in Kon Ka Kinh national park, Gia Lai. That is a key to have better understandings of the history and values of the forests. These findings and the proposed restoration measures address rescuing degraded forests in Central Highland in particular and Vietnam in general. 
And further, this is a promising basis for the management and sustainable use of forest resources in the future. Structure Restoration Tropical forest Linear mixed model Replicated point pattern analysis Spatial distribution Gaps Kon Ka Kinh Vietnam Kon Ka Kinh ddc:630 rvk:ZC 73564
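Note on entry 24: the abstract reports richness, Shannon and Simpson indices without stating the formulas used. The following is a minimal sketch of how such indices are commonly computed from per-species stem counts; the function name `diversity_indices`, the natural-log Shannon index and the Gini-Simpson form (1 minus the sum of squared proportions) are assumptions for illustration, not details confirmed by the thesis.

```python
import numpy as np

def diversity_indices(counts):
    """Return (richness, Shannon index, Gini-Simpson index) for one plot.

    `counts` holds the number of individuals per species (hypothetical input).
    """
    counts = np.asarray(counts, dtype=float)
    counts = counts[counts > 0]            # ignore species absent from the plot
    p = counts / counts.sum()              # relative abundance of each species
    richness = int(counts.size)            # number of species present
    shannon = float(-np.sum(p * np.log(p)))    # H' = -sum p_i ln p_i
    simpson = float(1.0 - np.sum(p ** 2))      # assumed form: 1 - sum p_i^2
    return richness, shannon, simpson
```

Under these assumed definitions, a plot with four equally abundant species has richness 4, Shannon index ln 4 and Simpson index 0.75.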
25	Metody výpočtu maximálně věrohodných odhadů v zobecněném lineárním smíšeném modelu / Computational Methods for Maximum Likelihood Estimation in Generalized Linear Mixed Models Otava, Martin January 2011 (has links) Title: Computational Methods for Maximum Likelihood Estimation in Generalized Linear Mixed Models Author: Bc. Martin Otava Department: Department of Probability and Mathematical Statistics Supervisor: RNDr. Arnošt Komárek, Ph.D., Department of Probability and Mathematical Statistics Abstract: When the maximum likelihood method is applied to generalized linear mixed models, the maximization problem can be analytically unsolvable. Iterative and approximate methods are used as a solution; the latter are the core of the thesis. The widely used methods are introduced in detail and in general form, with emphasis on algorithms useful in practical cases. The case of non-Gaussian random effects is also discussed. The approximate methods are demonstrated on real data sets. Conclusions about bias and consistency are supported by a simulation study. Keywords: generalized linear mixed model, penalized quasi-likelihood, adaptive Gauss-Hermite quadrature
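Note on entry 25: the analytically unsolvable step in a generalized linear mixed model is the integral over the random effects. As a rough illustration of the quadrature idea named in the keywords, the sketch below approximates one cluster's marginal likelihood in a random-intercept logistic model with ordinary (non-adaptive) Gauss-Hermite quadrature. The function name `cluster_marginal_likelihood` and its arguments are hypothetical, and the thesis's adaptive variant, which recentres and rescales the nodes per cluster, is not reproduced here.

```python
import numpy as np

def cluster_marginal_likelihood(y, eta_fixed, sigma, n_nodes=20):
    """Approximate the integral over b ~ N(0, sigma^2) of
    prod_j Bernoulli(y_j | expit(eta_fixed_j + b)) for one cluster."""
    y = np.asarray(y)
    eta_fixed = np.asarray(eta_fixed, dtype=float)
    # Nodes and weights for integrals of the form  int exp(-x^2) f(x) dx
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    b = np.sqrt(2.0) * sigma * nodes           # change of variables b = sqrt(2)*sigma*x
    eta = eta_fixed[:, None] + b[None, :]      # linear predictor at every node
    p = 1.0 / (1.0 + np.exp(-eta))             # inverse logit
    lik = np.prod(np.where(y[:, None] == 1, p, 1.0 - p), axis=0)
    return float(np.sum(weights * lik) / np.sqrt(np.pi))
```

Summing the logarithm of this quantity over clusters would give an approximate log-likelihood that a general-purpose optimizer could maximize over the fixed effects and `sigma`.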
26	Flexible models for hierarchical and overdispersed data in agriculture / Modelos flexíveis para dados hierárquicos e superdispersos na agricultura Sercundes, Ricardo Klein 29 March 2018 (has links) In this work we explored and proposed flexible models to analyze hierarchical and overdispersed data in agriculture. A semi-parametric generalized linear mixed model was applied to count data and compared with the main standard models, and a combined model that takes into account overdispersion and clustering through two separate sets of random effects was proposed to model nominal outcomes. For all models, the computational code was implemented in SAS and is available in the appendix. B-spline Beta distribution Combined model Generalized linear mixed model Likelihood Multinomial distribution
27	Novel Statistical Methods in Quantitative Genetics : Modeling Genetic Variance for Quantitative Trait Loci Mapping and Genomic Evaluation Shen, Xia January 2012 (has links) This thesis develops and evaluates statistical methods for different types of genetic analyses, including quantitative trait loci (QTL) analysis, genome-wide association study (GWAS), and genomic evaluation. The main contribution of the thesis is to provide novel insights in modeling genetic variance, especially via random effects models. In variance component QTL analysis, a full likelihood model accounting for uncertainty in the identity-by-descent (IBD) matrix was developed. It was found to be able to correctly adjust the bias in genetic variance component estimation and gain power in QTL mapping in terms of precision. Double hierarchical generalized linear models, and a non-iterative simplified version, were implemented and applied to fit data of an entire genome. These whole genome models were shown to have good performance in both QTL mapping and genomic prediction. A re-analysis of a publicly available GWAS data set identified significant loci in Arabidopsis that control phenotypic variance instead of mean, which validated the idea of variance-controlling genes. The works in the thesis are accompanied by R packages available online, including a general statistical tool for fitting random effects models (hglm), an efficient generalized ridge regression for high-dimensional data (bigRR), a double-layer mixed model for genomic data analysis (iQTL), a stochastic IBD matrix calculator (MCIBD), a computational interface for QTL mapping (qtl.outbred), and a GWAS analysis tool for mapping variance-controlling loci (vGWAS). statistical genetics quantitative trait loci genome-wide association study genomic selection genetic variance hierarchical generalized linear model linear mixed model random effect heteroscedastic effects model variance-controlling genes
29	Small Area Estimation for Survey Data: A Hierarchical Bayes Approach Karaganis, Milana 14 September 2009 (has links) Model-based estimation techniques have been widely used in small area estimation. This thesis focuses on the Hierarchical Bayes (HB) estimation techniques in application to small area estimation for survey data. We will study the impact of applying spatial structure to area-specific effects and utilizing a specific generalized linear mixed model in comparison with a traditional Fay-Herriot estimation model. We will also analyze different loss functions with applications to a small area estimation problem and compare estimates obtained under these loss functions. Overall, for the case study under consideration, area-specific geographical effects will be shown to have a significant effect on estimates. As well, using a generalized linear mixed model will prove to be more advantageous than the usual Fay-Herriot model. We will also demonstrate the benefits of using a weighted balanced-type loss function for the purpose of balancing the precision of estimates with their closeness to the direct estimates. Small-area estimation Hierarchical Bayes estimation area-specific geographical effects loss functions generalized linear mixed model weighted balanced-type loss function
