Global ETD Search

1	Confidence intervals on several functions of the components of variance in a one-way random effects experiment Banasik, Aleksandra Anna January 1900 (has links) Master of Science / Department of Statistics / Dallas E. Johnson / Variability is inherent in most data, and studying it often allows scientists to make more accurate statements about their data. One of the most popular ways of analyzing variance in data is a one-way ANOVA, which partitions the variability among observations into components corresponding to between groups and within groups. One then has $\sigma_Y^2 = \sigma_A^2 + \sigma_e^2$. Thus there are two variance components. In certain situations, in addition to estimating these components of variance, it is important to estimate functions of the variance components. This report is devoted to methods for constructing confidence intervals for three particular functions of variance components in the unbalanced one-way random effects model. To compare the performance of the methods, simulations were conducted using SAS® and the results were compared across several scenarios based on the number of groups, the number of observations within each group, and the value of $\sigma_A^2$. statistics variance components Statistics (0463)
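The partition of total variance into a between-group and a within-group component described in the entry above can be illustrated with a minimal sketch. It assumes a balanced design (equal group sizes) and uses the classical ANOVA method-of-moments estimators; the report itself addresses the unbalanced case and confidence intervals, which this sketch does not attempt. The function name `anova_variance_components` and the toy data are illustrative assumptions, not taken from the report.

```python
from statistics import mean


def anova_variance_components(groups):
    """ANOVA (method-of-moments) estimates for a balanced one-way
    random effects model y_ij = mu + A_i + e_ij.

    `groups` is a list of equally sized lists of observations.
    Hypothetical helper for illustration only.
    """
    a = len(groups)        # number of groups
    n = len(groups[0])     # observations per group
    if a < 2 or n < 2:
        raise ValueError("need at least 2 groups and 2 observations per group")
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes a balanced design")

    group_means = [mean(g) for g in groups]
    grand_mean = mean(group_means)  # equals the overall mean when balanced

    # Mean square between groups, on a - 1 degrees of freedom.
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (a - 1)
    # Mean square within groups, on a * (n - 1) degrees of freedom.
    ms_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    ) / (a * (n - 1))

    sigma2_e = ms_within
    # E[MS_between] = sigma2_e + n * sigma2_A; truncate negative estimates at 0.
    sigma2_a = max(0.0, (ms_between - ms_within) / n)
    return {
        "sigma2_A": sigma2_a,
        "sigma2_e": sigma2_e,
        "sigma2_Y": sigma2_a + sigma2_e,
    }


if __name__ == "__main__":
    toy = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(anova_variance_components(toy))
```

Truncating a negative between-group estimate at zero is one common convention; other treatments exist and the report may use a different one.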
2	Production traits and market values of Welsh black cattle Ulutas, Zafer January 1998 (has links) No description available. 338.1
3	Sources of Variability in a Proteomic Experiment Crawford, Scott Daniel 11 August 2006 (has links) (PDF) The study of proteomics holds the hope for detecting serious diseases earlier than is currently possible by analyzing blood samples in a mass spectrometer. Unfortunately, the statistics involved in comparing a control group to a diseased group are not trivial, and these difficulties have led others to incorrect decisions in the past. This paper considers a nested design that was used to quantify and identify the sources of variation in the mass spectrometer at BYU, so that correct conclusions can be drawn from blood samples analyzed in proteomics. Algorithms were developed which detect, align, correct, and cluster the peaks in this experiment. The variation in the m/z values as well as the variation in the intensities was studied, and the nested nature of the design allowed us to estimate the sources of that variation. The variation due to the machine components, including the mass spectrometer itself, was much greater than the variation in the preprocessing steps. This conclusion inspires future studies to investigate which part of the machine steps is causing the most variation. proteomics variance components mass spectrometer Statistics and Probability
4	Robustness of normal theory inference when random effects are not normally distributed Devamitta Perera, Muditha Virangika January 1900 (has links) Master of Science / Department of Statistics / Paul I. Nelson / The variance of a response in a one-way random effects model can be expressed as the sum of the variability among and within treatment levels. Conventional methods of statistical analysis for these models are based on the assumption of normality of both sources of variation. Since this assumption is not always satisfied and can be difficult to check, it is important to explore the performance of normal based inference when normality does not hold. This report uses simulation to explore and assess the robustness of the F-test for the presence of an among treatment variance component and the normal theory confidence interval for the intra-class correlation coefficient under several non-normal distributions. It was found that the power function of the F-test is robust for moderately heavy-tailed random error distributions. But, for very heavy tailed random error distributions, power is relatively low, even for a large number of treatments. Coverage rates of the confidence interval for the intra-class correlation coefficient are far from nominal for very heavy tailed, non-normal random effect distributions. Random Effects Models Non-normal random effects Variance Components Statistics (0463)
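The two quantities studied in the entry above, the F statistic for an among-treatment variance component and the intra-class correlation coefficient, can be sketched for a balanced one-way layout as follows. This is an illustrative assumption-laden sketch: the function name `f_statistic_and_icc` is hypothetical, only point estimates are computed, and the normal-theory confidence interval (which needs F-distribution quantiles) is deliberately left out.

```python
from statistics import mean


def f_statistic_and_icc(groups):
    """Return (F, ICC estimate) for a balanced one-way random effects layout.

    F = MS_between / MS_within; under normality and no among-treatment
    variance it follows an F distribution with (a - 1, a * (n - 1)) df.
    ICC = sigma2_A / (sigma2_A + sigma2_e), estimated from the mean squares.
    Hypothetical helper for illustration only.
    """
    a = len(groups)
    n = len(groups[0])
    if a < 2 or n < 2:
        raise ValueError("need at least 2 groups and 2 observations per group")
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes a balanced design")

    group_means = [mean(g) for g in groups]
    grand_mean = mean(group_means)

    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (a - 1)
    ms_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    ) / (a * (n - 1))
    if ms_within == 0:
        raise ValueError("within-group mean square is zero; F is undefined")

    f_stat = ms_between / ms_within
    # Plug-in ICC: (MSB - MSW) / (MSB + (n - 1) * MSW); can be negative.
    icc = (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)
    return f_stat, icc


if __name__ == "__main__":
    print(f_statistic_and_icc([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

A robustness study of the kind described would repeat this computation over many simulated data sets drawn from non-normal distributions and record rejection and coverage rates.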
5	Inheritance of Soluble Oligosaccharides in Soybean Seeds Huhn, Melissa Rose 14 August 2003 (has links) Sucrose, raffinose, and stachyose make up the majority of the carbohydrates in soybean seeds. While sucrose is a desirable component of soybean seeds, raffinose and stachyose are considered to be antinutritional factors, and eliminating or reducing them appears to be a beneficial endeavour. The major objective of this study was to determine the genetic mechanism controlling accumulation of soluble saccharides in soybean seeds. An experimental soybean line, V99-5089, with high sucrose content (14.6%) coupled with low amounts of raffinose (0.5%) and stachyose (0.4%), was the center of this study. Three populations were studied and segregation patterns were observed in F2:3 populations. All three sugars were extracted by an aqueous procedure and quantified by high pressure liquid chromatography (HPLC) using an NH2 column and a refractive index (RI) detector. Segregation of seeds from F2:3 plants indicated that a single, partially recessive gene reduced the stachyose content of soybean seeds from about 4% to less than 1%. Estimates of genetic variability indicate the presence of sufficient additive variation, in addition to the putative major gene, to warrant selection. Raffinose and stachyose were positively correlated with each other and each was negatively correlated with sucrose, while there was no significant correlation between total sugar content and the amount of any of the individual sugars. The agronomic traits evaluated do not appear to be adversely affected by the reduction of stachyose content. Additionally, a negative relationship was observed between inorganic phosphorus and stachyose content of soybean seeds, but no relationship was observed between stachyose and phytate phosphorus or between inorganic phosphorus and phytate phosphorus. / Master of Science soybean phytate phosphorus oligosaccharides HPLC variance components raffinosaccharides stachyose sucrose
6	Efficient Exact Tests in Linear Mixed Models for Longitudinal Microbiome Studies Zhai, Jing January 2016 (has links) The microbiome plays an important role in human health. The analysis of association between the microbiome and clinical outcomes has become an active direction in biostatistics research. Testing the microbiome effect on clinical phenotypes directly using operational taxonomic unit (OTU) abundance data is a challenging problem due to the high dimensionality, non-normality and phylogenetic structure of the data. Most studies focus only on describing the changes in microbe populations that occur in patients who have a specific clinical condition. Instead, a statistical strategy utilizing distance-based or similarity-based non-parametric testing, in which a distance or similarity measure is defined between any two microbiome samples, has been developed to assess association between microbiome composition and outcomes of interest. Despite the improvements, this test is still not easily interpretable and cannot adjust for potential covariates. A novel approach, a kernel-based semi-parametric regression framework, is applied to evaluate the association while controlling for covariates. The framework utilizes a kernel function, which is a measure of similarity between samples' microbiome compositions and characterizes the relationship between the microbiome and the outcome of interest. This kernel-based regression model, however, cannot be applied in longitudinal studies since it cannot model the correlation between repeated measurements. We propose microbiome association exact tests (MAETs) in a linear mixed model that can deal with longitudinal microbiome data. MAETs can test not only the effect of the overall microbiome but also the effect of a specific cluster of OTUs while controlling for the others, by introducing more random effects into the model.
The current methods for testing multiple variance components are based on either asymptotic distributions or the parametric bootstrap, which require a large sample size or a high computational cost, respectively. The exact (R)LRT test, a computationally efficient and powerful testing methodology, was derived by Crainiceanu. Since the exact (R)LRT can only be used to test one variance component, we propose an approach that combines the recent development of the exact (R)LRT with a strategy for reducing a linear mixed model with multiple variance components to the single-component case. Monte Carlo simulation studies show correctly controlled type I error and superior power in testing association between the microbiome and outcomes in longitudinal studies. Finally, the MAETs were applied to longitudinal pulmonary microbiome datasets to demonstrate that microbiome composition is associated with lung function and immunological outcomes. We also found two interesting genera, Prevotella and Veillonella, which are associated with forced vital capacity. Kernel-based regression Longitudinal study Microbiome composition Multiple variance components Public Health Exact tests
7	Estresse térmico e a qualidade do leite em vacas da raça Holandesa: uma abordagem genômica / Heat stress and milk quality in Holstein cows: a genomic approach Carrara, Eula Regina 04 April 2018 (has links) Heat stress causes losses in dairy farming. It is possible to select animals for heat tolerance, because there is genetic variation in production traits when they are evaluated in different climatic environments. For a correct genetic evaluation across different environments, the use of models that fit the data is fundamental. At the same time, genomic tools can aid in the evaluation and genetic selection processes for heat tolerance. In this context, two studies were developed. In the first, the aim was to study the variance functions of milk production and quality traits in relation to a temperature and humidity index (THI), fitted by Legendre polynomials of orders two to seven, to evaluate which function best fits the data, and to understand how the variance components and heritability coefficients behave as a function of THI. For this, 74,470 records of milk yield (MY, kg/day), somatic cell score (SCS), percentages of fat (FP), protein (PP), lactose (LP) and casein (CP), and fatty acid profile (saturated - SAT; unsaturated - UNSAT; monounsaturated - MONO; polyunsaturated - POLY; palmitic acid - C16:0; stearic acid - C18:0; and oleic acid - C18:1) from 5,224 Holstein cows were evaluated using random regression models at 167 distinct values of THI. With the exception of POLY, there was variation in all variance components along the THI for all traits. Estimates of heritability ranged from 0.07 to 0.43. For MY, FP, LP, SCS, SAT and C16:0, estimates of heritability decreased with increasing THI, and for CP, MONO, UNSAT and C18:1 they increased. For PP and C18:0, heritability estimates were higher at intermediate THI values and lower at the extremes. In the second study, the objective was to estimate the effect of SNP (single nucleotide polymorphism) markers on the traits MY, CP, SCS, SAT and UNSAT and to estimate the genomic breeding values (GEBV) at two levels of THI, one representing thermal comfort and the other heat stress. The breeding values of the animals for the two environments were used as phenotypes in the genome-wide association studies (GWAS). Genotypes of 1,157 cows for 60,671 SNP markers were used. Only cows with both phenotype and genotype were included in the genomic analyses. The GBLUP method (genomic best linear unbiased prediction) was used under a two-trait approach to perform the GWAS. To compare the environments, the 55 SNPs with the greatest additive variance explained in each situation and the 10% of animals with the highest GEBV were considered. In general, the SNPs showed small differences in the additive variance they explained between the comfort and heat stress environments. There was re-ranking of animals based on GEBV for MY (correlations 0.90 and 0.90), CP (correlations 0.88 and 0.86), SAT (correlations 0.88 and 0.87) and UNSAT (correlations 0.97 and 0.97). For SCS, only one animal was reclassified between the environments (correlations 1.00). The first study made it possible to evaluate the variance components along the entire environmental gradient. In turn, the second study made it possible to evaluate the individual variance of the molecular markers, complementing the genetic study of the traits. Although small, genetic differences among the animals between the environments considered could be detected. Dairy cattle Genetic markers Random regression Variance components
9	Construction of the gametic covariance matrix for quantitative trait loci analyses in outbred populations / Construção da matriz de covariância gamética para análises de QTL em populações exogâmicas Pita, Fabiano Veraldo da Costa 05 September 2003 (has links) Conselho Nacional de Desenvolvimento Científico e Tecnológico / The application of Quantitative Trait Loci (QTL) analyses in outbred populations is challenging because simplifying assumptions do not hold for these populations (e.g., the QTL alleles cannot be assumed fixed in different families, the number of QTL alleles segregating is not known a priori, and there is no gametic phase disequilibrium between a given genetic marker allele and a QTL allele). When the QTL genotypic effect is assumed random, the gametic covariance matrix must be calculated to perform QTL analyses in outbred populations. The accuracy of this matrix is important to obtain reliable estimates of QTL position or effect when applying QTL mapping, or of QTL genotypic values when applying Marker Assisted Genetic Evaluation. The objective of the first study was to evaluate the different strategies already implemented in software (SOLAR, LOKI, ESIP and MATVEC) to calculate the matrix of identical by descent (IBD) coefficients, which is required for QTL mapping analysis in outbred populations. SOLAR uses a method based on linear regression, LOKI and ESIP are both based on reverse peeling, and the sampler implemented in MATVEC samples segregation indicators. A typical F2 pedigree was simulated with a small (2 offspring) or a large (20 offspring) F2 family, and the flanking markers were simulated 2 cM, 5 cM, or 10 cM apart, with the QTL located in the middle of the interval. The ability of these strategies to deal with missing genetic marker information was evaluated by assuming one of the F2 parents with or without marker information. SOLAR failed to estimate the correct coefficients in almost all simulated situations, while LOKI showed problems when a large family was present in the pedigree. ESIP and the MATVEC sampler performed well in all situations, providing IBD coefficients close to the true ones. Therefore, ESIP and MATVEC are the more suitable choices when genetic analyses are carried out on complex pedigree structures. The objective of the second study was to evaluate the effect of using a better approximation of the inverse of the gametic covariance matrix on the genetic evaluation of large livestock populations. Efficient algorithms, based on tracing the QTL alleles of an individual to its grandmother or grandfather (probability of descent of a QTL, PDQ), can be used to construct the inverse of the gametic covariance matrix directly. But this inverse is an approximation when marker information is incomplete. Also, computing the exact PDQs becomes difficult when marker information is incomplete. In this study, the inverse of the gametic covariance matrix for a simulated outbred pedigree was calculated using the efficient algorithm, but the PDQs were calculated using a Markov chain Monte Carlo (MCMC) algorithm. This inverse was used to calculate the predicted genetic value of individuals through Marker Assisted Best Linear Unbiased Prediction (MABLUP). The effect of PDQ calculations using the MCMC algorithm on MABLUP accuracy was evaluated based on the realized response to selection for the simulated pedigree. The results showed that by estimating the PDQs by MCMC the loss in response due to using an approximate inverse could be reduced to about 20%, compared with 50% in previous studies. Further, response to MABLUP was greater when four bi-allelic markers were used, and the loss in response due to missing marker information was smaller with four markers than when only two bi-allelic markers were used. REML QTL mapping Variance components estimation Outbred populations Interval mapping Agricultural Sciences
10	Optimal statistical design for variance components in multistage variability models Loeza-Serrano, Sergio Ivan January 2014 (has links) This thesis focuses on the construction of optimum designs for the estimation of the variance components in multistage variability models. Variance components are the model parameters that represent the different sources of variability that affect the response of a system. A general and highly detailed way to define the linear mixed effects model is proposed. The extension considers the explicit definition of all the elements needed to construct a model. One key aspect of this formulation is that the random part is stated as a functional that individually determines the form of the design matrices for each random regressor, which gives significant flexibility. Further, the model is strictly divided into the treatment structure and the variability structure. This allows each structure to be defined separately while using the single rationale of combining, with few restrictions, simple design arrangements called factor layouts. To provide flexibility for considering different models, a methodology to find and select optimum designs for variance components is presented using MLE and REML estimators and an alternative method known as the dispersion-mean model. Different forms of information matrices for variance components were obtained. This was mainly done for the cases when the information matrix is a function of the ratios of variances. Closed form expressions for balanced designs for random models with a 3-stage variability structure, in crossed and nested layouts, were found. The nested case was obtained when the information matrix is a function of the variance components. A general expression for the information matrix for the ratios using REML is presented. An approach to using unbalanced models, which requires the use of general formulae, is discussed.
Additionally, the D-optimality and A-optimality criteria of design optimality are restated for the case of variance components, and a specific version of pseudo-Bayesian criteria is introduced. Algorithms to construct optimum designs for the variance components based on the aforementioned methodologies were defined. These algorithms have been implemented in the R language. The results are communicated using a simple, but highly informative, graphical approach not seen before in this context. The proposed plots convey enough detail for the experimenter to make an informed decision about the design to use in practice. An industrial internship allowed some of the results herein to be put into practice, although no new research outcomes resulted from it. Nonetheless, this is evidence of the potential for applications. Equally valuable is the experience of providing statistical advice and reporting conclusions to a non-statistical audience. 519.5
