11 |
Estimating the Reliability of Scores from a Social Network Survey Questionnaire in Light of Actor, Alter, and Dyad Clustering Effects
Walker, Timothy Dean 01 June 2018
Survey instruments utilized to quantify relationships, or aspects of relationships, may introduce multiple sources of nonindependence (clustered variance) into scores, including from actor, alter and dyadic sources. Estimating the magnitude of actor, alter and dyad nonindependence and their impact on the reliability of scores is an important step towards assuring quality data. Multilevel confirmatory factor analysis and the social relations model offer methods for quantifying the influence and estimating the reliability of multiple sources of clustered variance. The use of these methods is illustrated in the analysis of data gathered via a survey designed to quantify relational embeddedness in social network analyses.
|
12 |
Exploring the genetics of the efficiency of fertile AI dose production in rabbits
Tusell Palomero, Llibertat 03 October 2011
The general aim of this thesis was to analyse sources of variation in some of the most important components of fertile artificial insemination (AI) dose production, in order to explore the merits and limitations of different strategies for their genetic improvement in a paternal line of rabbits selected for growth rate. These components are seminal production and quality traits; male reproductive performance (fertility and prolificacy) is regarded as the final expression of the seminal characteristics, of the interactions among them, and of their interaction with the female.
Genetic analyses of growth rate and of the seminal traits involved in AI dose production were modelled using threshold and linear multiple-trait mixed models. The relationship between fertility and semen pH was analysed using either mixed or recursive mixed models. Male and female genetic contributions to fertility were estimated using additive or product threshold models, and the two models were compared by their ability to predict fertility data. The existence of a genotype × artificial insemination conditions interaction for the male effect on fertility and prolificacy was assessed under a character-state model. Finally, the product threshold model was used to estimate separately the effect of environmental temperature on the male and female contributions to fertility. All inferences in this thesis were made under a Bayesian approach.
Male libido and variables related to the quality of the ejaculate, such as the presence of urine and calcium carbonates in the ejaculate, individual sperm motility, semen pH and the suitability of the ejaculate for AI (which involves the subjective combination of several semen quality traits), were found to be lowly heritable, but repeatable. / Tusell Palomero, L. (2011). Exploring the genetics of the efficiency of fertile AI dose production in rabbits [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/11842
|
13 |
A Sequential Modeling Approach to Explain Complex Processes and Systems
Bae, Eric 12 August 2024
The ability to predict accurately the critical quality characteristics of aircraft engines is essential for modeling the degradation of engine performance over time. The acceptable margins for error grow smaller with each new generation of engines. This paper focuses on turbine gas temperature (TGT). The goal is to improve the first-principles predictions by incorporating the pure thermodynamics, as well as available information from the engine health monitoring (EHM) data and appropriate maintenance records. The first step in the approach is to develop the proper thermodynamics model to explain and to predict the observed TGTs. The resulting residuals provide the fundamental information on degradation. The current engineering models are ad hoc adaptations of the underlying thermodynamics that are not properly tuned by actual data. Interestingly, the pure thermodynamics model uses only two variables: atmospheric temperature and a critical pressure ratio. Its predictions of TGT are at least similar, and sometimes superior, to those of the ad hoc models. The next steps recognize that there are multiple sources of variability, some nested within others. Examples include version to version of the engine, engine to engine within version, route to route across versions and engines, maintenance cycle to maintenance cycle within engine, and flight segment to flight segment within maintenance cycle. The EHM data provide an opportunity to explain the various sources of variability through appropriate regression models. Different EHM variables explain different contributions to the variability in the residuals, which provides fundamental insights into the causes of the degradation over time. The resulting combination of the pure thermodynamics model with proper modeling based on the EHM data yields significantly better predictions of the observed TGT, allowing analysts to see the impact of the causes of the degradation much more clearly.
/ Doctor of Philosophy / AEM is a major civilian aircraft gas turbine engine manufacturer, serving different airliners and airlines. However, one of its newest models has had performance issues: the engines degraded faster than the company's in-house model had anticipated, leading to more frequent maintenance and causing significant financial losses to the company. The key objectives of our research project are to produce a model with higher predictive capability than AEM's in-house predictive model (DTGT), and to develop a model selection algorithm that allows direct comparisons among models of vastly different architecture. There are three major components to our research: 1) interdisciplinary studies merging the theory of thermodynamics and regression, 2) sequential modeling, and 3) the modified Mallows's Cp. We propose a layered sequential approach to regression modeling, in which one regression model is followed by another regression on the residuals of the previous model. We also propose the modified Mallows's Cp, a modification of Mallows's Cp, as a viable model selection criterion.
Our results demonstrated that the sequential approach both outperformed AEM's in-house model and was more useful than traditional multiple linear regression.
Our results also demonstrated that the modified Mallows's Cp prefers a smaller number of parameters than other standard model selection criteria without sacrificing the predictive capability of its models.
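The layered idea described above, one regression followed by another regression on the residuals of the previous one, can be sketched in a few lines. This is only an illustration under stated assumptions: the toy data, the variable names (`temp`, `ratio`, `ehm`) and the helper functions are hypothetical, and `mallows_cp` is the standard Mallows's Cp, not the modified criterion proposed in the thesis, whose form the abstract does not give.

```python
def ols_fit(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[r][p] / A[r][r] for r in range(p)]


def predict(X, beta):
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


def sequential_fit(stages, y):
    """Fit each design matrix in `stages` to the residuals left by the previous stage."""
    residual, betas = list(y), []
    for X in stages:
        beta = ols_fit(X, residual)
        residual = [r - f for r, f in zip(residual, predict(X, beta))]
        betas.append(beta)
    return betas, residual


def sse(values):
    return sum(v * v for v in values)


def mallows_cp(sse_p, sigma2_full, n, p):
    """Standard Mallows's Cp (not the thesis's modified version)."""
    return sse_p / sigma2_full - n + 2 * p


# Hypothetical toy data: two "physics" inputs for stage 1, one monitoring variable for stage 2.
temp = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ratio = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
ehm = [0.5, -0.2, 0.1, 0.4, -0.3, 0.2]
y = [3.0 + 2.0 * t + 1.5 * r + 4.0 * e for t, r, e in zip(temp, ratio, ehm)]

stage1 = [[1.0, t, r] for t, r in zip(temp, ratio)]
stage2 = [[1.0, e] for e in ehm]

_, res1 = sequential_fit([stage1], y)
betas, res2 = sequential_fit([stage1, stage2], y)
print(sse(res1), sse(res2))  # the second stage cannot increase the residual sum of squares
```

Because each later stage only sees what the earlier stage left unexplained, the first-stage coefficients keep their physical interpretation; the price is that the stages are not jointly optimal unless their predictors are orthogonal.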
|
14 |
Modelos de análise de dados de provas de ganho em peso de bovinos da raça Nelore / Models for analysing weight gain test data from Nelore cattle
Oliveira Junior, Braz Costa de. January 2009
Abstract: This work studied different ways of analysing data from feedlot weight gain tests (PGP), aiming to increase the response to selection by including kinship information in the estimates of genetic parameters, and to improve the accuracy of estimated breeding values and the final ranking of the animals. The traits analysed were weight at the end of the PGP (P378), weight gain after the adaptation period (G112), an index combining P378 and G112 (IPGP), the initial weight, and two intermediate weights. A total of 18,825 weight records from 4,758 animals were used. The finite-dimension models included the fixed effects of month and year of birth (1977 to 2006) and class of dam age at calving (2 to ≥12 years), plus the linear effect of the animal's age at the start of the PGP as a covariate. For the genetic effects, two models were considered: one with only the direct genetic effect and another that also included the maternal permanent environmental effect. The random regression models included, as random effects, the direct additive genetic effect and the direct and maternal permanent environmental effects and, as fixed effects, contemporary group, class of cow age at calving, and a Legendre orthogonal polynomial of the animal's age (quadratic regression) as covariates. Multi-trait analyses were carried out for comparison with the results of the random regression models. Random regression models (13 according to the Portuguese abstract, 15 according to the English one) applying polynomials of second to fifth order were considered to fit the direct additive genetic effect and the direct and maternal permanent environmental effects. The residual was modelled with 1, 3, 4, 6, and 9 variance classes. The model with 4 variance classes best described the behaviour of the trajectory for the effect...
Advisor: Lúcia Galvão de Albuquerque / Co-advisor: Maria Eugênia Zerlotti Mercadante / Committee: Danísio Prado Munari / Committee: Lenira El Faro Zadra / Master's degree
|
15 |
Modelos de regressão aleatória para características de qualidade de leite bovino / Random regression models to quality traits of bovine milk
Zampar, Aline 02 March 2012
Brazil is one of the largest milk producers in the world, but milk must be produced not only in quantity but also with a quality suitable for consumption and processing. Since Federal Normative Instruction 51 (IN-51, 2002) came into force, the quality of the national milk supply has been monitored against a required minimum standard. Among the aspects analysed are protein and fat contents and somatic cell count. The aim of this study was therefore to estimate variance components and heritability coefficients and to compare models with Legendre polynomials of different orders of fit, under random regression models, in order to identify the most appropriate model to describe changes in the variances associated with protein content, fat content, and somatic cell count of first-lactation Holstein cows. The data set comprised 27,988 records of fat and protein content and 27,883 records of somatic cell score from 4,945 cows, and the relationship matrix contained 30,843 animals. Four models were used, with orthogonal Legendre polynomials of orders 3 to 6 and homogeneous residual variance. The best-fitting models were those of the 5th and 6th orders for fat, the 4th order for protein, and the 4th and 6th orders for somatic cell score. Heritability estimates ranged from 0.07 to 0.56 for fat content, from 0.13 to 0.66 for protein content, and from 0.08 to 0.50 for somatic cell score across the models studied. According to these results, random regression models are suitable for describing variation in fat and protein contents and in somatic cell score as a function of the cow's stage of lactation.
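Random regression models of this kind regress each record on Legendre polynomials evaluated at the standardized stage of lactation. The sketch below builds those covariates for a single day in milk. It is a minimal illustration, assuming that "order" means the number of coefficients, that lactation runs from day 5 to day 305, and that the polynomials are normalized to unit norm on [-1, 1]; none of these choices is stated in the abstract, and the function name is hypothetical.

```python
import math


def legendre_covariates(dim, dim_min=5, dim_max=305, order=4):
    """Normalized Legendre covariates for one day in milk (dim)."""
    # Map days in milk onto the interval [-1, 1].
    x = -1.0 + 2.0 * (dim - dim_min) / (dim_max - dim_min)
    # Bonnet recurrence: (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x).
    p = [1.0, x]
    for k in range(1, order - 1):
        p.append(((2 * k + 1) * x * p[k] - k * p[k - 1]) / (k + 1))
    p = p[:order]
    # Scale each polynomial to unit norm on [-1, 1].
    return [math.sqrt((2 * k + 1) / 2.0) * pk for k, pk in enumerate(p)]


# Covariates at mid-lactation for a model with three coefficients.
print(legendre_covariates(155, order=3))
```

Each animal's random regression coefficients multiply these covariates, so the genetic variance at any day in milk follows from the covariance matrix of the coefficients and the covariate vector for that day.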
|
16 |
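Os componentes da variância do grau de endividamento de empresas industriais: evidências empíricas na América Latina / The variance components of the degree of indebtedness of industrial companies: empirical evidence from Latin America
Velho, Cassiane Oliveira 31 October 2008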
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / This dissertation analysed the composition of the variability of corporate indebtedness. The purpose was to identify groups of factors that influence this dispersion and the relative importance of that influence. To this end, the degree of indebtedness was measured with ten different indicators. Based on the database of the company Economática®, the study covered an international sample of 1,005 companies located in seven countries in Latin America and the United States, belonging to 21 different manufacturing sectors, over the period from 1986 to 2006. The variance components method, commonly used in agronomy, livestock, and genetics research, was adopted to understand the composition of the companies' degrees of indebtedness and to analyse the contribution of the Country, Sector, Year, and Firm effects to these capital structure indicators. Additionally, multiple comparison procedures were employed, using Tukey's honestly significant difference test for factors with fixed Country and Sector effe
|
17 |
Simulação de dados visando à estimação de componentes de variância e coeficientes de herdabilidade / Simulation of data aiming at the estimation of variance components and heritability
Coelho, Angela Mello 03 February 2006
The main aim of this work was to compare methods of estimating heritability for the 1-way classification (completely randomized design) and the 2-way crossed classification without interaction (randomized blocks). In both cases the narrow-sense definition of heritability ($h^2$) was used, given respectively by $h^2 = 4\sigma_t^2/(\sigma^2 + \sigma_t^2)$ and $h^2 = 4\sigma_t^2/(\sigma^2 + \sigma_t^2 + \sigma_b^2)$. Estimating $h^2$ for the 1-way classification therefore requires estimates of the variance components for the residual ($\sigma^2$) and for the treatment effect ($\sigma_t^2$). For the 2-way classification without interaction, the variance component for the block effect ($\sigma_b^2$) must be estimated as well. To this end, data sets with known heritability were produced by simulation. Two estimation methods were compared: the analysis of variance method and the maximum likelihood method. Eighty simulations were run, 40 for each classification. For both models, the 40 simulations were divided into 4 groups of 10 simulations. Each group used a different value of $h^2$ ($h^2$ = 0.10, 0.20, 0.30, and 0.40), and within each group 10 different values were fixed for $\sigma^2$ ($\sigma^2$ = 10, 20, 30, 40, 50, 60, 70, 80, 90, 100). The values of $\sigma_t^2$ were obtained from the heritability equations, and for the 2-way crossed classification without interaction $\sigma_b^2 = 20$ was fixed in all 40 cases. Each of the 80 simulations generated 1000 data sets, and hence 1000 estimates of each variance component and of the heritability, for which descriptive statistics and histograms were obtained. The methods were compared through these descriptive statistics and histograms, taking the parameter values used in the simulations as reference. For both models, the two methods gave similar estimates of $\sigma^2$, close to the true values. For the 1-way classification, the maximum likelihood method gave estimates that, on average, underestimated $\sigma_t^2$ and therefore $h^2$; this did not happen with the analysis of variance method. For the 2-way crossed classification without interaction, both methods gave similar estimates of $\sigma_t^2$, but the maximum likelihood method tended to underestimate $\sigma_b^2$ and therefore to overestimate $h^2$, which did not happen with the analysis of variance method. Hence, the analysis of variance method proved more reliable for estimating variance components and heritability for both classifications considered in this work.
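The 1-way part of this simulation design can be sketched as follows. This is an illustration, not the thesis's code: the number of treatments and replicates, the normal distributions, and all function names are assumptions, since the abstract does not report them. The sketch solves the heritability equation for the treatment variance, simulates balanced data, and applies the analysis of variance (method of moments) estimators.

```python
import random


def var_t_from_h2(h2, var_e):
    """Solve h2 = 4*var_t / (var_e + var_t) for the treatment variance."""
    return h2 * var_e / (4.0 - h2)


def simulate_one_way(n_treat, n_rep, var_t, var_e, rng):
    """Draw balanced data from y_ij = t_i + e_ij with random treatment effects."""
    data = []
    for _ in range(n_treat):
        t = rng.gauss(0.0, var_t ** 0.5)
        data.append([t + rng.gauss(0.0, var_e ** 0.5) for _ in range(n_rep)])
    return data


def anova_estimates(data):
    """ANOVA estimates of (var_e, var_t, h2) for a balanced 1-way layout."""
    a, r = len(data), len(data[0])
    grand = sum(sum(g) for g in data) / (a * r)
    means = [sum(g) / r for g in data]
    ms_treat = r * sum((m - grand) ** 2 for m in means) / (a - 1)
    ms_error = sum((y - m) ** 2 for g, m in zip(data, means) for y in g) / (a * (r - 1))
    var_e = ms_error
    var_t = (ms_treat - ms_error) / r  # can be negative in small samples
    return var_e, var_t, 4.0 * var_t / (var_e + var_t)


# One replicate of the h2 = 0.40, sigma^2 = 10 case with a hypothetical 50 x 10 layout.
rng = random.Random(1)
true_var_t = var_t_from_h2(0.40, 10.0)
sample = simulate_one_way(50, 10, true_var_t, 10.0, rng)
print(anova_estimates(sample))
```

Repeating the last three lines 1000 times and summarizing the estimates would mirror the comparison described above; the maximum likelihood estimator is omitted here to keep the sketch short.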
|
18 |
Componentes de variância e valores genéticos para as produções de leite do dia do controle e da lactação na raça holandesa com diferentes modelos estatísticos. / Variance components and breeding value for test day and lactation milk yields in holstein cattle with different statistical models.
Melo, Claudio Manoel Rodrigues de 15 July 2003
Covariance components and genetic parameters for milk yield were estimated by REML with animal models from 263,390 test-day records of 32,448 first-lactation Holstein cows collected from 1991 to 2001. Test-day repeatability (RM) and random regression (RR) models were compared with a 305-d lactation model (P305) for estimating breeding values. The random regression models used the five-parameter logarithmic Ali and Schaeffer function (AS) and the three-parameter exponential Wilmink function, in standard (W) and modified (W*, to reduce the range of the covariates and avoid convergence problems) form, to model the shape of the lactation curve. Heterogeneous error variance (EV) across classes of days in milk (cDIM) was considered when fitting the AS function. Heritability for milk yield from P305 (0.27) was smaller than the estimates for daily milk yield from the RM with or without the AS curve as a sub-model (0.30 and 0.43, respectively). Heritability estimates from univariate (0.22-0.36) and bivariate (0.23-0.33) models for test-day milk yields were smallest in early and late lactation. Genetic correlations for daily milk yield were higher between consecutive test days than between test days at the beginning and end of lactation. Heritability estimates from the AS (0.29-0.42) and W* (0.33-0.40) RR models were similar, but those from W (0.25-0.65) were higher than those from the other functions, particularly at the end of lactation. With the RR models, genetic correlations between daily milk yields on consecutive test days were close to unity but decreased as the interval between test days increased. Estimates of EV across cDIM were quite similar, ranging from 4.15 to 5.11 for the AS function. Standard deviations (SD) of bulls' EBVs for milk yield were similar for the AS and W* models and the RM. However, SDs of EBVs for bulls and cows were larger for test-day models than for P305, and for bulls they differed from the P305 values by -33.64 to 321.95, depending on progeny number. SDs of EBVs for bulls and cows were largest for the W model. Correlations between bulls' EBVs from P305 and from the other models increased with progeny number and ranged from 0.66 (W-P305) to 0.92 (AS-P305, W*-P305). Genetic trends were larger for RR models and smaller for the RM than for P305. The larger heritability estimates for test-day models and the high genetic correlations between test-day and lactation milk yields (0.86-0.99) indicate that test-day records could be used in genetic evaluations. Heterogeneous genetic correlations (0.64-1.00) between test-day milk yields across lactation did not support the assumption that test-day records are repeated measures of the same trait. The AS model with homogeneous EV was the most parsimonious and the best fitting among those evaluated, but the W* model gave more stable heritability estimates for daily milk yield across lactation. RR models provide more information than the RM and describe the shape of the lactation curve, from which EBVs for persistency can be derived. These results indicate the RR model with the AS function and homogeneous EV as the best alternative, among those studied, for genetic evaluation of milk yield from test-day records of Holstein cattle in Brazil.
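The two lactation curves named above differ only in the covariates built from days in milk. The sketch below gives the commonly cited forms of both; it is an assumption that the thesis used exactly these parameterizations, the decay constant `k=0.05` is a conventional choice rather than a value from the abstract, and the modified W* form is not reproduced because the abstract does not define it.

```python
import math


def ali_schaeffer_covariates(dim):
    """Five covariates of the Ali and Schaeffer curve for day in milk `dim` (dim > 0)."""
    g = dim / 305.0
    w = math.log(305.0 / dim)
    return [1.0, g, g * g, w, w * w]


def wilmink_covariates(dim, k=0.05):
    """Three covariates of the Wilmink curve: a + b*dim + c*exp(-k*dim)."""
    return [1.0, float(dim), math.exp(-k * dim)]


def curve_value(covariates, coefficients):
    """Expected yield as the inner product of covariates and regression coefficients."""
    return sum(x * b for x, b in zip(covariates, coefficients))


# Covariates on day 30 of lactation for each curve.
print(ali_schaeffer_covariates(30))
print(wilmink_covariates(30))
```

The standard Wilmink covariates span very different scales (1, days up to 305, and an exponential term below 1), which is one plausible reason a rescaled version would help convergence.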
|
19 |
Assessing variance components of multilevel models pregnancy data
Letsoalo, Marothi Peter January 2019
Thesis (M.Sc. (Statistics)) / Most social and health science data are longitudinal and multilevel in nature, which means that response data are grouped by attributes of some cluster. Ignoring the differences and similarities generated by these clusters leads to misleading estimates, which motivates assessing variance components (VCs) using multilevel models (MLMs) or generalised linear mixed models (GLMMs). This study explored and fitted teenage pregnancy census data gathered from 2011 to 2015 by the Africa Centre in KwaZulu-Natal, South Africa. Exploration of these data revealed a two-level pure hierarchical data structure, with teenage pregnancy status for some years nested within female teenagers. To fit these data, the effects on teenage pregnancy of census year (year) and three female characteristics, namely age (age), number of household members (idhhms), and number of children before the observation year (nch), were examined. Model building first fitted a logit generalised linear model (GLM) under the assumption that teenage pregnancy measurements are independent between females, and then fitted a GLMM (MLM) with a female random effect. The better-fitting GLMM indicated a 0.203 decrease in the log odds of teenage pregnancy for each additional year on year, while the GLM suggested a 0.21 decrease and a 0.557 increase for each additional year on age and year, respectively. A GLM with only the year effect gave a fixed-effect estimate 0.04 higher than that of the better-fitting GLMM. The inconsistency in the effect of year was caused by a significant female cluster variance of approximately 0.35, which was used to compute the VCs. Given the effect of year, the VCs suggested that 9.5% of the variation in teenage pregnancy lies between females, and that the similarity between observations on the same female is 0.095 (on a scale from 0 to 1). It was also revealed that year does not vary within females. Apart from the small differences between the estimates of the fitted GLM and GLMM, this work produced evidence that accounting for the cluster effect improves the accuracy of estimates.
Keywords: Multilevel Model, Generalised Linear Mixed Model, Variance Components, Hierarchical Data Structure, Social Science Data, Teenage Pregnancy
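The step from a cluster variance of about 0.35 to a between-female share of about 9.5% can be reproduced with the usual latent-variable intraclass correlation for a random-intercept logit model, in which the level-1 variance is fixed at π²/3. That the thesis used this particular formula is an assumption; the abstract only reports the two numbers.

```python
import math


def latent_icc(cluster_variance):
    """ICC for a random-intercept logit model with level-1 variance pi^2 / 3."""
    return cluster_variance / (cluster_variance + math.pi ** 2 / 3.0)


print(round(latent_icc(0.35), 3))  # 0.096
```

A cluster variance slightly below 0.35 (about 0.345) gives 0.095 exactly, which is consistent with the abstract describing the variance as approximate.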
|
20 |
Latent variable models for longitudinal twin data
Dominicus, Annica January 2006
Longitudinal twin data provide important information for exploring sources of variation in human traits. In statistical models for twin data, unobserved genetic and environmental factors influencing the trait are represented by latent variables. In this way, trait variation can be decomposed into genetic and environmental components. With repeated measurements on twins, latent variables can be used to describe individual trajectories, and the genetic and environmental variance components are assessed as functions of age. This thesis contributes to statistical methodology for analysing longitudinal twin data by (i) exploring the use of random change point models for modelling variance as a function of age, (ii) assessing how nonresponse in twin studies may affect estimates of genetic and environmental influences, and (iii) providing a method for hypothesis testing of genetic and environmental variance components. The random change point model, in contrast to linear and quadratic random effects models, is shown to be very flexible in capturing variability as a function of age. Approximate maximum likelihood inference through first-order linearization of the random change point model is contrasted with Bayesian inference based on Markov chain Monte Carlo simulation. In a set of simulations based on a twin model for informative nonresponse, it is demonstrated how the effect of nonresponse on estimates of genetic and environmental variance components depends on the underlying nonresponse mechanism. This thesis also reveals that the standard procedure for testing variance components is inadequate, since the null hypothesis places the variance components on the boundary of the parameter space. The asymptotic distribution of the likelihood ratio statistic for testing variance components in classical twin models is derived, resulting in a mixture of chi-square distributions.
Statistical methodology is illustrated with applications to empirical data on cognitive function from a longitudinal twin study of aging.
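The boundary problem mentioned above can be illustrated with the simplest case: testing whether a single variance component is zero, where the likelihood ratio statistic is commonly referred to a 50:50 mixture of a point mass at zero and a chi-square distribution with one degree of freedom. The sketch below computes the p-value under that assumption. The mixtures derived in the thesis for classical twin models may have different components and weights, and the function names are hypothetical.

```python
import math


def chi2_1_sf(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def boundary_lrt_pvalue(lrt):
    """p-value under a 50:50 mixture of chi-square(0) and chi-square(1)."""
    if lrt <= 0.0:
        return 1.0
    return 0.5 * chi2_1_sf(lrt)


def naive_lrt_pvalue(lrt):
    """p-value from the standard chi-square(1) reference, ignoring the boundary."""
    return chi2_1_sf(max(lrt, 0.0))


# The naive reference distribution doubles the p-value, so it is conservative.
print(naive_lrt_pvalue(3.0), boundary_lrt_pvalue(3.0))
```

In this single-component case the standard test is conservative: it fails to reject more often than it should, which is the inadequacy the abstract refers to.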
|