  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
111

Modélisation de la composante génétique des maladies humaines : Données familiales et Modèles Mixtes / Modelisation of Genetic Risk in Human Diseases : Family Data and Mixed Model

Dandine-Roulland, Claire 04 October 2016
The linear mixed model was formalized more than 60 years ago. It allows the estimation of fixed effects, as in the classical linear model, together with random effects. First used in animal genetics, this type of modelling has been widely used in human genetics in recent years. Mixed models have many uses in genetic analysis: linkage and association studies, heritability estimation, and parent-of-origin effect studies, on either population or family data.
The aim of my thesis is to investigate mixed-model-based methods, first on population genetic data and then on family genetic data.
In the first part of this thesis, we explore the statistical theory of linear mixed models and their uses in human genetics. We also adapt some methods for our own research. This work led to the development of an R package for analysing genetic data with mixed models in the R environment.
In the second part, we apply linear mixed models to estimate heritability in the Three-Cities (Trois-Cités) study, a French longitudinal population study. For this analysis, we have access to genotypes of the tag-SNPs typically used in genome-wide association studies, as well as birthplaces and several quantitative anthropometric traits such as height. The aim is to study the presence of population stratification in this study and how to account for it in the analysis. On the one hand, we analysed the geographic coordinates of the birthplaces and showed that, in some cases, classical methods do not correct adequately for population stratification. On the other hand, we analysed the anthropometric traits, in particular height, whose heritability we estimated at 39% in the Three-Cities study population.
In the last part, we focus on family data. We first show the gain in information that family data can bring to the search for causal variants. We then explore the use of mixed models on family data by applying some of the associated methods to the search for association signals in Multiple Sclerosis, an autoimmune disease, using a sample of about a hundred nuclear families with at least two affected siblings. We show that classical mixed-model methods cannot be used on this type of data without taking the ascertainment scheme into account: in our data, all families have at least two affected sibs. More investigation is needed to understand and correct this selection bias.
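As a rough illustration of what a heritability figure such as the 39% quoted above represents, heritability in a mixed model is commonly read as the share of total phenotypic variance attributed to the genetic random effect. The sketch below only computes that ratio; the function name and the two variance components are hypothetical values chosen so that the ratio equals 0.39, and they are not taken from the thesis.

```python
def heritability(var_genetic: float, var_residual: float) -> float:
    """Share of phenotypic variance attributed to the genetic random effect."""
    total = var_genetic + var_residual
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_genetic / total


# Hypothetical variance components (not from the thesis).
h2 = heritability(var_genetic=15.6, var_residual=24.4)
print(f"h2 = {h2:.2f}")
```

In practice the variance components would themselves be estimated (for example by restricted maximum likelihood) from genotype and phenotype data; that estimation step is not shown here.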
112

Novel statistical models for ecological momentary assessment studies of sexually transmitted infections

He, Fei 18 July 2016
Indiana University-Purdue University Indianapolis (IUPUI) / The research ideas in this dissertation are motivated by a large sexually transmitted infections (STIs) study, the IU Phone study, an ecological momentary assessment (EMA) study conducted by Indiana University from 2008 to 2013. EMA, a group of methods used to collect subjects' up-to-date behaviors and status, can increase the accuracy of this information by allowing a participant to self-administer a survey or diary entry, in their own environment, as close to the occurrence of the behavior as possible. The IU Phone study's high reporting level shows one of the benefits gained from introducing EMA in STI studies. In this prospective study lasting 84 days, participants undergo STI testing and complete EMA forms on project-furnished cellular telephones according to predetermined schedules. At pre-selected eight-hour intervals, participants respond to a series of questions to identify sexual and non-sexual interactions with specific partners, including partner name, relationship satisfaction and sexual satisfaction with this partner, the time of each coital event, and condom use for each event. STI lab results for all participants are also collected weekly. We are interested in several variables related to the risk of infection and to sexual or non-sexual behaviors, especially the relationships among the longitudinal processes of those variables. New statistical models and applications are established to deal with data that have complex dependence and sampling structures. The methodologies cover a range of statistical areas: generalized mixed models, multivariate models, and autoregressive and cross-lagged models in longitudinal data analysis; misclassification adjustment for imperfect diagnostic tests; and variable-domain functional regression in functional data analysis.
The contribution of our work is that we bridge methods from different areas with the EMA data of the IU Phone study and build a novel understanding of the associations among the variables of interest from different perspectives, based on the characteristics of the data. Besides the statistical analyses included in this dissertation, a variety of data visualization techniques provide informative support in presenting the complex EMA data structure.
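The abstract mentions misclassification adjustment for imperfect diagnostic tests without giving details. One standard textbook correction, the Rogan-Gladen estimator, is sketched below purely as an illustration of the idea; it is not claimed to be the dissertation's method, and the apparent prevalence, sensitivity and specificity values are hypothetical.

```python
def rogan_gladen(apparent_prevalence: float, sensitivity: float, specificity: float) -> float:
    """Correct an observed positive rate for known test sensitivity and specificity."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (apparent_prevalence + specificity - 1.0) / denom
    # Clamp to [0, 1]: sampling noise can push the raw estimate outside the valid range.
    return min(1.0, max(0.0, estimate))


# Hypothetical inputs, for illustration only.
corrected = rogan_gladen(apparent_prevalence=0.20, sensitivity=0.90, specificity=0.95)
print(round(corrected, 4))
```

The correction assumes sensitivity and specificity are known and constant across participants, which may not hold for repeated weekly testing.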
113

Generalized Estimating Equations for Mixed Models

Alnaji, Lulah A. 23 July 2018
No description available.
114

Genetic evaluation models and strategies for potato variety selection.

Paget, Mark Frederick January 2014
A series of studies are presented on the genetic evaluation of cultivated potato (Solanum tuberosum L.) to improve the accuracy and efficiency of selection at various stages of a breeding programme. The central theme was the use of correlated data, such as relationship information and spatial and across-trial correlations, within a linear mixed modelling framework to enhance the evaluation of candidate genotypes and to improve the genetic response to selection. Analyses focused on several socially and economically important traits for the enhancement of the nutritional value, disease resistance and yield of potato tubers. At the formative stages of a breeding scheme, devising a breeding strategy requires an improved understanding of the genetic control of target traits for selection. To guide a strategy that aims to enhance the micronutrient content of potato tubers (biofortification), univariate and multivariate Bayesian models were developed to estimate genetic parameters for micronutrient tuber content from a breeding population generated from crosses between Andean landrace cultivars. The importance of the additive genetic components and the extent of the narrow-sense heritability estimates indicated that genotypic 'individual' recurrent selection based on empirical breeding values, rather than family-based selection, is likely to be the most effective strategy in this breeding population. The magnitude of genetic correlations also indicated that simultaneous increases in the important tuber minerals iron and zinc could be achieved. Optimising selection efficiency is an important ambition of plant breeding programmes. Reducing the level of candidate replication in field trials may, under certain circumstances, contribute to this aim. 
Empirical field data and computer simulations inferred that improved rates of genetic gain with p-rep (partially replicated) testing could be obtained compared with testing in fully replicated trials at the early selection stages, particularly when testing over two locations. P-rep testing was able to increase the intensity of selection and the distribution of candidate entries across locations to account for G×E effects was possible at an earlier stage than is currently practised. On the basis of these results, it was recommended that the full replication of trials (at the first opportunity, when enough planting material is available) at a single location in the early stages of selection should be replaced with the partial replication of selection candidates that are distributed over two locations. Genetic evaluation aims to identify genotypes with high empirical breeding values (EBVs) for selection as parents. Using mixed models, spatial parameters to target greater control of localised field heterogeneity were estimated and variance models to account for across-trial genetic heterogeneity were tested for the evaluation of soil-borne powdery scab disease and tuber yield traits at the early stages of a selection programme. When spatial effects improved model fit, spatial correlations for rows and columns were mostly small for powdery scab, and often small and negative for marketable and total tuber yield suggesting the presence of interplot competition in some years for tuber yield traits. For the evaluation of powdery scab, genetic variance structures were tested using data from 12 years of long-term potato breeding METs (multi-environment trials). A simple homogeneous correlation model for the genetic effects was preferred over a more complex factor analytic (FA) model. 
Similarly, for the MET evaluation of tuber yield at the early stages, there was little benefit in using more complex FA models, with simple correlation structures generally the most favourable models fitted. The use of less complex models will be more straightforward for routine implementation of potato genetic evaluations in breeding programmes. Evaluations for (marketable) tuber yield were extended to multi-location MET data to characterise both genotypes and environments, allowing a re-evaluation of New Zealand MET selection strategies aimed at broad adaptation. Using a factor analytic mixed model, results indicated that the programme’s two main trial locations in the North and the South Islands optimised differentiation between genotypes in terms of G×E effects. There was reasonable performance stability of genotypes across test locations and evidence was presented for some, but limited, genetic progress of cultivars and advanced clonal selections for tuber marketable yield in New Zealand over recent years. The models and selection strategies investigated and developed in this thesis will allow an improved and more systematic application of genetic evaluations in potato selection schemes. This will provide the basis for well informed decisions to be made on selection candidates for the genetic improvement of potato in breeding programmes.
115

Sex-specific Habitat Use and Responses to Fragmentation in an Endemic Chameleon Fauna

Shirk, Philip 25 July 2012
Chameleons are an understudied taxon facing many threats, including collection for the international pet trade and habitat loss and fragmentation. A recent field study reports a highly female-biased sex ratio in the Eastern Arc endemic Usambara three-horned chameleon, Trioceros deremensis, a large, sexually dimorphic species. This species is collected for the pet trade, and local collectors report that males bring a higher price because only this sex has horns. Thus, sex ratios may vary due to differential rates of survival or harvesting. Alternatively, they may simply appear to be skewed if differences in habitat use bias detection of the sexes. Another threat facing chameleons is habitat loss and fragmentation. Despite enormous amounts of research, the factors of fragmentation that different species respond to are still under debate. Understanding these responses is important for current mitigation efforts as well as for predicting how species will respond to future habitat alteration and climate change. My study suggests that differences in survival and detection may explain much of the observed seasonal sex skew in adult T. deremensis. Within fragmented habitat, chameleons consistently responded more to edge effects and vegetative characteristics associated with fragmentation than to area or isolation effects. This may bode poorly for chameleon populations in the coming decades as climate change further alters vegetative communities and exacerbates edge effects.
116

Rezervování škod v rámci panelových dat / Claims reserving within the panel data framework

Gerthofer, Michal January 2015
This thesis investigates dependency between response variables within subjects in the framework of generalized linear models. Reserving in non-life insurance is a key factor in the financial position of a company. The text introduces the basic actuarial notation, terminology and methods. The main part focuses on the panel data framework, especially Generalized Linear Mixed Models (GLMM) and Generalized Estimating Equations (GEE), and their application to claims reserving. The aim of the thesis is to show the advantages, disadvantages and limitations of these approaches and to compare them on representative datasets, which were chosen according to the results of an analysis of the whole database. Significant attention is given to model selection and the diagnostics used for this purpose. Finally, the results are summarized in tables and figures, and a comparison of the methods is provided.
117

Modélisation de la variabilité inter-individuelle dans les modèles de croissance de plantes et sélection de modèles pour la prévision / Modelling inter-individual variability in plant growth models and model selection for prediction

Baey, Charlotte 28 February 2014
The modelling of plant growth and development emerged at the end of the twentieth century at the intersection of three disciplines: agronomy, botany and computer science. After a first period in which a large number of models appeared, a new trend began in the last decade to give these models a rigorous mathematical and statistical formalism. The work in this thesis follows that approach and focuses on two main areas of development: (i) model evaluation and comparison, and (ii) inter-individual variability in plant populations.
In the first part of the thesis, we study the predictive capacity of plant growth models by applying a two-step methodology for building and evaluating models intended as predictive tools. In the first step, a sensitivity analysis identifies the most influential parameters so that a more robust version of each model can be developed; in the second step, the predictive capacities of the models are compared using appropriate criteria. This study is carried out on sugar beet but can be generalized to other species.
The second part of the thesis concerns inter-individual variability in plant populations, which can be very high because of genetic or environmental differences. This variability is rarely accounted for despite the major impact it can have at the agrosystem level. We propose to take it into account using (nonlinear) mixed-effects models, for which maximum likelihood parameter estimation relies on stochastic variants of the Expectation-Maximization algorithm based on Markov Chain Monte Carlo simulation.
We first apply this approach to organogenesis in sugar beet populations, and then develop an extension of the functional-structural plant growth model Greenlab from the individual to the population scale, applied to sugar beet and rapeseed.
118

Méthodes de méta-analyse pour l’estimation des émissions de N2O par les sols agricoles / Meta-analysis methods to estimate N2O emissions from agricultural soils.

Philibert, Aurore 16 November 2012
The term meta-analysis refers to the statistical analysis of a large set of results from individual studies on the same topic. This approach is increasingly used in various fields, including agronomy. In this discipline, however, a literature review conducted as part of this thesis showed that meta-analyses were not always of good quality. Meta-analyses in agronomy very seldom study the robustness of their conclusions to the data used and to the statistical methods. The objective of this thesis is to demonstrate and illustrate the importance of sensitivity analysis in meta-analysis, using the estimation of N2O emissions from agricultural soils as an example. Nitrous oxide (N2O) emissions are estimated at the global scale by the Intergovernmental Panel on Climate Change (IPCC). N2O is a potent greenhouse gas with a global warming potential 298 times that of CO2 over a 100-year period. N2O emissions are characterized by strong spatial and temporal variability. Two databases are used in this work: that of Rochette and Janzen (2005) and that of Stehfest and Bouwman (2006). They compile numerous measurements of N2O emissions from published studies around the world and played an important role in the IPCC's estimates of N2O emissions. The results show the value of random-effects models for estimating N2O emissions from agricultural soils. These models are well suited to the structure of the data (repeated observations at the same site for different fertilizer doses, with several sites considered). They make it possible to separate inter-site from intra-site variability and to estimate the effect of the nitrogen fertilizer dose on N2O emissions.
In this thesis, analysis of the sensitivity of the estimates to the shape of the relationship "N2O emission / nitrogen fertilizer dose" showed that an exponential relationship was the most appropriate. It therefore appears desirable to replace the IPCC's constant emission factor (1% emission whatever the nitrogen fertilizer dose) with a variable factor that increases with the dose. On the other hand, we did not identify any important difference between frequentist and Bayesian inference methods. Two approaches were proposed for including environmental variables and cropping practices in the N2O estimates. The first, the Random Forest method, handles missing data and gives the best predictions of N2O emissions. The second, based on random-effects models, takes these explanatory variables into account through one or several measurements of N2O emissions. It makes it possible to predict N2O emissions for untested doses, such as the unfertilized case, in farmers' fields. Its results, however, are sensitive to the experimental design used locally to measure N2O emissions.
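The contrast drawn above between a constant 1% emission factor and an exponential dose-response can be sketched numerically. In the sketch below the 1% factor and the 298 warming multiplier come from the abstract; the exponential coefficients `a` and `b` are hypothetical (the thesis's fitted values are not given here), and treating emissions as kg of N2O-N per kg of applied nitrogen is an assumption about units.

```python
import math

GWP_N2O_100YR = 298.0        # from the abstract: N2O relative to CO2 over 100 years
N2O_PER_N2O_N = 44.0 / 28.0  # mass of N2O per unit mass of nitrogen contained in it


def constant_factor_emission(dose_kg_n: float, factor: float = 0.01) -> float:
    """Fertilizer-induced N2O-N emission under a constant emission factor (1% by default)."""
    return factor * dose_kg_n


def exponential_emission(dose_kg_n: float, a: float = -0.5, b: float = 0.005) -> float:
    """Total N2O-N emission under an exponential dose-response (a, b are hypothetical)."""
    return math.exp(a + b * dose_kg_n)


def implied_emission_factor(dose_kg_n: float) -> float:
    """Fertilizer-induced emission per kg N applied, under the exponential model."""
    induced = exponential_emission(dose_kg_n) - exponential_emission(0.0)
    return induced / dose_kg_n


def co2_equivalent(n2o_n_kg: float) -> float:
    """Convert kg N2O-N to kg CO2-equivalent using the 100-year multiplier."""
    return n2o_n_kg * N2O_PER_N2O_N * GWP_N2O_100YR


for dose in (50.0, 100.0, 200.0):
    print(dose, constant_factor_emission(dose), round(implied_emission_factor(dose), 5))
```

With any positive `b`, the implied factor rises with the dose, which is the qualitative behaviour the abstract argues for; the magnitudes printed here carry no empirical meaning.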
119

Tropical forage breeding from classic to new genomic tools: an example with interspecific tetraploid Urochloa spp. hybrids / Melhoramento de forrageiras tropicais do clássico as modernas ferramentas genômicas: um exemplo em híbridos interespecíficos tetraploides de Urochloa spp.

Matias, Filipe Inácio 05 December 2018
A tropical forage breeding program contains several peculiarities, especially when it involves polyploid species and facultative apomixis. Despite their importance, there is still a lack of information on genetic studies of critical forage traits and on the employment of genomic tools when compared to other crops and temperate forages. The genus Brachiaria is the most important for forage in tropical regions mainly beef production. The commercial species in this genus are excellent perennial forage, and the identification of superior genotypes depends on the selection of many characteristics under complex genetic control, with high cost and time-consuming evaluation. Therefore, the knowledge about uses and applications of classic and genomic tools in forage traits may be useful to support breeding programs and the development of new cultivars. In this context, the aim was to evaluate several different classic and genomic tools to be employed as selection strategies in a traditional tropical forage breeding program. A panel of tetraploid hybrids obtained from crossing Urochloa brizantha x Urochloa ruziziensis was phenotyped and genotyped to evaluate genetic parameters and perform genomic studies. The classic phenotypic analysis showed no clear trend of the importance of additive and non-additive genetics effects for agronomical and nutritional traits. The Mulamba and Mock index should be used in the univariate level, due to the promotion of a more balanced response to selection for all traits in the multivariate selection. In the genomic extraction and evaluations, the reads that were aligned to a \'mock\' reference genome, created from GBS data of the cultivar \'Marandu\', had more SNP discovered compared to the closest true reference genomes, Setaria viridis and S. italica. We recommended different thresholds of sample depth and genotype quality (GQ) to eliminate poor quality reads without introducing genotype bias. 
Cross-validation revealed that missing genotypes were imputed with a median accuracy of 0.85 using the Random Forest algorithm to produce a complete genotype matrix, regardless of heterozygote frequency. The genome-wide association analysis (GWAS) revealed candidate genes associated with many tropical forage traits across all cutting seasons, which could be the first step toward marker-assisted selection (MAS). Moreover, our results suggest that accounting for allele dosage is essential, since the tetraploid level provided more information about the true biological state. Therefore, our findings revealed the complexity of the genetic architecture of Urochloa spp. traits and provided important insights for the application of GWAS in polyploid species. The genomic selection analysis revealed that GBLUP-A (additive) and GBLUP-AD (additive + dominance) showed similar prediction abilities in both single- and multi-trait models. Conversely, combining GBLUP-AD and tetraploid information could improve the selection coincidence. Furthermore, the multi-trait validation scheme 2 (VS2), in which one trait is not evaluated for some individuals, increased prediction ability by up to 30%. Therefore, it is a useful strategy for traits with low heritability. Overall, all genomic selection models considered provided greater genetic gains than phenotypic selection. Similarly, allele dosage combined with additive, dominance, and multi-trait factors increased the accuracy of genomic prediction models for interspecific polyploid hybrids. Finally, genomic tools should be used in forage breeding programs to reduce cost and time. / A tropical forage breeding program has several peculiarities, especially when it involves polyploid species and facultative apomixis. 
Despite its importance, information on genetic studies of forage traits and on the use of genomic tools is currently lacking compared with other crops and temperate forages. The genus Brachiaria is the most important for establishing pastures in tropical regions, mainly for beef production. The commercial species of this genus are excellent perennial forages, and identifying superior genotypes depends on selecting for many traits under complex genetic control, with costly and time-consuming evaluation. Therefore, knowledge of the uses and applications of classical and genomic tools for forage traits can help support breeding programs and the development of new cultivars. In this context, the objective was to evaluate several classical and genomic tools to be employed as selection strategies in a traditional tropical forage breeding program. A panel of tetraploid hybrids obtained from the cross Urochloa brizantha x Urochloa ruziziensis was phenotyped and genotyped to evaluate genetic parameters and conduct genomic studies. For the classical phenotypic analysis, we concluded that there was no clear trend in the importance of additive and non-additive genetic effects for agronomic and nutritional traits. The Mulamba and Mock index should be used at the univariate level, because it promotes a more balanced response to selection for all traits in multivariate selection. In the extraction and genomic evaluations, the reads aligned to the 'simulated' reference genome, built from GBS data of the cultivar 'Marandu', gave the highest percentage of SNP marker discovery compared with the closest reference genomes, Setaria viridis and S. italica. 
We recommend different thresholds of read depth and genotype quality (GQ) to eliminate low-quality reads without introducing genotype-calling bias. Cross-validation revealed that missing genotypes were imputed with a median accuracy of 0.85 by the Random Forest algorithm to produce a complete genotype matrix, regardless of heterozygote frequency. The genome-wide association analysis (GWAS) revealed candidate genes associated with many tropical forage traits, which could be the first step toward marker-assisted selection (MAS). Moreover, our results suggest that accounting for allele dosage is essential, since the tetraploid level provides more information about the true biological state. Therefore, our findings reveal the complexity of the genetic architecture of Urochloa spp. traits and provide important insights for the application of GWAS in polyploid species. The genomic selection analysis reveals that GBLUP-A (additive) and GBLUP-AD (additive + dominance) showed similar prediction abilities in both single- and multi-trait models. On the other hand, combining GBLUP-AD and tetraploid information made it possible to improve the selection coincidence. Furthermore, the multi-trait validation scheme 2 (VS2), in which one trait is not evaluated for some individuals, can increase prediction ability by up to 30%. Therefore, it is a useful strategy for traits with low heritability. Overall, all genomic selection models considered provided greater genetic gains than traditional phenotypic selection. Likewise, allele dosage combined with additive, dominance, and multi-trait factors increased the accuracy of genomic prediction models for interspecific polyploid hybrids. 
Finally, genomic tools should be used in forage breeding programs to reduce cost and time.
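
The imputation accuracy quoted in the abstract comes from cross-validation, which can be illustrated by hiding genotypes that are actually known, imputing them, and counting how often the imputed value matches the hidden one. The sketch below is only an illustration under stated assumptions: it is not the thesis code, the function names (`mode_impute`, `masked_accuracy`, `median_accuracy`) are hypothetical, and a simple per-marker most-common-dosage rule stands in for the Random Forest imputer used in the thesis. Genotypes are assumed to be tetraploid allele dosages (0 to 4), with `None` marking a missing call.

```python
import random
import statistics
from collections import Counter


def mode_impute(matrix):
    """Fill None entries with the most common observed dosage of each marker.

    This is a deliberately simple stand-in imputer, not Random Forest.
    """
    n_markers = len(matrix[0])
    filled = [row[:] for row in matrix]
    for j in range(n_markers):
        observed = [row[j] for row in matrix if row[j] is not None]
        # Fall back to dosage 0 if a marker has no observed calls at all.
        fill = Counter(observed).most_common(1)[0][0] if observed else 0
        for row in filled:
            if row[j] is None:
                row[j] = fill
    return filled


def masked_accuracy(matrix, mask_fraction=0.1, seed=0):
    """Hide a fraction of the known genotypes, impute, and score exact matches."""
    rng = random.Random(seed)
    known = [(i, j) for i, row in enumerate(matrix)
             for j, g in enumerate(row) if g is not None]
    hidden = rng.sample(known, max(1, int(len(known) * mask_fraction)))
    masked = [row[:] for row in matrix]
    for i, j in hidden:
        masked[i][j] = None
    imputed = mode_impute(masked)
    hits = sum(imputed[i][j] == matrix[i][j] for i, j in hidden)
    return hits / len(hidden)


def median_accuracy(matrix, repeats=5, mask_fraction=0.1):
    """Median accuracy over several random maskings (one seed per repeat)."""
    return statistics.median(
        masked_accuracy(matrix, mask_fraction, seed) for seed in range(repeats)
    )
```

Swapping `mode_impute` for a stronger imputer leaves the masking and scoring logic unchanged, which is the point of the design: the evaluation does not depend on how the missing values are filled.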
120

snpReady and BGGE: R packages to prepare datasets and perform genome-enabled predictions / snpReady e BGGE: pacotes do R para preparar dados genômicos e realizar predições genômicas

Granato, Italo Stefanine Correia 07 February 2018 (has links)
The use of molecular markers increases selection efficiency and improves the understanding of genetic resources in breeding programs. However, as the number of markers grows, the marker data must be processed before they are ready to use. Also, to explore genotype x environment (GE) interaction in the context of genomic prediction, some covariance matrices need to be set up before the prediction step. Thus, to facilitate the introduction of genomic practices into breeding program pipelines, we developed two R packages. The first, snpReady, prepares data sets for genomic studies. It offers three functions for this purpose: organizing the data and applying quality control, building the genomic relationship matrix, and summarizing population genetics. Furthermore, we present a new imputation method for missing markers. The second, BGGE, was built to generate kernels for some GE genomic models and perform predictions. It consists of two functions (getK and BGGE). The former creates kernels for the GE genomic models, and the latter performs genomic predictions with some features for GE kernels that decrease the computational time. The features covered in the two packages offer a fast and straightforward option to help introduce and use genome analysis in the breeding program pipeline. / The use of molecular markers allows an increase in selection efficiency, as well as a better understanding of genetic resources in breeding programs. However, as the number of markers increases, they must be processed before being made available for use. In addition, to explore genotype x environment (GE) interaction in the context of genomic prediction, some covariance matrices need to be obtained before the prediction step. 
Thus, to facilitate the introduction of genomic practices in breeding programs, two R packages were developed. The first, snpReady, was created to prepare data sets for genomic studies. This package offers three functions to achieve this goal: organizing the data and applying quality control, building the genomic relationship matrix, and estimating population genetic parameters. In addition, we present a new imputation method for missing markers. The second package, BGGE, was created to generate kernels for some GE interaction genomic models and to perform genomic predictions. It consists of two functions (getK and BGGE). The first is used to create kernels for the GE models, and the second performs genomic predictions, with some features specific to GE kernels that reduce computational time. The features covered in the two packages offer a fast and direct option to support the introduction and use of genomic analyses at the various stages of the breeding program.
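
The abstract mentions building a genomic relationship matrix but does not say which formula snpReady uses. As a hedged illustration only, the sketch below computes one widely used construction, VanRaden's first method, for diploid markers coded as allele counts (0, 1, 2): each marker is centred by twice its allele frequency p, and the cross-products are scaled by 2 Σ p(1 − p). The function name `genomic_relationship_matrix` is hypothetical, and the code is written in Python rather than R and does not reproduce the package's interface.

```python
def genomic_relationship_matrix(genotypes):
    """Return G = Z Z' / (2 * sum(p * (1 - p))) for diploid allele counts.

    genotypes: list of individuals, each a list of allele counts (0, 1 or 2),
    with no missing values (impute or filter beforehand).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Frequency of the counted allele at each marker.
    freqs = [sum(ind[j] for ind in genotypes) / (2 * n) for j in range(m)]
    # Centre each marker by its expected allele count, 2p.
    z = [[ind[j] - 2 * freqs[j] for j in range(m)] for ind in genotypes]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic; G is undefined")
    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / scale for k in range(n)]
        for i in range(n)
    ]


# Three individuals, two markers; both markers have allele frequency 0.5.
G = genomic_relationship_matrix([[0, 2], [2, 0], [1, 1]])
```

In this toy example the two opposite homozygotes receive a negative relationship with each other, and the double heterozygote, which sits exactly at the marker means, has zero relationship with everyone.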
