261 |
Predição genômica de híbridos de milho para caracteres de arquitetura oligogênica e sob diferentes parâmetros de penalização e correção de fenótipo / Genomic prediction of maize hybrids for traits with oligogenic architecture and under distinct shrinkage factors and phenotypic correction
Galli, Giovanni 29 June 2016
O alcance de altas produtividades em milho (Zea mays L.) depende do desenvolvimento de híbridos, o principal produto explorado nos programas de melhoramento. O sucesso na obtenção deste tipo de cultivar é conseguido com extensivo cruzamento de linhagens, seguido de avaliações para identificação das combinações de maior potencial. Geralmente, o melhorista tem à sua disponibilidade grande número de linhagens, possibilitando a realização de centenas a milhares de cruzamentos distintos, dos quais apenas uma pequena quantidade pode ser avaliada experimentalmente devido a limitação de tempo e recursos. Com o advento da Seleção Genômica (GS) tornou-se possível predizer o comportamento destes indivíduos não avaliados com base em seu genoma. No decorrer do processo de consolidação da GS várias metodologias foram propostas. A aptidão destas em predizer desempenhos fenotípicos é dependente da sua capacidade de acomodar a arquitetura genética das características e lidar com a multicolinearidade das matrizes genômicas. Neste sentido, métodos baseados em modelos mistos podem apresentar menor eficiência na predição de características oligogênicas devido à não capacidade de representar a distribuição real do efeito dos QTL. Além disso, a regularização das predições na presença de multicolinearidade é realizada por meio de um parâmetro de penalização (λ), o qual pode ser estimado de várias formas e consequentemente modificar a acurácia dos modelos. Além do aprimoramento dos métodos, outro aspecto importante é o procedimento de correção dos dados fenotípicos previamente à GS, o qual não é consenso na comunidade científica. 
Diante do exposto, este trabalho objetivou: verificar o efeito das formas de obtenção do λ (via REML na GS e pela herdabilidade da característica) e da correção do fenótipo (valor genotípico e média ajustada) na GS e avaliar a eficiência da modelagem diferencial de QTL de maior efeito na capacidade preditiva da metodologia G-BLUP, comparando-a ao LASSO Bayesiano, BayesB e G-BLUP convencional. Para isso foram utilizadas informações de híbridos simples de milho tropical avaliados em cinco locais para produtividade de grãos, altura de planta e espiga no ano de 2015. Os dados genômicos foram obtidos com a plataforma Affymetrix® Axiom® Maize Genotyping Array de 616.201 SNPs. Foram estudados diferentes cenários de GS considerando os fatores supracitados, sendo estes comparados entre si por suas capacidades preditivas e seletivas. Os resultados obtidos indicam que a correção do fenótipo e a forma de estimação de λ afetam a capacidade preditiva. O uso de valores genotípicos como correção dos fenótipos e estimação de λ via REML apresentaram os melhores resultados. Foi também observado que a modelagem de SNPs de maior efeito como fator fixo aumenta discretamente a capacidade preditiva da metodologia G-BLUP para as características oligogênicas avaliadas (altura de planta e espiga), sendo indicado o uso do G-BLUP convencional. Complementarmente, observou-se que a GS apresentou modesta eficiência na seleção de híbridos superiores sob intensidades moderadas. Entretanto, a sua alta capacidade de selecionar sob baixa intensidade pode ser amplamente explorada nos programas de melhoramento de milho visando a seleção precoce direta. / The achievement of high yield in maize (Zea mays L.) relies on the development of hybrids, which is the main product of breeding programs. The success in obtaining this kind of cultivar is achieved through extensive crossing of inbred lines followed by field trials to identify the combinations with greatest potential. 
Generally, breeders have a large number of inbred lines at their disposal and can perform hundreds to thousands of different crosses, of which only a small portion can be evaluated experimentally due to time and resource limitations. Genomic Selection (GS) has made it possible to predict the phenotypes of unevaluated individuals based on their genomes. As GS became established, many approaches were proposed. The ability of these approaches to predict phenotypic performance depends on their capacity to accommodate the genetic architecture of the traits and to deal with the multicollinearity of the genomic matrices. Hence, methods based on mixed model equations may predict oligogenic traits less efficiently because they cannot depict the real distribution of the QTL effects. Moreover, predictions are regularized in the presence of multicollinearity by a shrinkage factor (λ), which can be estimated in a number of ways and may affect the accuracy of the models. In addition to the improvement of the models, the correction of the phenotype used in the predictions is also important, and there is no consensus on it among researchers. Based on these facts, this study aimed to assess the effect of the estimation of λ (by REML in the GS model and by the heritability of the traits) and of the correction of the phenotype (genotypic value and adjusted mean) on GS. It also aimed to evaluate the effect of differential modeling of major markers on the prediction accuracy of G-BLUP, comparing it to Bayesian LASSO, BayesB and ordinary G-BLUP. To those ends, tropical maize single-crosses evaluated at five sites for grain yield, plant height and ear height in 2015 were used. The genomic data were obtained with the Affymetrix® Axiom® Maize Genotyping Array of 616,201 SNPs. Distinct GS scenarios were studied considering the aforementioned factors and were compared by their prediction and selection accuracy. 
The results suggest that the correction of the phenotype and the method used to estimate λ do affect prediction accuracy. Using genotypic values as the phenotypic correction and estimating λ by REML gave the best results. It was also observed that modeling major SNPs as fixed effects improved the prediction accuracy of G-BLUP only slightly for the oligogenic traits evaluated (plant and ear height). Therefore, ordinary G-BLUP should be the method of choice to predict these traits. Additionally, GS showed modest efficiency for selecting superior hybrids under moderate intensities. However, its high effectiveness at selecting under low intensities might be exploited in maize breeding programs for early direct selection.
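To make the role of the shrinkage factor concrete, the following is a minimal sketch of G-BLUP in which λ is derived from a trait heritability. It assumes the common convention λ = σ²e/σ²g = (1 − h²)/h² and the predictor ĝ = G(G + λI)⁻¹(y − μ); the abstract does not give the thesis's exact parameterization. The function names, the toy relationship matrix and the phenotypes are hypothetical, and the plain mean used for μ is a simplification.

```python
def lambda_from_h2(h2):
    """Shrinkage factor under the assumed convention lambda = (1 - h2) / h2."""
    if not 0.0 < h2 <= 1.0:
        raise ValueError("heritability must lie in (0, 1]")
    return (1.0 - h2) / h2


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gblup(G, y, h2):
    """Genomic values g_hat = G (G + lambda I)^-1 (y - mean(y))."""
    lam = lambda_from_h2(h2)
    n = len(y)
    mu = sum(y) / n  # simplification: plain mean instead of a GLS estimate
    lhs = [[G[i][j] + (lam if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    alpha = solve(lhs, [yi - mu for yi in y])
    return [sum(G[i][j] * alpha[j] for j in range(n)) for i in range(n)]


if __name__ == "__main__":
    # Hypothetical genomic relationship matrix and phenotypes for 3 hybrids.
    G = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]
    y = [10.0, 12.0, 8.0]
    for h2 in (0.2, 0.5, 0.8):
        print(h2, lambda_from_h2(h2), gblup(G, y, h2))
```

Lower heritability gives a larger λ and pulls the predicted genomic values more strongly toward zero, which is why the way λ is obtained (from h² or by REML within the model) can change predictive ability.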
|
262 |
Algorithm Optimizations in Genomic Analysis Using Entropic Dissection
Danks, Jacob R. 08 1900
In recent years, the collection of genomic data has skyrocketed and databases of genomic data are growing at a faster rate than ever before. Although many computational methods have been developed to interpret these data, they tend to struggle to process the ever-increasing file sizes that are being produced and fail to take advantage of the advances in multi-core processors by using parallel processing. In some instances, loss of accuracy has been a necessary trade-off to allow faster computation of the data. This thesis discusses one such algorithm and how it was changed to allow larger input file sizes and reduce the time required to achieve a result without sacrificing accuracy. An information-entropy-based algorithm was used as a basis to demonstrate these techniques. The algorithm efficiently dissects the distinctive patterns underlying genomic data, requiring no a priori knowledge, and thus is applicable in a variety of biological research applications. This research describes how parallel processing and object-oriented programming techniques were used to process larger files in less time and achieve a more accurate result from the algorithm. Through object-oriented techniques, the maximum allowable input file size was increased tenfold, from 200 MB to 2000 MB. Using parallel processing techniques allowed the program to finish processing data in less than half the time of the sequential version. The accuracy of the algorithm was improved by reducing data loss throughout the algorithm. Finally, adding user-friendly options enabled the program to use requests more effectively and further customize the logic used within the algorithm.
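As a rough illustration of the quantity an information-entropy-based method builds on, the sketch below computes the Shannon entropy of the k-mer distribution in a sequence and in sliding windows. It is not the thesis's algorithm, whose details the abstract does not give; the function names, window sizes and example sequence are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(seq, k=1):
    """Shannon entropy (in bits) of the k-mer distribution of seq."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if not kmers:
        return 0.0
    total = len(kmers)
    counts = Counter(kmers)
    # Sum of p * log2(1 / p) over the observed k-mers.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def windowed_entropy(seq, window, step, k=1):
    """Entropy of each window of length `window`, advancing by `step`."""
    return [shannon_entropy(seq[start:start + window], k)
            for start in range(0, len(seq) - window + 1, step)]


if __name__ == "__main__":
    # Hypothetical sequence: a low-complexity run followed by a mixed run.
    print(windowed_entropy("AAAAACGT", window=4, step=4))
```

Because each window is scored independently, the windows could be distributed across worker processes, which is the kind of parallelism the abstract describes in general terms.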
|
263 |
Genetics of disease resistance : application to bovine tuberculosis
Tsairidou, Smaragda January 2016
Bovine Tuberculosis (bTB) is a disease of significant economic importance, being one of the most persistent animal health problems in the UK and the Republic of Ireland and increasingly constituting a public health concern, especially for the developing world. Limitations of the currently available diagnostic and control methods, along with our incomplete understanding of bTB transmission, prevent successful eradication. This thesis addresses the development of a complementary control strategy which will be based on animal genetics and will allow us to identify animals genetically predisposed to be more resistant to disease. Specifically, the aim of my PhD project is to investigate the genetic architecture of resistance to bTB and demonstrate the feasibility of whole genome prediction for the control of bTB in cattle. Genomic selection for disease resistance in livestock populations will help reduce herd-level incidence and the severity of potential outbreaks. The first objective was to explore the estimation of breeding values for bTB resistance in UK dairy cattle, and to test these genomic predictions for situations when disease phenotypes are not available on selection candidates. Using dense SNP chip data, the results of Chapter 2 demonstrate that genomic selection for bTB resistance is feasible (h² = 0.23 (SE = 0.06)) and that bTB resistance can be predicted using genetic markers, with an estimated prediction accuracy of r(g, ĝ) = 0.33 in these data. It was shown that genotypes help to predict disease state (AUC ≈ 0.58) and that animals lacking bTB phenotypes can be selected based on their genotypes. In Chapter 3, a novel approach is presented to identify loci displaying heterozygote (dis)advantage associated with resistance to M. bovis, hypothesising underlying non-additive genetic variation, and these results are compared with those obtained from standard genome scans. 
A marker was identified suggesting an association between locus heterozygosity and increased susceptibility to bTB, i.e. a heterozygote disadvantage, with heterozygotes being significantly more frequent among the cases than among the controls (χ² = 11.50, p < 0.001). Secondly, this thesis focused on conducting a meta-analysis on two dairy cattle populations with bTB phenotypes and SNP chip genotypes, identifying genomic regions underlying bTB resistance and testing genomic predictions by means of cross-validation. In Chapter 4, exploration of the genetic architecture of the trait revealed that bTB resistance is a moderately polygenic, complex trait with clusters of causal variants spread across a few major chromosomes collectively controlling the trait. A region putatively associated with bTB resistance was identified on chromosome 6, and this chromosome as a whole was shown to contribute a major proportion (h²c = 0.051) of the observed variation in this dataset. Genomic prediction for bTB was shown to be feasible even when only distantly related populations are combined (r(g, ĝ) = 0.33 (SE = 0.05)), with the chromosomal heritability results suggesting that the accuracy arises from the SNPs capturing linkage disequilibrium between markers and QTL, as well as additive relationships between animals (~80% of estimated genomic h² is due to relatedness). To extend the analysis, in Chapter 5, high-density genotypes were inferred by means of genotype imputation, anticipating that these analyses would allow genomic regions associated with bTB resistance to be identified more closely and would increase the prediction accuracy. Genotype imputation was successful; however, using all imputed genotypes added little information. The limiting factor was found to be the number of animals and the trait definitions rather than the density of genotypes. 
Thirdly, a quantitative genetic analysis of actual Single Intradermal Comparative Cervical Test (SICCT) values collected during bTB herd testing was conducted to investigate whether selection for bTB resistance is likely to have an impact on the SICCT diagnostic test. This analysis demonstrated that the SICCT has a negligibly low heritability (h² = 0.0104 (SE = 0.0032)) and that any effect on the responsiveness to the test is likely to be small. In conclusion, breeding for disease resistance in livestock is feasible and we can predict the risk of bTB in cattle using genomic information. Further, putative QTLs associated with bTB resistance were identified, and exploration of the genetic architecture of bTB resistance revealed a moderately polygenic trait. These results suggest that, provided larger datasets with more phenotyped and genotyped animals become available, we can breed for bTB resistance and implement genomic selection in breeding programmes aiming to improve the disease status and overall health of the livestock population. Using genomics, this can be continued as the epidemic declines.
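The heterozygote-disadvantage statistic quoted in this abstract (χ² = 11.50, p < 0.001) can be checked with a short calculation. The sketch below assumes a test with one degree of freedom, as for a 2×2 table of heterozygote versus homozygote counts in cases and controls; the abstract does not state the degrees of freedom, and the function names are hypothetical.

```python
import math


def chi2_pvalue_1df(x):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


if __name__ == "__main__":
    # Statistic taken from the abstract; 1 df is an assumption.
    print(chi2_pvalue_1df(11.50))
```

Under that assumption the p-value comes out at roughly 7 × 10⁻⁴, in agreement with the reported p < 0.001.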
|
264 |
Make inferences about bacterial gene functions with the concept of neighborhood in silico / Faire des inférences sur les fonctions des gènes bactériens avec le concept de voisinage in silico
Wang, Tingzhang 15 December 2010
Avec l'accroissement du nombre de génomes séquencés, l'organisation de ces données brutes et des données dérivées, l'extraction de l'information et des connaissances associées défie l'imagination. La notion de voisinage a été d'abord été introduite pour l'organisation des données dans des bases de données relationnelles. Pour extraire des informations pertinentes à partir de données massives, différents types de voisinages ont été étudiés ici. Tout d'abord, avec l'analysedes correspondances (CA) et en utilisant le regroupement supervisé ("model clustering" MBC), la proximité mutuelle des éléments formant deux entités biologiques centrales, les gènes (codant les protéines) et les acides aminés a été analysée. Nous montrons par exemple que les protéines de Psychromonas ingrahamii, bactérie psychrophile extrêmes, sont regroupées en six classes, et qu'il y a une forte opposition entre le comportement de l'asparagine (N) et des acides aminés sensibles à l'oxygène, ce que nous expliquons en terms de résistance au froid. Ensuite, nous avons analysé la répartition entre les îlots génomiques (GI) et le squelette du génome de base à partir d'une nouvelle méthode combinant composition en bases et en gènes, caractéristiques GI et de briser les synténies. L'application de cette approche à E. coli et B. subtilis a révélé que cette nouvelle méthode permet d'extraire certaines régions significative, non publiées auparavant.Enfin, pour illustrer un voisinage fin, la régulation de l'expression d'un gène et son évolution, nous avons étudié la relation entre les régions en amont du gène et la zone codante du gène thrS de façon approfondie. Nous avons constaté que ces deux régions associées à un gène, se sont comportés différemment dans l'histoire évolutive. Certaines des régions en amont porteuses de la fonction non-essentielle de régulation (qui contrôle l'expression de gène) ont évoluédifféremment de la région codante. 
/ With more and more genomes being sequenced, organizing the raw and derived data and extracting information and knowledge from them has become a challenge. A key concept in this field is that of the neighborhood, especially with respect to the organization of data in relational databases. To extract information from bulk data, different kinds of neighborhoods were studied, and each yielded interesting results in the current study. Firstly, using Correspondence Analysis (CA) followed by Model-Based Clustering (MBC), two kinds of neighbors, i.e. genes (proteins) and amino acids, were analyzed, and it was found that proteins from Psychromonas ingrahamii cluster into six classes and that there is strong opposition between asparagine (N) and the oxygen-sensitive amino acids. Secondly, the relationship between genomic islands and the core genome (i.e. two closely linked neighbors spanning a large range on the chromosome) was studied with a new method combining composition, GI features and synteny breaks. Applied to E. coli and B. subtilis, this new method extracted some meaningful regions not published before. Thirdly, the relationship between the upstream and coding regions of the thrS gene (i.e. a case of two closely linked neighbors spanning a small range on the chromosome) was studied extensively. It was found that these two regions, although associated with one gene, behaved differently during evolutionary history. Some of the upstream regions bearing a non-essential function (i.e. regulation of gene expression) evolved more slowly than the coding region.
|
265 |
Du chromosome au gène par un criblage global des altérations génomiques dans la malignité pour isoler de nouvelles cibles thérapeutiques / From Chromosome to Gene by Mapping Chromosomal Abnormalities in Cancer in Order to Find Targeted Pharmaceutical Agents
Toujani, Saloua 05 June 2012
Le cancer est désormais considéré comme une maladie génomique de la cellule. Les moyens d’étude de l’oncogénome étaient basés sur les différentes modalités du caryotype, peu résolutif. L’application des techniques de micromatrices d’oligonucléotides, notamment l’aCGH, a permis une avancée majeure dans la caractérisation des génomes des cancers.La première partie de notre travail a porté sur les lymphomes de Burkitt (LB), caractérisés par une translocation entre un gène d'immunoglobuline et MYC. L’étude portait sur 12 tumeurs primaires et 15 lignées cellulaires. L’aCGH (44K et 244K), concordait avec les cytogénétiques morphologique et moléculaire (FISH) sauf pour les translocations. Plus de la moitié des variations du nombre de copies (<2Mb) étaient des polymorphismes (CNV). Les anomalies pathologiques (CNA) (n=136) intéressaient les régions suivantes : gains 1q, 13q, 7q, 8q, 2p, 11q et 15q ; pertes 3p, 4p, 4q, 9p, 6p, 17p, 6q, 11pterp13 et 14q12q21.3. Vingt régions minimales critiques (MCR) d’une taille varie entre 0.07-71.36 Mb, étaient délimitées. Trois MCR étaient identifiées sur le 1q : 1q21.1q25.2, 1q32.1 et 1q44. La région proximale de 1q21.1q25.2 était le siège d’une amplification, contenant entre autres les gènes BCA2, PIAS, BCL9. L’étude par transcriptome, sur 15 lignées, a démontré la surexpression uniquement de BCL9, remanié dans les LAL B et faisant partie de la voie de signalisation de MYC. Sur la région 11q23.1, le gain intéressait le gène POU2AF1 dont le messager était élevé. La MCR 13q31.3q32.1 était le siège d’une amplification contenant ABCC4, et le polycistron miR17-92. La corrélation des résultats d’aCGH à ceux du transcriptome et du mirnome ont démontré une surexpression du miR17-92 qui contrôle le développement des lymphocytes B et intervient dans la voie de signalisation de MYC. Sur le 9p21.3, la perte emportait le locus p16INK4A/p15INK4B. Le transcriptome avait démontré une sous expression de p15INK4B. 
Le locus p16INK4A/p15INK4B contrôle les 2 voies majeures, pRB et p53.La seconde partie de notre travail a consisté à étudier 17 tumeurs congelées de carcinomes adénoïdes kystiques (CAK) par aCGH 44K. Les CNA étaient validées par FISH et/ou MLPA. L’expression protéique était étudiée par immunohistochimie. Les pertes excédaient les gains (41 versus 24). La t(6;9)(q23;p22) récurrente dans les CAK était indétectable car équilibrée. Dans un seul cas, le der(6)t(6;9) est probablement présent sur le profil aCGH. Les MCR les plus fréquentes (-6q22 et -6q24) n’incluaient pas 6q23. Treize MCR étaient identifiées. La MCR délétée en 8q impliquait le miR-124A2 qui régule les gènes CDK6 et MMP2. Sur le 9p21.3, le locus p16INK4A/p15INK4B était de nouveau perdu. Des gains isolés étaient observés au niveau des locus CCND1, KIT/PDGFRA/KDR, MDM2 et JAK2. Le gène MDM2, qui était amplifié sous forme de double minutes, est un élément clé de l’axe p16INK4A-ARF-p53.Pour la troisième partie de notre travail nous avons étudié 60 tumeurs primaires d’adénocarcinomes pulmonaires (AD) de non fumeur par aCGH (244K). Dans 50/60 tumeurs, le nombre de MCR était de 14. Cinq MCR contenaient un seul gène (MOCS2, NSUN3, KHDRBS2, SNTG1 et ST18). Une MCR gagnée, 5q35, contenait le gène NSD1. Une amplification, sous forme de HSR et mise en évidence par FISH, intéressait l’oncogène FUS. Une PCR quantitative avait permis de confirmer la surexpression FUS. A notre connaissance, c’est la première étude qui incrimine le gène FUS dans la carcinogenèse de l’AD du non-fumeur. D’autres gènes étaient également impliqués : ARNT, BCL9, CDK4, p15INK4B, EGFR, ERBB2, MDM2, MDM4, MET, MYC, NKX2-1 et KRAS. Un clustering non supervisé avait permis de dégager un groupe avec un gain de MYC ; un autre groupe caractérisé par la perte des gènes suppresseurs RB et WRN et un dernier groupe caractérisé par un gain 7p et 7q, et présentait une fréquence élevée de mutations de l’EGFR. 
Dans 10/60, le nombre de CNA était très rare et aucune MCR n’était détectée. / Much of our current understanding of cancer is based on the hypothesis that it is a genetic disease, arising as a clone of cells that expands in an unregulated fashion because of somatically acquired mutations. High-throughput tools for nucleic acid characterization, such as array comparative genomic hybridization (aCGH), now provide the means to conduct comprehensive analyses of somatic anomalies in the oncogenome. In the first part of our work we carried out a fine mapping of additional chromosomal anomalies in Burkitt lymphoma (BL). The hallmark of this disease is the translocation t(MYC;IG). We applied whole-genome 244K and 44K oligonucleotide aCGH to 15 cell lines and 12 primary tumors of BL, respectively. Karyotype and FISH analyses were used to validate the aCGH results. As expected, all translocations remained undetectable with aCGH. More than half of the copy number alterations (CNAs) < 2 Mb were mapped to Mendelian CNVs, including GSTT1 and BIRC6. Somatic cell line-specific CNVs localized to the IG locus were consistently observed with the 244K aCGH platform. Among 136 CNAs, gains were found in 1q, 13q, 7q, 8q, 2p, 11q and 15q. Losses were found in 3p, 4p, 4q, 9p, 13q, 6p, 17p, 6q, 11pterp13 and 14q12q21.3. Twenty-one minimal critical regions (MCRs) (range 0.04–71.36 Mb) were delineated in tumors and cell lines. Three MCRs were localized to 1q: 1q21.1q25.2, 1q32.1 and 1q44. The proximal one was mapped to 1q21.1q25.2 with a 6.3 Mb amplicon (1q21.1q21.3) harboring BCA2, BCL9 and PIAS3. Only BCL9 showed a high transcript level in the oligonucleotide microarray gene expression analysis performed on the 15 cell lines. BCL9 has been implicated in the B-cell acute lymphoblastic leukemia (B-ALL) translocation t(1;14)(q21;q32) and is a member of the MYC pathway. The 13q31.3q32.1 MCR (89.58–96.81 Mb) contained an amplicon with several genes. 
The miR-17-92 cluster, upregulated in the mirnome analysis performed on the 15 cell lines, is the driver gene of the 13q MCR. The miR-17-92 cluster is a member of the MYC pathway. The 9p21.3 MCR harbored the p16INK4A/p15INK4B locus, which is downregulated. MYC activates ARF, a protein encoded by the p16INK4A/p15INK4B locus. In the second part of our work, 44K aCGH was applied to 17 frozen adenoid cystic carcinomas (ACC) to delineate at high resolution the CNAs associated with ACC. aCGH results were validated with FISH and/or MLPA. Protein expression was screened by immunohistochemistry. The translocation t(6;9)(q23;p23p24)/MYB-NFIB, recurrent in ACC, was not detected with aCGH. In one case, the der(6)t(6;9) was suspected in the aCGH pattern. There were recurrent gains at 7p15.2, 17q21–25 and 22q11–13, and recurrent losses at 1p35, 6q22–25, 8q12–13, 9p21, 12q12–13 and 17p11–13. Thirteen MCRs were detected. The recurrent deletion at 8q12.3–13.1 contained the miRN124A2 gene, whose product regulates MMP2 and CDK6. The 9p21.3 MCR harbored the p16INK4A/p15INK4B locus, which was deleted. On 17p11p13, the MCR contained several genes, and TP53 was deleted in 2 cases. The MDM2 gene, a member of the p16INK4A-ARF-p53 pathway, was amplified and overexpressed in one case. Among the other unique CNAs, gains harbored CCND1, KIT/PDGFRA/KDR and JAK2. In the third part of this thesis, high-resolution 244K aCGH was conducted on 60 frozen lung adenocarcinomas (AD) from never-smoker patients in order to establish a catalog of CNAs. In 50/60 tumors, fourteen new MCRs of gain or loss were noted. One larger MCR of gain contained NSD1. One focal amplification and nine gains contained FUS. NSD1 and FUS are oncogenes hitherto not known to be associated with lung cancer. FUS was over-expressed in 10 tumors with gain of 16p11.2 compared to 30 tumors without that gain. A FUS hsr was observed with FISH screening. 
Other cancer genes present in aberrations included ARNT, BCL9, CDK4, p15INK4B, EGFR, ERBB2, MDM2, MDM4, MET, MYC, NKX2-1 and KRAS.
|
266 |
Métabolisme secondaire de Streptomyces ambofaciens : exploration génomique et étude du groupe de gènes dirigeant la synthèse du sphydrofurane / Secondary metabolism of Streptomyces ambofaciens : genome mining and study of the gene cluster involved in sphydrofuran biosynthesis
Haas, Drago 10 April 2015
Les bactéries du genre Streptomyces produisent de nombreux métabolites secondaires, dont certains possèdent des propriétés intéressantes en agriculture et en pharmaceutique. Avec le développement de la génomique, de nombreux outils bioinformatiques de recherche de groupes de gènes du métabolisme secondaire ont été développés au cours de la dernière décennie pour explorer les génomes. Ces outils sont basés sur la recherche de similarité de séquences et de ce fait, les clusters atypiques, constitués de gènes non caractérisés, ne peuvent être détectés par ces approches. L'isolement de tels clusters nécessite donc la mise en œuvre de nouvelles stratégies. La comparaison d’espèces d'Actinomycetes proches a révélé que les îlots génomiques, régions présentes dans un seul génome, sont très souvent enrichis en gènes du métabolisme secondaire. Nous avons participé (en collaboration avec les équipes d’Olivier Lespinet et de Pierre Leblond et Bertrand Aigle) au développement d’un outil, Break Viewer, permettant de localiser les îlots génomiques en comparant des génomes proches de Streptomyces. Cet outil a permis l'identification d'un îlot non détecté par les approches classiques, îlot dont l'étude a montré qu'il contenait un groupe de gènes du métabolisme secondaire. L’étude de ce groupe de gènes a montré qu'il dirige la synthèse de trois composés, le produit majoritaire étant le sphydrofurane. Une analyse fonctionnelle du cluster sphydrofurane a permis de déterminer les gènes impliqués dans la biosynthèse et la régulation de la biosynthèse du sphydrofurane et de proposer un modèle préliminaire pour la biosynthèse de ce métabolite. / Streptomyces are soil-dwelling bacteria that produce numerous secondary metabolites, some of which have interesting properties in agriculture and pharmaceuticals. With the development of genomics, many bioinformatics tools to search genomes for secondary metabolism gene clusters have been developed over the last decade. 
These tools are based on sequence similarity searches and therefore cannot detect atypical clusters consisting of uncharacterized genes. The isolation of such atypical clusters therefore requires the implementation of new strategies. Comparing closely related Actinomycetes species revealed that genomic islands (regions that are present in one genome only) are often enriched in secondary metabolite genes. We participated (in collaboration with the team of Olivier Lespinet and the team of Pierre Leblond and Bertrand Aigle) in the development of a new tool, Break Viewer, to locate genomic islands by comparing the genomes of closely related Streptomyces. This tool allowed the identification of an island that was undetected by conventional approaches and that, on further study, proved to contain a secondary metabolism gene cluster. The study of this cluster has shown that it directs the synthesis of three compounds, the major product being sphydrofuran. A functional analysis of the sphydrofuran gene cluster allowed us to identify the genes involved in the biosynthesis and regulation of sphydrofuran and to propose a preliminary model for the biosynthesis of this metabolite.
|
267 |
The genomic health of human pluripotent stem cells
Henry, Marianne Patricia January 2018
Human pluripotent stem cells are increasingly used for cell-based regenerative therapies worldwide, with embryonic and induced pluripotent stem cells serving as potential treatments for a range of debilitating and chronic conditions. However, given the level of chromosomal aneuploidies the cells may generate in culture, their safety for therapeutic use could be in question. This study aimed to develop sensitive and high-throughput assays for the detection and quantification of human pluripotent stem cell aneuploidies, to assess any changes in their positioning in nuclei, and to investigate the possible roles of lamins in the accumulation of aneuploidies. Using Droplet Digital PCR™, we optimised the detection of aneuploid cells in a predominantly diploid background. An assay was established for the sensitive detection of as little as 1% mosaicism and was used to monitor low-level chromosome copy number changes across different cell lines, conditions and passages in the human pluripotent stem cells. In addition, fluorescence in-situ hybridisation was used to map the genes ALB and AMELX on chromosomes 4 and X, respectively, in karyotype-stable chromosome X aneuploid lymphoblastoid cell lines. Our results demonstrated significant alterations in gene loci positioning in the chromosome X aneuploid cell lines. Using the same established method, the positioning of ALB and AMELX was monitored, alongside genomic instability with ddPCR™, in the different human pluripotent stem cell lines, conditions and passages. We demonstrated a highly plastic nuclear organisation in the pluripotent stem cells, with many changes occurring within a single passage. Furthermore, these results were not exclusive to a single cell line or condition, regardless of the presence or absence of feeder cells and of passage number, and the flexibility of the chromatin organisation remained throughout the duration of the study. 
We demonstrated high levels of genomic instability, with recurrent gains and losses in the AMELX copy number in the human embryonic stem cells during the course of our study; however, no significant changes in gene loci positioning resulting from these abnormalities were observed. Additionally, we observed reduced levels of lamin B2 in the aneuploid lymphoblastoid cell lines and complete loss in some hPSC samples. Our results support recent findings that suggest a link between lamin B2 loss and the formation of chromosome aneuploidies in cell culture. In conclusion, our data demonstrate several key novel findings. Firstly, we have established a sensitive technique for the detection of as little as 1% mosaicism, which to our knowledge is the most sensitive assay currently available. Secondly, we showed significant changes in gene loci positioning between aneuploid and diploid cell lines. Thirdly, utilising our novel ddPCR™ assay, we demonstrated the karyotypical instability of hPSCs, with consistent gains and/or losses of gene copy numbers in a short period of time in culture. When studying the effects of different growth conditions, we showed that the karyotypical instability was not exclusive to a single condition or a combination of conditions and, what is more, that the karyotypical abnormalities detected did not significantly change the gene positioning of hPSCs, with the genome organisation remaining plastic. Finally, our results support a potential association between lamin B2 loss and karyotypical instability. We conclude that more sensitive and robust techniques need to be readily used by clinicians for the screening of potential therapeutic hPSCs.
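To give a sense of what detecting mosaicism at the 1% level demands of a droplet digital PCR assay, the sketch below applies the standard Poisson correction to droplet counts and converts the target/reference ratio into a copy number. It is a generic illustration rather than the thesis's assay: the droplet counts are hypothetical, the reference locus is assumed to be present at two copies per cell, and the mosaic model assumes a simple mixture of two-copy and three-copy cells.

```python
import math


def copies_per_droplet(positive, total):
    """Mean template copies per droplet from the fraction of positive droplets."""
    if not 0 < positive < total:
        raise ValueError("need 0 < positive < total droplets")
    # Poisson correction: lambda = -ln(fraction of negative droplets).
    return -math.log(1.0 - positive / total)


def copy_number(target_pos, target_total, ref_pos, ref_total, ref_copies=2):
    """Target copy number per cell relative to a reference locus."""
    target = copies_per_droplet(target_pos, target_total)
    reference = copies_per_droplet(ref_pos, ref_total)
    return ref_copies * target / reference


def trisomic_fraction(cn):
    """Fraction of three-copy cells in a mixture with two-copy cells."""
    # Mean copy number of the mixture is 2 + f, so f = cn - 2.
    return cn - 2.0


if __name__ == "__main__":
    # Hypothetical droplet counts for a target and a two-copy reference locus.
    cn = copy_number(5200, 20000, 5000, 20000)
    print(cn, trisomic_fraction(cn))
```

Under this mixture model, 1% trisomic cells shift the mean copy number only from 2.00 to 2.01, which illustrates why large droplet numbers and careful replication are needed to resolve such a difference.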
|
268 |
Parametric and semi-parametric models for predicting genomic breeding values of complex traits in Nelore cattle / Modelos estatísticos paramétricos e semiparamétricos para a predição de valores genéticos genômicos de características complexas em bovinos da raça Nelore
Espigolan, Rafael [UNESP] 23 February 2017
Previous issue date: 2017-02-23 / Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) / O melhoramento genético animal visa melhorar a produtividade econômica das futuras gerações de espécies domésticas por meio da seleção. A maioria das características de interesse econômico na pecuária é de expressão quantitativa e complexa, isto é, são influenciadas por vários genes e afetadas por fatores ambientais. As análises estatísticas de informações de fenótipo e pedigree permite estimar os valores genéticos dos candidatos à seleção com base no modelo infinitesimal. Uma grande quantidade de dados genômicos está atualmente disponível para a identificação e seleção de indivíduos geneticamente superiores com o potencial de aumentar a acurácia de predição dos valores genéticos e, portanto, a eficiência dos programas de melhoramento genético animal. Vários estudos têm sido conduzidos com o objetivo de identificar metodologias apropriadas para raças e características específicas, o que resultará em estimativas de valores genéticos genômicos (GEBVs) mais acurados. Portanto, o objetivo deste estudo foi verificar a possibilidade de aplicação de modelos semiparamétricos para a seleção genômica e comparar a habilidade de predição com os modelos paramétricos para dados reais (características de carcaça, qualidade da carne, crescimento e reprodutiva) e simulados. As informações fenotípicas e de pedigree utilizadas foram fornecidas por onze fazendas pertencentes a quatro programas de melhoramento genético animal. Para as características de carcaça e qualidade da carne, o banco de dados continha 3.643 registros para área de olho de lombo (REA), 3.619 registros para espessura de gordura (BFT), 3.670 registros para maciez da carne (TEN) e 3.378 observações para peso de carcaça quente (HCW). Um total de 825.364 registros para peso ao sobreano (YW) e 166.398 para idade ao primeiro parto (AFC) foi utilizado para as características de crescimento e reprodutiva. 
Genótipos de 2.710, 2.656, 2.749, 2.495, 4.455 e 1.760 animais para REA, BFT, TEN, HCW, YW e AFC foram disponibilizados, respectivamente. Após o controle de qualidade, restaram dados de, aproximadamente, 450.000 polimorfismos de base única (SNP). Os modelos de análise utilizados foram BLUP genômico (GBLUP), single-step GBLUP (ssGBLUP), Bayesian LASSO (BL) e as abordagens semiparamétricas Reproducing Kernel Hilbert Spaces (RKHS) e Kernel Averaging (KA). Para cada característica foi realizada uma validação cruzada composta por cinco “folds” e replicada aleatoriamente trinta vezes. Os modelos estatísticos foram comparados em termos do erro do quadrado médio (MSE) e acurácia de predição (ACC). Os valores de ACC variaram de 0,39 a 0,40 (REA), 0,38 a 0,41 (BFT), 0,23 a 0,28 (TEN), 0,33 a 0,35 (HCW), 0,36 a 0,51 (YW) e 0,49 a 0,56 (AFC). Para todas as características, os modelos GBLUP e BL apresentaram acurácias de predição similares. Para REA, BFT e HCW, todos os modelos apresentaram ACC similares, entretanto a regressão RKHS obteve o melhor ajuste comparado ao KA. Para características com maior quantidade de registros fenotípicos comparada ao número de animais genotipados (YW e AFC) o modelo ssGBLUP é indicado. Considerando o desempenho geral, para todas as características estudadas, a regressão RKHS é, particularmente, uma alternativa interessante para a aplicação na seleção genômica, especialmente para características de baixa herdabilidade. No estudo de simulação, genótipos, pedigree e fenótipos para quatro características (A, B, C e D) foram simulados utilizando valores de herdabilidade baseados nos obtidos com os dados reais (0,09, 0,12, 0,36 e 0,39 para cada característica, respectivamente). O genoma simulado consistiu de 735.293 marcadores e 1.000 QTLs distribuídos aleatoriamente por 29 pares de autossomos, com comprimento variando de 40 a 146 centimorgans (cM), totalizando 2.333 cM. Assumiu-se que os QTLs explicavam 100% da variação genética. 
Considerando as frequências do alelo menor maiores ou iguais a 0,01, um total de 430.000 marcadores foram selecionados aleatoriamente. Os fenótipos foram obtidos pela soma dos resíduos (aleatoriamente amostrados de uma distribuição normal com média igual a zero) aos valores genéticos verdadeiros, e todo o processo de simulação foi replicado 10 vezes. A ACC foi calculada por meio da correlação entre o valor genético genômico estimado e o valor genético verdadeiro, simulados da 12a a 15a geração. A média do desequilíbrio de ligação, medido entre os pares de marcadores adjacentes para todas as características simuladas foi de 0,21 para as gerações recentes (12a, 13a e 14a), e 0,22 para a 15a geração. A ACC para as características simuladas A, B, C e D variou de 0,43 a 0,44, 0,47 a 0,48, 0,80 a 0,82 e 0,72 a 0,73, respectivamente. Diferentes metodologias de seleção genômica implementadas neste estudo mostraram valores similares de acurácia de predição, e o método mais adequado é dependente da característica explorada. Em geral, as regressões RKHS obtiveram melhor desempenho em termos de ACC com menor valor de MSE em comparação com os outros modelos. / Animal breeding aims to improve the economic productivity of future generations of domestic species through selection. Most traits of economic interest in livestock have a complex, quantitative expression, i.e., they are influenced by a large number of genes and affected by environmental factors. Statistical analysis of phenotype and pedigree information allows the breeding values of selection candidates to be estimated on the basis of the infinitesimal model. A large amount of genomic data is now available for the identification and selection of genetically superior individuals, with the potential to increase the accuracy of prediction of genetic values and thus the efficiency of animal breeding programs. 
Numerous studies have been conducted to identify appropriate methodologies for specific breeds and traits, which will result in more accurate genomic estimated breeding values (GEBVs). Therefore, the objective of this study was to verify the possibility of applying semi-parametric models for genomic selection and to compare their predictive ability with that of parametric models for real (carcass, meat quality, growth and reproductive traits) and simulated data. The phenotypic and pedigree information used was provided by eleven farms belonging to four animal breeding programs. For carcass and meat quality traits, the data set contained 3,643 records for rib eye area (REA), 3,619 records for backfat thickness (BFT), 3,670 records for meat tenderness (TEN) and 3,378 observations for hot carcass weight (HCW). A total of 825,364 records for yearling weight (YW) and 166,398 for age at first calving (AFC) were used as growth and reproductive traits of Nelore cattle. Genotypes of 2,710, 2,656, 2,749, 2,495, 4,455 and 1,760 animals were available for REA, BFT, TEN, HCW, YW and AFC, respectively. After quality control, approximately 450,000 single nucleotide polymorphisms (SNP) remained. Methods of analysis were genomic BLUP (GBLUP), single-step GBLUP (ssGBLUP), Bayesian LASSO (BL) and the semi-parametric approaches Reproducing Kernel Hilbert Spaces (RKHS) regression and Kernel Averaging (KA). For each trait, a five-fold cross-validation with thirty random replicates was carried out, and models were compared in terms of their prediction mean squared error (MSE) and accuracy of prediction (ACC). The ACC ranged from 0.39 to 0.40 (REA), 0.38 to 0.41 (BFT), 0.23 to 0.28 (TEN), 0.33 to 0.35 (HCW), 0.36 to 0.51 (YW) and 0.49 to 0.56 (AFC). For all traits, the GBLUP and BL models showed very similar prediction accuracies. 
For REA, BFT and HCW, all models provided similar prediction accuracies; however, RKHS regression had the best fit across traits among the multiple-step models and compared to KA. For traits which have a higher number of animals with phenotypes than with genotypes (YW and AFC), ssGBLUP is indicated. Judged by overall performance across all traits, RKHS regression is particularly appealing for application in genomic selection, especially for low-heritability traits. Simulated genotypes, pedigree and phenotypes for four traits (A, B, C and D) were obtained using heritabilities based on the real data (0.09, 0.12, 0.36 and 0.39 for each trait, respectively). The simulated genome consisted of 735,293 markers and 1,000 QTLs randomly distributed over 29 pairs of autosomes, with lengths varying from 40 to 146 centimorgans (cM), totaling 2,333 cM. It was assumed that QTLs explained 100% of the genetic variance. Considering minor allele frequencies greater than or equal to 0.01, a total of 430,000 markers were randomly selected. The phenotypes were generated by adding residuals, randomly drawn from a normal distribution with mean equal to zero, to the true breeding values, and the whole simulation process was replicated 10 times. ACC was quantified as the correlation between the predicted genomic breeding values and the true breeding values simulated for generations 12 to 15. The average linkage disequilibrium, measured between pairs of adjacent markers for all simulated traits, was 0.21 for recent generations (12, 13 and 14) and 0.22 for generation 15. The ACC for simulated traits A, B, C and D ranged from 0.43 to 0.44, 0.47 to 0.48, 0.80 to 0.82 and 0.72 to 0.73, respectively. The different genomic selection methodologies implemented in this study showed similar accuracies of prediction, and the optimal method was sometimes trait dependent. In general, RKHS regressions were preferable in terms of ACC and provided the smallest MSE estimates compared to the other models. 
/ FAPESP: 2014/00779-0 / FAPESP: 2015/13084-3
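The validation scheme summarised in the abstract above (five-fold cross-validation repeated over thirty random partitions, with models compared by MSE and ACC) can be sketched as follows. This is a minimal sketch under stated assumptions: ACC is taken here as the Pearson correlation between observed and predicted values in the held-out fold, which the abstract does not specify for the real data, and `simple_regression` is a hypothetical one-predictor stand-in, not GBLUP, BL, RKHS or KA.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mse(observed, predicted):
    """Mean squared prediction error."""
    return mean((o - p) ** 2 for o, p in zip(observed, predicted))


def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def replicated_cv(x, y, fit_predict, k=5, replicates=30, seed=1):
    """Average ACC and MSE over `replicates` random k-fold partitions."""
    rng = random.Random(seed)
    accs, errors = [], []
    for _ in range(replicates):
        for test in k_fold_indices(len(y), k, rng):
            held_out = set(test)
            train = [i for i in range(len(y)) if i not in held_out]
            predicted = fit_predict(
                [x[i] for i in train], [y[i] for i in train], [x[i] for i in test]
            )
            observed = [y[i] for i in test]
            accs.append(pearson(observed, predicted))
            errors.append(mse(observed, predicted))
    return mean(accs), mean(errors)


def simple_regression(x_train, y_train, x_test):
    """Hypothetical placeholder model: least-squares line on one predictor."""
    mx, my = mean(x_train), mean(y_train)
    slope = sum((a - mx) * (b - my) for a, b in zip(x_train, y_train)) / sum(
        (a - mx) ** 2 for a in x_train
    )
    return [my + slope * (a - mx) for a in x_test]
```

Any model can be compared under the same partitions by passing a different `fit_predict` callable with the same three arguments and reusing the same `seed`.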
|
269 |
Estudo do desequilíbrio de ligação e estimativa do tamanho efetivo em uma população da raça gir selecionada para crescimento pós-desmama / Linkage disequilibrium and effective size on population of gir zebu breed selected for post-weaning weights
Toro Ospina, Alejandra Maria [UNESP] 24 February 2017
Previous issue date: 2017-02-24 / O objetivo deste estudo foi estimar o desequilíbrio de ligação (r2) nas distâncias de 25-50kb, 50-100kb, 100-500kb, 0,5-1Mb e o tamanho efetivo (Ne) nas gerações 0, 5, 10, 15, 20 em população da raça Gir selecionada para crescimento pós-desmama. Os animais utilizados no presente estudo foram provenientes do rebanho fechado do Instituto de Zootecnia, Sertãozinho, SP. Foram obtidos os genótipos de 155 animais com o painel BovineDL 33kb e 18 com painel HD imputado onde realizou-se controle de qualidade (CQ) para alelo de menor frequência (MAF) < 0,02 e call rate < 0,1. Depois do CQ permaneceram 27.236 SNPs e 155 animais do painel de 33 kb e 732.962 SNPs e 173 animais do painel HD Imputado. As análises de r2 foram realizadas pelo programa Plink e programa estatístico R Studio e o Ne por meio do DL. Os resultados das distâncias 25-50kb, 50-100kb, 100-500kb e 0,5-1Mb do r2 para o painel 33kb foram iguais a 0,29, 0,25, 0,16 e 0,032 respectivamente, e 0,35, 0,29, 0,18, 0,032 para o painel HD imputado demostrando que o DL permaneceu nas distâncias menores a 100kb, decaindo com o aumento das distâncias. Estes resultados foram maiores aos descritos na literatura para animais zebuínos, sugerindo como causa os segmentos longos de haplótipos que compartilham os animais aparentados. O Ne foi igual a 9, 17, 24, 30 e 30 animais nas gerações 0, 5, 10, 15, 20, observa-se que o Ne é maior na geração 20, com 30 animais, e decai drasticamente a partir da 5 geração com 17 animais, e sendo de 9 animais a última geração, um tamanho pequeno para uma população. Os valores encontrados neste estudo mostram alto DL e baixo Ne, provavelmente pelo sistema de seleção e a estrutura da população da raça Gir avaliada, que apresenta alto nível de endogamia, perda da variabilidade genética, uso intensivo de pequeno número de reprodutores, conduzindo a diminuição da deriva genética da população, ocasionando dificuldades na seleção dos animais. 
/ The aim of this study was to estimate linkage disequilibrium (r2) at distances of 25-50 kb, 50-100 kb, 100-500 kb and 0.5-1 Mb, and the effective population size (Ne) at generations 0, 5, 10, 15 and 20, in a population of the Gir breed selected for post-weaning growth. The animals used in this study came from the closed herd of the Instituto de Zootecnia (Animal Science Institute), Sertãozinho, SP. Genotypes were obtained for 155 animals with the BovineDL 33kb panel and for 18 animals with the imputed HD panel, and quality control (QC) was applied for minor allele frequency (MAF) < 0.02 and call rate < 0.1. After QC, 27,236 SNPs and 155 animals remained for the 33 kb panel, and 732,962 SNPs and 173 animals for the imputed HD panel. The r2 analyses were performed with the Plink program and the R Studio statistical program, and Ne was estimated from the LD. The r2 values for distances of 25-50 kb, 50-100 kb, 100-500 kb and 0.5-1 Mb were 0.29, 0.25, 0.16 and 0.032, respectively, for the 33 kb panel, and 0.35, 0.29, 0.18 and 0.032 for the imputed HD panel, showing that LD persisted at distances below 100 kb and decayed as distance increased. These values were higher than those reported in the literature for Zebu animals, which suggests that the cause is the long haplotype segments shared by related animals. Ne was 9, 17, 24, 30 and 30 animals at generations 0, 5, 10, 15 and 20, respectively: Ne was highest at generation 20, with 30 animals, declined sharply from generation 5, with 17 animals, and reached 9 animals in the most recent generation, a small size for a population. The high LD and low Ne found in this study are probably explained by the selection system and the structure of the Gir population evaluated, which shows a high level of inbreeding, loss of genetic variability and intensive use of a small number of sires, leading to decreased genetic drift in the population and causing difficulties in the selection of animals in the next generations.
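To make the two quantities in this abstract concrete, the sketch below computes r2 between two SNPs as the squared Pearson correlation of allele counts (genotypes coded 0, 1, 2) and converts an average r2 at a given recombination distance into an effective population size using Sved's approximation, E[r2] = 1 / (1 + 4 Ne c). Both choices are assumptions made for illustration: the abstract states only that r2 came from Plink and R and that Ne was derived from LD, not which estimator or formula was used, and no sample-size correction is applied here.

```python
from statistics import mean


def genotype_r2(snp_a, snp_b):
    """Squared correlation between allele counts (0/1/2) at two SNPs."""
    ma, mb = mean(snp_a), mean(snp_b)
    cov = sum((a - ma) * (b - mb) for a, b in zip(snp_a, snp_b))
    var_a = sum((a - ma) ** 2 for a in snp_a)
    var_b = sum((b - mb) ** 2 for b in snp_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


def sved_ne(r2, c_morgans):
    """Effective size implied by E[r2] = 1 / (1 + 4 * Ne * c) (Sved's approximation).

    `c_morgans` is the recombination distance between markers in Morgans;
    converting physical distance (kb) to Morgans needs a map assumption
    that is not given in the abstract.
    """
    if not 0 < r2 <= 1 or c_morgans <= 0:
        raise ValueError("need 0 < r2 <= 1 and c_morgans > 0")
    return (1.0 / r2 - 1.0) / (4.0 * c_morgans)
```

Because r2 at short distances reflects older recombination events, applying the conversion to successive distance bins is one way such studies obtain Ne estimates for different past generations.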
|
270 |
Interaction entre la bactérie endosymbiotique Wolbachia et les moustiques du complexe Culex pipiens : Des génomes bactériens à la structuration des populations d’hôtes / Interaction between the endosymbiotic bacteria Wolbachia and mosquitoes of the Culex pipiens complex : from bacterial genomes to host population’s structuring
Dumas, Emilie 11 December 2013
Wolbachia est une bactérie endosymbiotique, intracellulaire et exclusivement transmise maternellement qui infecterait au moins 106 espèces d'insectes. Wolbachia manipule fréquemment la reproduction de ses hôte à son avantage, notamment en induisant une forme de stérilité conditionnelle appelée incompatibilité cytoplasmique (IC). Chez les moustiques du complexe Culex pipiens, une grande diversité de souches de Wolbachia et de types d'IC a été précédemment identifiée, mais plusieurs aspects de la biologie de cette association restaient peu connus. Les travaux présentés dans cette thèse ont notamment permis de caractériser (i) l'impact de Wolbachia sur la structuration génétique des populations hôtes et (ii) la diversité des souches de Wolbachia et, plus précisément d'appréhender le mécanisme de l'IC. Par un suivi de populations naturelles, nous avons mis en évidence que Wolbachia induisait une forte structuration de la diversité mitochondriale, mais aussi qu'elle participait à des événements répétés d'introgression cytoplasmique entre les différents membres du complexe Cx. pipiens. Nous avons également mené une étude de génomique comparative basée sur le séquençage de quatre génomes complets de Wolbachia très proches phylogénétiquement. Pour cela, nous avons mis en place une série d'analyses approfondies utilisant un large panel d'outils bioinformatiques couplés à des vérifications moléculaires. Nous avons montré qu'il existait peu de polymorphisme entre les groupes de Wolbachia infectant Cx. pipiens. De plus, ces études nous ont permis de mettre en évidence des gènes candidats qui pourraient être directement impliqués dans le mécanisme de l'IC. / Wolbachia is an intracellular bacterial symbiont, exclusively maternally inherited, infecting at least 106 species of insects. Wolbachia commonly manipulates insect reproduction to its own advantage, as well illustrated by a phenomenon of conditional sterility called cytoplasmic incompatibility (CI). 
In mosquitoes of the Culex pipiens complex, a great diversity of Wolbachia strains and of CI types was previously identified, but several aspects of the biology of this symbiotic association remained poorly known. The aim of the studies presented in this thesis is to characterize (i) the impact of Wolbachia on the host genetic structure and (ii) the diversity of Wolbachia strains, in order to attempt to identify the molecular basis of CI. By surveying natural populations, we showed that Wolbachia strongly structures mitochondrial diversity, and is also associated with repeated events of cytoplasmic introgression between members of the Cx. pipiens complex. We also conducted a comparative genomics study based on the sequencing of four complete genomes of very closely related Wolbachia strains. For that purpose, we performed a series of in-depth analyses using a wide panel of bioinformatic tools coupled with molecular validations. We showed that there is little polymorphism between the groups of Wolbachia infecting Cx. pipiens. These studies also allowed us to highlight promising candidate genes which could be directly involved in the CI mechanism.
|