  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
511

SEARCHING THE EDGES OF THE PROTEIN UNIVERSE USING DATA SCIENCE

Mengmeng Zhu (8775917) 30 April 2020
<p>Data science uses the latest techniques in statistics and machine learning to extract insights from data. With the increasing amount of protein data, a number of novel research approaches have become feasible.</p><p>Micropeptides are an emerging area of study in the protein universe. They are small proteins with <= 100 amino acid residues (aa) and are translated from small open reading frames (sORFs) of <= 303 base pairs (bp). Traditionally, their existence was ignored because of the technical difficulties in isolating them. With technological advances, a growing number of micropeptides have been characterized and shown to play vital roles in many biological processes. Yet we lack bioinformatics methods for predicting them directly from DNA sequences, which could substantially facilitate research in this field at minimal cost. With the increasing amount of data, developing new methods to address this need has become possible. We therefore curated a high-quality dataset and used it to train MiPepid, a machine-learning-based method specifically designed for predicting micropeptides from DNA sequences, using logistic regression with 4-mer features. MiPepid performed exceptionally well on holdout test sets and much better than existing methods. MiPepid is available for download, easy to use, and runs sufficiently fast.</p><p>Long noncoding RNAs (lncRNAs) are transcripts of > 200 bp that do not encode a protein. Contrary to their “noncoding” definition, an increasing number of lncRNAs have been found to be translated into functional micropeptides. Therefore, whether most lncRNAs are translated is an open question of great significance. 
To address this question, we harnessed large-scale human variation data to explore the relationships between lncRNAs, micropeptides, and canonical proteins (> 100 aa) from the perspective of genetic variation, which has long been used to study natural selection and infer functional relevance. Through rigorous statistical analyses, we find that lncRNAs share a similar genetic variation profile with proteins regarding single nucleotide polymorphism (SNP) density, SNP spectrum, enrichment of rare SNPs, etc., suggesting that lncRNAs are under negative selection of similar strength to proteins. Our study revealed similarities between micropeptides, lncRNAs, and canonical proteins and is the first attempt to explore the relationships between the three groups from a genetic variation perspective.</p><p>Deep learning has been tremendously successful in 2D image recognition. Predicting the ligands that a protein binds is a fundamental topic in protein research, as most proteins bind ligands to function. Proteins are 3D structures and can be considered 3D images, so predicting their binding ligands can be cast as a 3D image classification problem. In addition, a large amount of protein structure data is now available. We therefore used deep learning to predict protein binding ligands by designing a 3D convolutional neural network from scratch and by building a large 3D image dataset of protein structures. The trained model achieved an average F1 score of over 0.8 across 151 classes on the holdout test set and performed better than existing methods. In summary, we showed the feasibility of deploying deep learning in protein structure research.</p><p>In conclusion, by exploring various edges of the protein universe from the perspective of data science, we showed that the increasing amount of data and the advancement of data science methods have made it possible to address a wide variety of pressing biological questions. 
We showed that a successful data science study requires all three components: goal, data, and method. We presented three such studies: careful data cleaning and selection of a machine learning algorithm led to the development of MiPepid, which meets the urgent need for a micropeptide prediction method; identifying the question and exploring it from a different angle led to the key insight that lncRNAs resemble micropeptides; and applying deep learning to protein structure data led to a new approach to the long-standing question of protein-ligand binding. The three studies serve as examples of solving a wide range of data science problems with a variety of issues.</p>
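The abstract above names 4-mer features as the input to MiPepid's logistic regression but does not say how they are computed. The following is a minimal sketch of one common way to do it, assuming overlapping 4-mers normalized to frequencies; the function name `four_mer_features`, the normalization, and the handling of ambiguous bases are illustrative assumptions, not the thesis's actual implementation.

```python
from itertools import product

# All 256 possible 4-mers over the DNA alphabet, in a fixed order,
# so that every sequence maps to a vector of the same length.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]
KMER_INDEX = {kmer: i for i, kmer in enumerate(KMERS)}


def four_mer_features(seq):
    """Return a 256-long vector of overlapping 4-mer frequencies.

    Assumption: windows containing symbols other than A, C, G, T
    (for example N) are skipped rather than counted.
    """
    seq = seq.upper()
    counts = [0] * len(KMERS)
    total = 0
    for i in range(len(seq) - 3):
        idx = KMER_INDEX.get(seq[i:i + 4])
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        # Sequence too short or fully ambiguous: return an all-zero vector.
        return [0.0] * len(KMERS)
    return [c / total for c in counts]


# A toy sORF-like sequence (hypothetical, for illustration only).
features = four_mer_features("ATGGCCATGTAA")
```

A vector like `features` could then be passed to any logistic regression implementation; the fixed k-mer ordering is what keeps the feature columns aligned across sequences of different lengths.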
512

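Développement de méthodes bio-informatiques pour la découverte de variants codants et non codants dans le cadre des traits sanguins / Development of bioinformatics methods for the discovery of coding and non-coding variants in the context of blood traits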

Méric de Bellefon, Sébastian 04 1900
La santé cardiovasculaire, la fonction immunitaire, l'hémostase et la réponse à d'autres maladies dépendent de l'abondance et des caractéristiques spécifiques des cellules sanguines. Au fil des années, un effort considérable a été fait pour trouver les variants génétiques, les gènes et les mécanismes de régulation impliqués dans la création de ces cellules. L'inactivation d'un allèle, appelée "perte de fonction" (LoF), est un type de variant codant que nous aimerions associer aux phénotypes sanguins. Comme ces mutations ne peuvent pas être artificiellement induites chez l'humain, pour des raisons éthiques évidentes, nous observons les occurrences naturelles de ces pertes de fonction et espérons que la taille des cohortes sera suffisante pour trouver des associations statistiquement significatives. L'inactivation des deux allèles, appelée "knockout" (KO), peut avoir des conséquences plus fortes qu'une simple perte de fonction. Nous espérons également trouver des KO d'origine naturelle grâce à la taille des cohortes. La combinaison de deux variants LoF différents sur les deux allèles est appelée knockout hétérozygote composé. Nous nous intéressons également aux variants non codants qui affectent l'expression des gènes impliqués dans l'hématopoïèse. Certains de ces variants créent ou perturbent des sites de liaison des facteurs de transcription (TF), ces protéines qui se lient à des séquences d'ADN spécifiques et régulent l'expression des gènes. Les sites de liaison (TFBS) des facteurs de transcription se trouvent dans les promoteurs des gènes et dans les amplificateurs spécifiques au type cellulaire. Alors que certaines de ces mutations peuvent être bénignes ou même bénéfiques, la présence d'un LoF ou d'un KO peut être trop nuisible à la survie de l'individu. Les résultats de cette étude sont limités par le biais de survie. 
Comparée à une étude d'association pangénomique, cette étude se concentre sur un plus petit nombre de variants génétiques pour augmenter la puissance statistique et offrir une interprétation pour les résultats statistiquement significatifs. Le programme Trans-Omics for Precision Medicine (TOPMed) recueille et garantit la qualité des 45 000 séquences du génome entier que nous avons utilisées dans cette étude, ainsi que les bilans sanguins correspondants. Grâce à ces données, nous avons pu trouver plusieurs associations connues et nouvelles entre des variants rares et des phénotypes sanguins. / Cardiovascular health, immune function, hemostasis and the response to other illnesses depend on the abundance and specific features of blood cells. Over the years, considerable effort has been made to find which genetic variants, genes and regulatory mechanisms are involved in the creation of these cells. The inactivation of an allele, called a loss-of-function (LoF), is a type of coding variant we would like to associate with blood phenotypes. For obvious ethical reasons, these mutations cannot be artificially induced in humans, so we fall back on natural occurrences and hope that large cohorts will provide enough samples to find statistically significant associations. The inactivation of both alleles, called a knockout (KO), may have stronger consequences than a simple loss-of-function. We also hope to find naturally occurring knockouts thanks to the size of the cohorts. The combination of two different LoF variants on the two alleles is called a compound heterozygous knockout. We are also interested in non-coding variants that affect the expression of genes involved in hematopoiesis. Some of these variants create or disrupt the binding sites of transcription factors (TF), the proteins that bind to specific DNA sequences and regulate gene expression. Transcription factor binding sites (TFBS) are found in gene promoters and cell-type-specific enhancers. 
While some of these mutations can be benign or even beneficial, the presence of a LoF or KO may be too detrimental for the individual to survive. The results of this study are limited by survival bias. Compared to a genome-wide association study, this study focuses on a smaller number of genetic variants to increase statistical power and give an interpretation to the statistically significant findings. The Trans-Omics for Precision Medicine (TOPMed) program collects and ensures the quality of the 45,000 whole-genome sequences we used in this study, as well as the corresponding complete blood counts. Thanks to this raw data, we were able to find several known and novel associations between rare variants and blood phenotypes.
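The distinction the abstract draws between a single loss-of-function, a knockout, and a compound heterozygous knockout can be sketched as a small classifier over one individual's LoF genotypes in one gene. This is an illustration only: it assumes phased genotypes (each variant's allele is known per haplotype), and the function name, input layout, and label strings are hypothetical rather than taken from the thesis.

```python
def classify_gene(lof_genotypes):
    """Classify one individual's LoF status for a single gene.

    lof_genotypes maps a variant identifier to a pair (hap1, hap2),
    where each entry is 1 if that haplotype carries the LoF allele
    and 0 otherwise. Phasing is assumed to be known.
    """
    hap1_hits = {v for v, (a, _) in lof_genotypes.items() if a == 1}
    hap2_hits = {v for v, (_, b) in lof_genotypes.items() if b == 1}

    if not hap1_hits and not hap2_hits:
        return "no LoF"
    if not hap1_hits or not hap2_hits:
        # Only one copy of the gene is inactivated, even if several
        # LoF variants sit on that same haplotype.
        return "heterozygous LoF"
    if hap1_hits & hap2_hits:
        # The same variant inactivates both copies.
        return "homozygous knockout"
    # Different variants inactivate the two copies.
    return "compound heterozygous knockout"


# Hypothetical variant identifiers, for illustration only.
status = classify_gene({"var_1": (1, 0), "var_2": (0, 1)})
```

Without phasing, two heterozygous LoF variants in the same gene cannot be told apart from two variants on the same haplotype, which is why the sketch takes phased input as a stated assumption.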
513

Eine Analyse ausgewählter genomischer Varianten im FIGF- und ACE2-Gen und deren Bedeutung in der molekularen Pathogenese intrakranieller Aneurysmen / An analysis of selected genomic variants in the FIGF and ACE2 genes and their significance in the molecular pathogenesis of intracranial aneurysms

Leonhardt, Mareike 22 September 2009
In this thesis, we examined selected polymorphisms of two genes for association with intracranial aneurysms (IA) in a European population. Both genes, FIGF and ACE2, are located on chromosome Xp22 and are therefore positional candidate genes. They are also of functional interest because they are involved primarily in vascular growth (FIGF) and blood pressure regulation (ACE2), processes that may play a part in the pathophysiology of IA formation. However, we found no significant association with IA for any of the nine polymorphisms analyzed. An analysis of possible intragenic and intergenic haplotypes across all variants examined likewise yielded no significant result.
514

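Investigation génétique de NAFLD dans le diabète de type 2 via construction d’un modèle de prédiction de la maladie et par criblage du locus PNPLA3-SAMM50 / Genetic investigation of NAFLD in type 2 diabetes through construction of a disease prediction model and screening of the PNPLA3-SAMM50 locus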

Attaoua, Redha 07 1900
La stéatose hépatique non-alcoolique (NAFLD) est une altération hépatique fréquente dans le diabète de type 2 (DT2) et est associée à diverses complications telles que la mortalité. L’établissement d’outils de prédiction non-invasifs de NAFLD est primordial. Mon projet de maîtrise avait pour objectif d’établir des marqueurs génétiques de NAFLD dans le DT2 via deux stratégies : 1) une sélection non-ciblée des marqueurs génétiques (SNPs) via la méthode LASSO et 2) une sélection ciblée de SNPs rapportés comme liés à la maladie ou à des altérations associées. Une population de 4098 patients avec DT2 d’origine caucasienne (ADVANCE) a été utilisée. Des données statistiques sommaires d’études pangénomiques ont été exploitées pour sélectionner, via LASSO, les marqueurs génétiques (SNPs) à inclure dans le score de risque polygénique (PRS). J’ai également développé un modèle de 3210 SNPs ajusté par des covariables capable de prédire les taux élevés de ALT (AUC=0,69) et la mortalité non-cardiovasculaire (AUC=0,66). Le criblage du locus candidat PNPLA3-SAMM50 a mis en avant une diversité des associations génétiques aux différentes altérations métaboliques comme les taux de ALT (substitut du diagnostic de NAFLD) (rs2294915, P = 1,83 × 10⁻⁷), à la mortalité non-cardiovasculaire (rs2294917, P = 3,9 × 10⁻⁴) et à l’efficacité de la thérapie intensive antidiabétique chez certains patients de la population (porteurs GG de rs16991236, P=0,007). Mes travaux ont permis de mieux comprendre le fond génétique de NAFLD dans le DT2 et laissent envisager l’établissement d’outils de diagnostic et de suivi de la maladie plus adéquats. / Non-alcoholic fatty liver disease (NAFLD) is a liver disorder that is frequent in type 2 diabetes (T2D) and is associated with complications, including mortality. For this reason, establishing non-invasive tools for predicting NAFLD is crucial. 
My master’s project aimed to establish genetic markers for NAFLD in T2D using two strategies: 1) a non-targeted selection of genetic markers (SNPs) by the LASSO method and 2) a targeted selection of SNPs reported as associated with the disease or its related abnormalities. A population of 4098 patients with T2D and Caucasian ancestry (ADVANCE) was used. Summary statistics from genome-wide studies were used to select, via LASSO, the SNPs to include in the polygenic risk score (PRS). I also developed a model of 3210 SNPs, adjusted for covariates, that predicts elevated ALT levels (AUC=0.69) and non-cardiovascular death (AUC=0.66). Screening of the candidate locus PNPLA3-SAMM50 revealed a diversity of genetic associations with metabolic abnormalities such as ALT levels (a surrogate for NAFLD diagnosis) (rs2294915, P = 1.83 × 10⁻⁷), non-cardiovascular death (rs2294917, P = 3.9 × 10⁻⁴) and the efficacy of intensive antidiabetic therapy in a subgroup of the population (GG carriers of rs16991236, P = 0.007). My work provides a better understanding of the genetic background of NAFLD in T2D and opens perspectives for establishing more adequate tools for diagnosis and follow-up of the disease.
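The polygenic risk score (PRS) mentioned in the abstract above is, in its usual form, a weighted sum of risk-allele dosages over the selected SNPs. The sketch below shows only that general form. The SNP names and weights are made up for illustration, the treatment of missing genotypes is an assumption, and the covariate adjustment described in the abstract is not modeled.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of allele dosages over the SNPs that carry a weight.

    dosages: SNP identifier -> risk-allele dosage (0, 1 or 2).
    weights: SNP identifier -> per-allele effect weight. With LASSO
             selection, SNPs whose weight shrank to zero would simply
             be absent from this mapping.
    Assumption: a SNP missing from `dosages` contributes nothing.
    """
    score = 0.0
    for snp, weight in weights.items():
        dosage = dosages.get(snp)
        if dosage is None:
            continue
        score += weight * dosage
    return score


# Hypothetical SNPs and weights, not values from the thesis.
example_weights = {"snp_a": 0.5, "snp_b": -0.25, "snp_c": 0.125}
example_dosages = {"snp_a": 2, "snp_b": 1}
score = polygenic_risk_score(example_dosages, example_weights)
```

In practice the resulting score would be one predictor among several in a model evaluated by AUC, as the abstract reports, but that evaluation step is outside this sketch.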
515

Genetické faktory ovlivňující průběh vybraných forem nefrotického syndromu / Genetic factors affecting course of selected forms of nephrotic syndrome

Šafaříková, Markéta January 2011
Nephrotic syndrome (NS) is characterized by proteinuria, hypoalbuminemia and edema. It occurs in primary and secondary glomerulopathies. This disease can be divided into two groups: primary (idiopathic) and secondary. Familial nephrotic syndrome is inherited in an autosomal dominant or autosomal recessive manner. Four genes are most important in the development of hereditary nephrotic syndrome in adult patients: ACTN4, CD2AP, NPHS2 and TRPC6. The ACTN4 gene, which encodes the protein α-actinin 4, is responsible for the autosomal dominant form of focal segmental glomerulosclerosis (FSGS), which is one of the primary glomerulopathies. α-Actinin 4 has also been studied in some types of carcinoma. In this diploma thesis, mutational analysis of the ACTN4 gene was performed on a set of 48 patients with nephrotic syndrome. High resolution melting (HRM) analysis and sequencing of selected samples were used for mutation detection. Many published and unpublished SNPs were found, along with one unpublished candidate mutation that could be causally associated with FSGS.
516

Narrating Scotland: in pursuit of a nation : A case study of nation and nationalism as utilized in the Scottish National Party

Berggren, Evelina January 2021
The nationalist party in Scotland, the Scottish National Party (SNP), has attracted attention over the years for its election successes as a party and as a movement using a modern type of nationalist approach. This thesis seeks an answer to the research question “How does the leader of the Scottish National Party depict the nation of Scotland?” to explore what nation of Scotland this modern nationalist party depicts. The answer lies in what is called “civic nationalism”, an approach void of ethnocentrism. The depiction revealed a nation of Scotland where anyone can belong, and where openness and inclusion in civic matters, ranging from democratic concerns, social issues, the economy, business and immigration to relations with the outside world, dominate the narrative. The overarching aim driving this approach is the vision of realizing Scotland’s “great potential” and its role as an equal partner on the world stage.
517

Genetics of Nutrient Consumption and an Evolutionary Perspective of Eating Disorders

Mayhew, Alexandra Jean 11 1900
Obesity prevalence continues to increase worldwide, yet few safe and effective treatment options are available, suggesting that greater emphasis is needed on preventing rather than treating obesity. This research investigated the association of obesity-predisposing SNPs and a gene score with nutrient consumption patterns, including total energy intake and macronutrient distribution, in a European ancestry population. It also discusses an evolutionary perspective on eating disorders, using current epidemiological evidence to identify genes that may be involved. The association of two of the 14 obesity-predisposing SNPs and the gene score with BMI was confirmed in the EpiDREAM population. Novel associations were found between two SNPs located in or near BDNF (rs6265 and rs1401635) and total fat, MUFA, and PUFA intake. Rs1401635 was also associated with total energy and trans fat intake. Novel associations of rs6235 (PCSK1) and the gene score were found with total energy intake. These novel associations indicate that food-related behaviours are one of the mechanisms of action through which obesity-predisposing SNPs cause obesity and therefore warrant further investigation. The lack of association across all genes and the modest association of the gene score show that mechanisms other than food consumption are important. The investigation of the evolutionary history of eating disorders revealed that the adapted to flee famine hypothesis is a plausible theory explaining anorexia nervosa, while the thrifty genotype hypothesis provides a possible explanation for bulimia nervosa and binge eating disorder. These evolutionary theories can be applied to identify new candidate genes as well as phenotypic traits to investigate in order to better understand the genetic architecture of eating disorders. Understanding genes associated with disordered eating patterns may highlight future areas for obesity prevention. 
/ Thesis / Master of Science (MSc) / A large percentage of the risk of developing obesity or an eating disorder (anorexia nervosa, bulimia nervosa, and binge eating disorder) is determined by genetics. For obesity, many genes have been identified as influencing risk, but the mechanisms through which the genes work are largely unknown. For eating disorders, gene identification efforts have been mostly unsuccessful and no mechanisms of action have been determined. In the first component of this thesis we found an association between previously identified obesity risk genes and food intake, specifically the total number of calories consumed per day and the percentage of calories from total fat and fat subtypes. These results support that food related behaviours are possible mechanisms of action which need to be further investigated. In the second half of the thesis we viewed eating disorder behaviours from an evolutionary perspective. We concluded that there are theories that possibly explain eating disorder behaviours including being able to live off of small quantities of food as well as binging. These evolutionary theories can be applied to identify new genes to study in the context of eating disorders as well as different definitions of eating disorders.
518

Developing saddleback and emperor tamarin SNP set for in situ genotyping

López Clinton, Samantha January 2022
Many countries in the global south - which harbour the majority of the world’s biodiversity - face serious resource limitations and a lack of access to affordable sequencing services. Furthermore, biodiversity research and monitoring of non-model, threatened and/or cryptic species often relies on low-quality non-invasive genetic samples. In situ conservation genomics approaches optimised for field conditions and low-quality DNA can help empower local researchers and meet their needs. To do so, however, accessible and reproducible sequencing and genotyping alternatives are needed. I designed a SNP panel as a field-friendly genotyping approach for two species of Amazonian primates using both high- and low-quality DNA samples, and two different sequencing platforms, Illumina and Nanopore. I used 14 high-quality genomes to identify a set of 210 SNPs that allow for identification of species (twelve SNPs), sex (twelve SNPs) and individual identity (186 SNPs) in two species of tamarins, Leontocebus weddelli and Saguinus imperator. Primers, adapters and indexes were designed in a Genotyping-in-Thousands by sequencing approach that is compatible with both sequencing platforms. This approach is based on sequencing multiplexed PCR products of a few hundred target SNPs to genotype thousands of individuals in a single sequencing run. In an effort to make conservation genomics more accessible, the reproducible pipeline to obtain the informative SNPs is being modulated with Snakemake, a workflow management system. / Muchos países en el sur global - los cuales poseen la mayoría de la biodiversidad mundial - enfrentan serias limitaciones de recursos y una falta de acceso a servicios económicos de secuenciación. Con frecuencia, la investigación y el monitoreo de biodiversidad y especies no-modelo, amenazadas y/o crípticas, dependen de muestras genéticas no-invasivas de baja calidad. 
La genómica de la conservación in situ optimizada para condiciones de campo y ADN de baja calidad puede empoderar a investigadorxs locales y ayudarles a responder a sus necesidades. Para ello, sin embargo, se requieren alternativas accesibles y reproducibles de secuenciación y genotipado. Diseñé un panel de SNPs como una aproximación de genotipado apta para el campo y dirigida a dos especies de primates amazónicos con el uso de ADN de baja y alta calidad, y dos plataformas de secuenciación (Illumina y Nanopore). Usé 14 genomas de alta calidad para encontrar 210 SNPs que permiten la identificación de la especie (doce SNPs), del sexo (doce SNPs) y de la identidad individual (186 SNPs) en dos especies de pichicos, Leontocebus weddelli y Saguinus imperator. Los cebadores, adaptadores e índices fueron diseñados con un enfoque de Genotyping-in-Thousands by sequencing (Genotipado en los miles por secuenciación) que es compatible con ambas plataformas de secuenciación. Este método está basado en la secuenciación de productos de PCR multiplexados de unos cientos de SNPs para genotipar miles de individuos en una sola corrida de secuenciación. En un intento de mejorar la accesibilidad de la genómica de la conservación, el proceso reproducible para obtener a los SNPs informativos está siendo modulado con Snakemake, un sistema de manejo de flujos de trabajo.
519

Search for functional alleles in the human genome with focus on cardiovascular disease candidate genes

Johnson, Andrew Danner 30 August 2007
No description available.
520

Organización de la diversidad genética de los cítricos / Organization of the genetic diversity of citrus

García Lor, Andrés 29 July 2013
Citrus is the most economically important genus of the subfamily Aurantioideae. It originated in southeastern Asia, in an area that includes China, India, the Indochinese peninsula and the surrounding archipelagos. Despite many studies, the taxonomy of the genus Citrus is still not well defined, owing to the high level of morphological diversity found in this group, the sexual compatibility between its species and the apomixis of many genotypes. In this doctoral thesis, a broad diversity of the genus Citrus, related species and other taxa of the subfamily Aurantioideae was studied in order to clarify their organization and phylogeny using different types of molecular markers and genotyping methods. More specifically, mandarin germplasm plays a very important role in the breeding of varieties and rootstocks, but its genetic organization is not well defined. An in-depth analysis of its diversity and genetic organization was therefore carried out. Insertion-deletion (indel) molecular markers were developed for the first time in citrus and shown to be useful for diversity and phylogeny studies in the genus Citrus. In combination with microsatellite (SSR) markers, they were used to quantify the contribution of the three main citrus taxa (C. reticulata, C. maxima and C. medica) to the genomes of secondary species and modern cultivars. Their genetic structure was also defined from data obtained by sequencing 27 nuclear gene fragments related to the biosynthesis of compounds that determine citrus quality and genes related to the plant's response to abiotic stresses. Analysis of the nuclear phylogeny established the relationship between C. reticulata and Fortunella, which are clearly differentiated from the group formed by the other two main citrus species (C. maxima and C. medica). This result agrees with the geographic origin of the species studied. From this study, SNP molecular markers with high phylogenetic value were developed and transferred to genera related to citrus. These markers gave very good results in the genus Citrus and will be very useful for fingerprinting germplasm at a broader level of diversity. The genetic organization within mandarin germplasm (198 mandarin-type genotypes from two collections, INRA-CIRAD and IVIA) was studied, along with the introgression of other genomes, using 50 SSR and 24 indel markers plus four mitochondrial (mtDNA) indel markers. Many genotypes believed to be pure mandarins were found to carry introgressions from other ancestral genomes. Within mandarin germplasm, five parental groups were identified at the nuclear level, from which many genotypes originated, giving rise to complex hybrid structures. Genotypes with a non-mandarin maternal origin, as determined by the mtDNA markers, were even observed. This doctoral thesis has provided new information on the phylogenetic relationships among the species of the genus Citrus, closely related genera and the secondary species. In addition, new, mutually complementary molecular markers were developed. A new genetic organization of mandarin germplasm was established, and the two citrus collections under study were properly characterized. All of these contributions will therefore help breeding programs to obtain new high-quality citrus varieties and will allow the conservation and use of existing genetic resources, as well as their genetic and phenotypic characterization, to be optimized. / García Lor, A. (2013). Organización de la diversidad genética de los cítricos [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/31518
