Global ETD Search

1	BioBridge: Bringing Data Exploration to Biologists Boyd, Joseph 01 May 2014 Since the completion of the Human Genome Project in 2003, biologists have become exceptionally good at producing data. Indeed, biological data has grown at a sustained exponential rate, putting effective and thorough analysis beyond the reach of many biologists. This thesis presents BioBridge, an interactive visualization tool developed to bring intuitive data exploration to biologists. BioBridge is designed to work on omics-style tabular data in general and thus has broad applicability. This work describes the design and evaluation of BioBridge's primary visualization, the Entity View, as well as the accompanying user interface. The Entity View visualization arranges glyphs representing biological entities (e.g. genes, proteins, metabolites) along with related text mining results to provide biological context. Throughout development the goal has been to maximize accessibility and usability for biologists who are not computationally inclined. Evaluations were done with three informal case studies, one of a metabolome dataset and two of microarray datasets. BioBridge is a proof of concept that there is an underexploited niche in the data analysis ecosystem for tools that prioritize accessibility and usability. The use case studies, while anecdotal, are very encouraging. These studies indicate that BioBridge is well suited for the task of data exploration. With further development, BioBridge could become more flexible and usable as additional use case datasets are explored and more feedback is gathered. visualization omics data information visualization interactive visualization exploratory visualization biology
2	Dissecting the multi-functional role of heterogeneous nuclear ribonucleoprotein H1 in methamphetamine addiction traits Ruan, Qiu T. 24 March 2021 Both genetic and environmental factors influence susceptibility to substance use disorders. However, the genetic basis of these disorders is largely unknown. We previously identified Hnrnph1 (heterogeneous nuclear ribonucleoprotein H1) as a quantitative trait gene for reduced methamphetamine (MA) stimulant sensitivity. Mutation (heterozygous deletion of a small region in the first coding exon) in Hnrnph1 also decreased MA reinforcement, reward, and dopamine release. 5’UTR genetic variants in Hnrnph1 support reduced 5’UTR usage and hnRNP H protein expression as a molecular mechanism underlying the reduced MA-induced psychostimulant response. Interestingly, Hnrnph1 mutant mice show a two-fold increase in hnRNP H protein in the striatal synaptosome with no change in whole tissue level. Proteome profiling of the synaptosome identified an increase in mitochondrial complex I and V proteins that rapidly decreased with MA in Hnrnph1 mutants. In contrast, the much lower level of basal mitochondrial proteins in the wild-type mice showed a rapid, MA-induced increase. Altered mitochondrial proteins associated with the Hnrnph1 mutation may contribute to reductions in MA behaviors. hnRNP H1 is an abundant RNA-binding protein in the brain, involved in all aspects of post-transcriptional regulation. We examined both baseline and MA-induced changes in hnRNP H-RNA interactions to identify targets of hnRNP H that could comprise the neurobiological mechanisms of cellular adaptations occurring following MA exposure. hnRNP H post-transcriptionally regulates a set of mRNA transcripts in the striatum involved in psychostimulant-induced synaptic plasticity. MA treatment induced opposite changes in binding of hnRNP H to these mRNA transcripts in Hnrnph1 mutants versus wild-types.
RNA-binding, transcriptome, and spliceome analyses triangulated on hnRNP H binding to the 3’UTR of Cacna2d2, an upregulation of Cacna2d2 transcript, and decreased 3’UTR usage of Cacna2d2 in response to MA in the Hnrnph1 mutants. Cacna2d2 codes for a presynaptic, voltage-gated calcium channel subunit that could plausibly regulate MA-induced dopamine release and behavior. The multi-omics datasets point to a dysregulation of mitochondrial function and interrelated calcium signaling as potential mechanisms underlying MA-induced dopamine release and behavior in Hnrnph1 mutants. Neurobiology Addiction Genetics Methamphetamine Omics data RNA binding protein Splicing
3	Tissue-dependent analysis of common and rare genetic variants for Alzheimer's disease using multi-omics data Patel, Devanshi 21 January 2021 Alzheimer’s disease (AD) is a complex neurodegenerative disease characterized by progressive memory loss and caused by a combination of genetic, environmental, and lifestyle factors. AD susceptibility is highly heritable at 58-79%, but only about one third of the AD genetic component is accounted for by common variants discovered through genome-wide association studies (GWAS). Rare variants may contribute to some of the unexplained heritability of AD and have been demonstrated to contribute to large gene expression changes across tissues, but conventional analytical approaches pose challenges because of low statistical power even for large sample sizes. Recent studies have demonstrated by expression quantitative trait locus (eQTL) analysis that changes in gene expression could play a key role in the pathogenesis of AD. However, regulation of gene expression has been shown to be context-specific (e.g., tissue and cell types), motivating a context-dependent approach to achieve more precise and statistically significant associations. To address these issues, I applied a strategy to identify new AD risk or protective rare variants by examining mutations occurring only in cases or only in controls, observing that different mutations in the same gene or variable dose of a mutation may result in distinct dementias. I also evaluated the impact of rare variation on expression at the gene and gene pathway levels in blood and brain tissue, further strengthening the rare variant findings with functional evidence and finding evidence for a large immune and inflammatory component to AD.
Lastly, I identified cell-type specific eQTLs in blood and brain tissue to explain underlying genetic associations of common variants in AD, and also discovered additional evidence for the role of myeloid cells in AD risk and potential novel blood and brain AD biomarkers. Collectively, these findings further explain the genetic basis of AD risk and provide insight about mechanisms leading to this disorder. / 2022-01-21T00:00:00Z Bioinformatics Alzheimer's disease Bioinformatics Genetics Genomics Multi-omics data Statistics
4	Deep Learning for Enhancing Precision Medicine Oh, Min 07 June 2021 Most medical treatments have been developed aiming at the best-on-average efficacy for large populations, resulting in treatments successful for some patients but not for others. This creates a need for precision medicine that tailors medical treatment to individual patients. Omics data holds comprehensive genetic information on individual variability at the molecular level and hence the potential to be translated into personalized therapy. However, attempts to transform omics data-driven insights into clinically actionable models for individual patients have been limited. Meanwhile, advances in deep learning, one of the most promising branches of artificial intelligence, have produced unprecedented performance in various fields. Although several deep learning-based methods have been proposed to predict individual phenotypes, they have not established the state of the practice, due to instability of selected or learned features derived from extremely high-dimensional data with low sample sizes, which often results in overfitted models with high variance. To overcome the limitations of omics data, recent advances in deep learning models, including representation learning models, generative models, and interpretable models, can be considered. The goal of the proposed work is to develop deep learning models that can overcome the limitations of omics data to enhance the prediction of personalized medical decisions. To achieve this, three key challenges should be addressed: 1) effectively reducing dimensions of omics data, 2) systematically augmenting omics data, and 3) improving the interpretability of omics data. / Doctor of Philosophy / Most medical treatments have been developed aiming at the best-on-average efficacy for large populations, resulting in treatments successful for some patients but not for others.
This creates a need for precision medicine that tailors medical treatment to individual patients. Biological data such as DNA sequences and snapshots of genetic activities hold comprehensive information on individual variability and hence the potential to accelerate personalized therapy. However, attempts to transform data-driven insights into clinical models for individual patients have been limited. Meanwhile, advances in deep learning, one of the most promising branches of artificial intelligence, have produced unprecedented performance in various fields. Although several deep learning-based methods have been proposed to predict individual treatment or outcome, they have not established the state of the practice, due to the complexity and limited availability of biological data, which often result in overfitted models that may work on training data but not on test data or unseen data. To overcome the limitations of biological data, recent advances in deep learning models, including representation learning models, generative models, and interpretable models, can be considered. The goal of the proposed work is to develop deep learning models that can overcome the limitations of omics data to enhance the prediction of personalized medical decisions. To achieve this, three key challenges should be addressed: 1) effectively reducing the complexity of biological data, 2) generating realistic biological data, and 3) improving the interpretability of biological data. Deep learning (Machine learning) Precision Medicine Omics data
5	Telomere analysis based on high-throughput multi-omics data Nersisyan, Lilit 20 September 2017 Telomeres are repeated sequences at the ends of eukaryotic chromosomes that play a prominent role in normal aging and disease development. They are dynamic structures that normally shorten over the lifespan of a cell, but can be elongated in cells with high proliferative capacity. Telomere elongation in stem cells is an advantageous mechanism that allows them to maintain the regenerative capacity of tissues; however, it also allows cancer cells to survive, thus leading to the development of malignancies. Numerous studies have been conducted to explore the role of telomeres in health and disease. However, the majority of these studies have focused on the consequences of extreme telomere shortening that lead to telomere dysfunction, replicative arrest or chromosomal instability. Very few studies have addressed the regulatory roles of telomeres, and the association of genomic, transcriptomic and epigenomic characteristics of a cell with telomere length dynamics. The scarcity of such studies is partly due to the low-throughput nature of experimental approaches for telomere length measurement and the fact that they do not easily integrate with currently available high-throughput data. In this thesis, we have attempted to build algorithms, in silico pipelines and software packages to utilize high-throughput -omics data for telomere biology research. First, we have developed a software package, Computel, to compute telomere length from whole-genome next-generation sequencing data. We show that it can be used to integrate telomere length dynamics into systems biology research. Using Computel, we have studied the association of telomere length with genomic variations in a healthy human population, as well as with transcriptomic and epigenomic features of lung cancers.
Another aim of our study was to develop in silico models to assess the activity of telomere maintenance mechanisms (TMM) based on gene expression data. There are two main TMMs: one based on the catalytic activity of the ribonucleoprotein complex telomerase, and the other based on recombination events between telomeric sequences. Which type of TMM gets activated in a cancer cell determines the aggressiveness of the tumor and the outcome of the disease. Investigation of TMMs is valuable not only for basic research, but also for applied medicine, since many anticancer therapies attempt to inhibit the TMM in cancer cells to stop their growth. Therefore, studying the activation mechanisms and regulators of TMMs is of paramount importance for understanding cancer pathomechanisms and for treatment. Many studies have addressed this topic; however, many aspects of TMM activation and realization still remain elusive. Additionally, current data-mining pipelines and functional annotation approaches for phenotype-associated genes are not adapted for identification of TMMs. To overcome these limitations, we have constructed pathway networks for the two TMMs based on the literature, and have developed a methodology for assessing TMM pathway activities from gene expression data. We have described the accuracy of our TMM-based approach on a set of cancer samples with experimentally validated TMMs. We have also applied it to explore TMM activity states in lung adenocarcinoma cell lines. In summary, recent developments in high-throughput technologies allow for the production of data on multiple levels of cellular organization, from genomic and transcriptomic to epigenomic. This has allowed for rapid development of various directions in molecular and cellular biology. In contrast, telomere research, although at the heart of stem cell and cancer studies, is still conducted with low-throughput experimental approaches.
Here, we have attempted to utilize the huge amount of currently accumulated multi-omics data to foster telomere research and to bring it to systems biology scale.
6	Characterizing vaginal microbiome regulation of progesterone receptor expression via secondary analysis of host and microbiome multi-omics data Nina Marie Render (18370176) 16 April 2024 The vaginal microbiome and female sex hormones are both involved in the development and progression of gynecological pathologies. The individual mechanisms by which the vaginal microbiome and female sex hormones each contribute to disease progression are known. However, the mechanisms by which the vaginal microbiome regulates female sex hormones, such as progesterone, are not well understood. This study seeks to understand how the vaginal microbiome regulates progesterone receptor (PGR) expression via secondary analysis of host and vaginal microbiome multi-omics data from the Partners PrEP cohort. This dataset consists of cervicovaginal samples from women enrolled in the Partners PrEP study. Partial Least Squares Regression (PLSR) models were created for each biological data type (microbial composition, metabolomics, metaproteomics) to assess how these factors regulate PGR expression. Significant factors were identified through variable importance in projection (VIP) and correlation analysis. Partial correlation analysis and follow-up PLSR models incorporating clinical and demographic variables were performed to assess the robustness of the vaginal microbiome-PGR associations. The PLSR models indicated that lower PGR expression was associated with G. vaginalis, and higher PGR expression was associated with Lactobacillus species. Cytosine, guanine, and tyrosine were among the metabolites significantly associated with higher PGR expression and experimentally determined to be produced by Lactobacillus species. Conversely, citrulline and succinate were associated with lower PGR expression and experimentally determined to be produced by G. vaginalis.
The models indicated that bacterial metabolic pathways involved in glucose metabolism, such as glucagon signaling and starch and sucrose metabolism, may regulate PGR expression. Demographic phenotypes from the dataset were also considered and did not significantly alter the association between the biological explanatory variables and PGR expression. The results indicate that guanine, cytosine, succinate, starch and sucrose metabolism, and glycolysis/gluconeogenesis may be regulators of PGR abundance and function. The models suggest vaginal microbiome factors could play a role in gynecological conditions where progesterone signaling is suppressed. Future experimental work is needed to validate the results of these models and support their use as predictive tools to understand the role of the vaginal microbiome. Computational physiology vaginal microbiome systems biology omics data progesterone receptor expression
7	Multi-omics Data Integration for Identifying Disease Specific Biological Pathways Lu, Yingzhou 05 June 2018 Pathway analysis is an important task for gaining novel insights into the molecular architecture of many complex diseases. With the advancement of new sequencing technologies, large amounts of quantitative gene expression data have been acquired. The emergence of omics data sets such as proteomics has facilitated the investigation of disease-relevant pathways. Although much work has previously been done to explore single-omics data, little work has been reported using multi-omics data integration, mainly due to methodological and technological limitations. While a single omics data set can provide useful information about the underlying biological processes, multi-omics data integration would give a much more comprehensive view of the cause-effect processes responsible for diseases and their subtypes. This project investigates the combination of miRNAseq, proteomics, and RNAseq data from seven types of muscular dystrophy and a control group. These unique multi-omics data sets provide us with the opportunity to identify the disease-specific and most relevant biological pathways. We first perform the t-test and the OVEPUG test separately to define the differentially expressed genes in the protein and mRNA data sets. In the multi-omics data sets, miRNAs also play a significant role in muscle development by regulating their target genes in the mRNA data set. To exploit the relationship between miRNA and gene expression, we consult the commonly used TargetScan library to collect all miRNA-mRNA and miRNA-protein co-expression pairs. Next, by conducting statistical analyses such as Pearson's correlation coefficient or the t-test, we measure the biologically expected correlation of each gene with its upstream miRNAs and identify the miRNA-mRNA and miRNA-protein pairs that show negative correlation.
Furthermore, we identify and assess the most relevant disease-specific pathways by inputting the differentially expressed genes and the negatively correlated genes into the gene-set libraries respectively, and further characterize these prioritized marker subsets using IPA (Ingenuity Pathway Analysis) or KEGG. We will then use Fisher's method to combine the p-values derived from the separate gene sets into a joint significance test assessing common pathway relevance. In conclusion, we will find all negatively correlated miRNA-mRNA and miRNA-protein pairs and identify several pathophysiological pathways related to muscular dystrophies by gene set enrichment analysis. This novel multi-omics data integration study and subsequent pathway identification will shed new light on pathophysiological processes in muscular dystrophies and improve our understanding of the molecular pathophysiology of muscle disorders, supporting disease prevention and treatment and improving health in the long term. / Master of Science / Identification of biological pathways plays a central role in understanding both human health and disease. A biological pathway is a series of information processing steps via interactions among molecules in a cell that partially determines the phenotype of a cell. Specifically, identifying disease-specific pathways will guide focused studies on complex diseases, thus potentially improving the prevention and treatment of diseases. To identify disease-specific pathways, it is crucial to develop computational methods and statistical tests that can integrate multi-omics (multiple omes such as the genome, proteome, etc.) data. Compared to single-omics data, multi-omics data will help in gaining a more comprehensive understanding of the molecular architecture of disease processes. In this thesis, we propose a novel data analytics pipeline for multi-omics data integration.
We test and apply our method on real proteomics data sets from muscular dystrophy subtypes, and identify several biologically plausible pathways related to muscular dystrophies. Biological Pathways Multi-omics Data Integration Muscular Dystrophy Statistical significance test Gene set enrichment analysis
8	Robust methods for multivariate analysis of correlated genetics and genomics data Song, Zeyuan 11 February 2025 2024 / This dissertation focuses on the development of advanced multivariate methods for the analysis of genetics and genomics data with multiple sources of correlation. The dissertation describes three novel topics: (1) a method to learn partial correlation networks, also known as Gaussian Graphical Models, to analyze multi-omics data, (2) a sparse network method to reduce network complexity, and (3) a Genome-Wide Association Study pipeline to analyze genome-wide genotype data in longitudinal and familial settings. In the first part of my dissertation, I propose a cluster-based Bootstrap algorithm for learning Gaussian Graphical Models from correlated data. Extensive simulations in family-based studies show that the Bootstrap algorithm effectively controls Type I errors without compromising statistical power compared to alternative solutions. Additionally, the algorithm is applied to learn the partial correlation networks of 47 Polygenic Risk Scores generated from genome-wide genotype data in the Long Life Family Study to unveil the complex relationships among these Polygenic Risk Scores. The second part of the dissertation extends the Bootstrap algorithm to learn sparse Gaussian Graphical Models in correlated data. Simulation studies show that this extended Bootstrap algorithm maintains control over Type I errors. As the value of the tuning parameter is varied, the networks contract and break apart as edges with small partial correlations are systematically removed. The application of this method in real data analysis identifies meaningful clusters in the dynamic changes of the Polygenic Risk Scores and lipids networks.
In the third part, I developed a Nextflow Genome-Wide Association Study pipeline, providing a fully automated analysis tool for managing, analyzing, and visualizing genome-wide genotype data for continuous and binary traits with correlated genetics data. Applying this pipeline to investigate processing speed in the Long Life Family Study leads to the identification of 17 rare protective Single Nucleotide Polymorphisms located in/near Retinoic Acid Receptor Beta and Thyroid Hormone Receptor Beta genes on chromosome 3. These findings shed light on potential mechanisms supporting the preservation of processing speed in aging individuals. / 2026-02-11T00:00:00Z Biostatistics Bootstrap Correlated data Gaussian Graphical Models Genetics Genomics Multi-omics data integration
9	Réponse du grain de blé à la nutrition azotée et soufrée : étude intégrative des mécanismes moléculaires mis en jeu au cours du développement du grain par des analyses -omiques / Wheat grain response to nitrogen and sulfur supply : integrative study of molecular mechanisms involved during the grain development using -omics analyses Bonnot, Titouan 09 December 2016 Increasing yield is a major challenge for cereal crops. In pursuing this goal, wheat grain quality has to be maintained. Grain quality is mainly determined by the content and composition of storage proteins, but there is a strong negative correlation between yield and grain protein concentration. In addition, grain quality is strongly influenced by the availability of nitrogen and sulfur in soils. The limitation of nitrogen inputs to crops and the sulfur deficiency recently observed in soils therefore make this quality harder to control. A better understanding of the molecular mechanisms that control grain development and the accumulation of storage proteins in response to nitrogen and sulfur supply is thus essential. The objective of this thesis was to contribute new knowledge on these regulatory processes, which remain poorly understood. For this purpose, -omics approaches emerged as the strategy of choice for identifying the molecular actors involved. The nuclear proteome was an important target in this work. The study of these nuclear proteins revealed some transcriptional regulators that could be involved in controlling the accumulation of grain storage compounds. Using an approach combining proteomic, transcriptomic and metabolomic data, an integrative view of the grain response to nitrogen and sulfur supply was obtained. The studies also clearly confirmed the importance of sulfur supply in controlling the nitrogen/sulfur balance of the grain, which determines its storage protein composition. Among the changes observed in cell metabolism, some of the genes affected by modification of this balance could coordinate the adjustment of grain composition in response to nutritional deficiencies. These new results should help to better control wheat grain quality in the context of sustainable agriculture. Wheat Grain Storage proteins Nitrogen Sulfur Nuclear protein -omics data Biological network
10	Approche intégrative du développement musculaire afin de décrire le processus de maturation en lien avec la survie néonatale / Integrative approach of muscular development to describe the maturation process related to the neonatal survival Voillet, Valentin 29 September 2016 In recent years, omics data integration projects have been developed, notably to contribute to the detailed description of complex traits of socio-economic interest. In this context, the aim of this thesis is to combine heterogeneous omics data to better describe and understand the last third of gestation in pigs, a period that influences piglet mortality at birth. In the thesis, we identified the molecular and cellular bases underlying the end of gestation, with a focus on skeletal muscle. This tissue is decisive at birth because it is involved in the efficiency of several physiological functions, such as thermoregulation and the ability to move. In the experimental design, tissues were collected from fetuses at 90 or 110 days of gestation (birth occurs at 114 days) from four fetal genotypes: two breeds that are extreme for mortality at birth (Large White and Meishan) and their two reciprocal crosses. Through statistical and computational analyses (multidimensional analyses, network inference, clustering and data integration), we showed the existence of biological mechanisms regulating muscle maturity in piglets, and also in other livestock species (cattle and sheep). Some genes and proteins were identified as being strongly linked to the establishment of muscle energy metabolism during the last third of gestation. Piglets with an immature muscle metabolism would be at higher risk of mortality at birth. A second part of the thesis concerns the imputation of missing data (a whole group of variables missing for an individual) in multidimensional analysis methods such as multiple factor analysis (MFA). In our context, MFA was particularly useful for integrating data from the same set of individuals across different tissues (two or more). To retain individuals that are missing a whole group of variables, we developed a method called MI-MFA (multiple imputation - MFA), which allows the MFA components to be estimated for these individuals. Omics data integration Biological networks Multidimensional analysis Pigs Maturity Neonatal mortality
