Global ETD Search

1	Network Based Integration of Proteomic and Transcriptomic Data: Study of BCR and WNT11 Signaling Pathways in Cancer Cells Sitte, Maren 08 May 2020 No description available. 510 integrative analysis Informatik (PPN619939052)
2	Sparse Models For Multimodal Imaging And Omics Data Integration January 2015 1 / DONGDONG LIN
3	Informatics Approaches for Integrative Analysis of Disparate High-Throughput Genomic Datasets in Cancer January 2014 abstract: The processes of a human somatic cell are very complex, with various genetic mechanisms governing its fate. Such cells undergo various genetic mutations, which translate to the genetic aberrations that we see in cancer. There are more than 100 types of cancer, each having many more subtypes with aberrations unique to each. In the past two decades, the widespread application of high-throughput genomic technologies, such as microarrays and next-generation sequencing, has revealed many such aberrations. Known types and subtypes can be readily identified using gene-expression profiling and, more importantly, high-throughput genomic datasets have helped identify novel subtypes with distinct signatures. Recent studies showing the use of gene-expression profiling in clinical decision making for breast cancer patients underscore the utility of high-throughput datasets. Beyond prognosis, understanding the underlying cellular processes is essential for effective cancer treatment. Various high-throughput techniques are now available to look at a particular aspect of a genetic mechanism in cancer tissue. To look at these mechanisms individually is akin to looking at a broken watch by taking apart each of its parts, looking at them individually and finally making a list of all the faulty ones. Integrative approaches are needed to transform one-dimensional cancer signatures into multi-dimensional interaction and regulatory networks, consequently bettering our understanding of cellular processes in cancer.
Here, I attempt to (i) address ways to effectively identify high-quality variants when multiple assays on the same sample are available, through two novel tools, snpSniffer and NGSPE; and (ii) glean new biological insight into multiple myeloma through two novel integrative analysis approaches that make use of disparate high-throughput datasets. While these methods focus on multiple myeloma datasets, the informatics approaches are applicable to all cancer datasets and will thus help advance cancer genomics. / Dissertation/Thesis / Ph.D. Biomedical Informatics 2014 Bioinformatics Cancer Genomics Integrative Analysis Myeloma Sequencing
4	A systems biological approach towards the molecular basis of heterosis in Arabidopsis thaliana Andorf, Sandra January 2011 Heterosis is defined as the superiority in performance of heterozygous genotypes compared to their corresponding genetically different homozygous parents. This phenomenon has been known since the beginning of the last century and has been widely used in plant breeding, but the underlying genetic and molecular mechanisms are not well understood. In this work, a systems biological approach based on molecular network structures is proposed to contribute to the understanding of heterosis. Hybrids are likely to contain additional regulatory possibilities compared to their homozygous parents and, therefore, they may be able to correctly respond to a higher number of environmental challenges, which leads to a higher adaptability and, thus, the heterosis phenomenon. In the network hypothesis for heterosis presented in this work, more regulatory interactions are expected in the molecular networks of the hybrids than in those of the homozygous parents. Partial correlations were used to assess this difference in the global interaction structure of regulatory networks between the hybrids and the homozygous genotypes. This network hypothesis for heterosis was tested on metabolite profiles as well as gene expression data of the two parental Arabidopsis thaliana accessions C24 and Col-0 and their reciprocal crosses. These plants are known to show a heterosis effect in their biomass phenotype. The hypothesis was confirmed for mid-parent and best-parent heterosis for either hybrid in our experimental metabolite as well as gene expression data. It was shown that this result is influenced by the cutoffs used during the analyses. Overly strict filtering resulted in sets of metabolites and genes for which the network hypothesis for heterosis does not hold true for either hybrid regarding mid-parent as well as best-parent heterosis.
In an over-representation analysis, the genes that show the largest heterosis effects according to our network hypothesis were compared to genes of heterotic quantitative trait loci (QTL) regions. Separately for either hybrid regarding mid-parent as well as best-parent heterosis, a significantly larger overlap between the resulting gene lists of the two different approaches towards biomass heterosis was detected than expected by chance. This suggests that each heterotic QTL region contains many genes influencing biomass heterosis in the early development of Arabidopsis thaliana. Furthermore, this integrative analysis narrowed down the group of candidate genes for biomass heterosis in Arabidopsis thaliana identified by both approaches and increased the confidence in them. / The heterosis effect is the superiority in one or more performance traits (e.g., leaf size in plants) of heterozygous offspring over their differently homozygous parents. This phenomenon has been known since the beginning of the last century and is widely used in plant breeding. Nevertheless, the genetic and molecular basis of heterosis is still largely unknown. It is assumed that heterozygous individuals have more regulatory possibilities than their homozygous parents and can therefore respond correctly to a larger number of changing environmental conditions. This increased adaptability leads to the heterosis effect. In this work, a systems biological approach based on molecular network structures is pursued to contribute to a better understanding of heterosis. To this end, a network hypothesis for heterosis is presented, which predicts that heterozygous individuals showing heterosis have more regulatory interactions in their molecular networks than the homozygous parents.
Partial correlations were used to examine this difference in the global interaction structures between the heterozygotes and their homozygous parents. The network hypothesis was tested on metabolite and gene expression data of the two homozygous Arabidopsis thaliana lines C24 and Col-0 and their reciprocal crosses. Arabidopsis thaliana plants are known to show a heterosis effect with respect to their biomass. At the same age, the heterozygous plants have a higher biomass than the homozygous plants. The network hypothesis for heterosis was confirmed for both mid-parent heterosis (the difference in performance of the heterozygote compared to the mean of the parents) and best-parent heterosis (the difference in performance of the heterozygote compared to the better of the parents) for both crosses in the metabolite and gene expression data. In an over-representation analysis, the genes showing the largest change in the number of regulatory interactions in which they are presumably involved were compared with the genes from a quantitative genetic (QTL) analysis of biomass heterosis in Arabidopsis thaliana. The genes identified in the two studies show a larger overlap than expected by chance. This suggests that each identified QTL region contains many genes that influence the biomass heterosis effect in Arabidopsis thaliana. The genes that overlap in the result lists of both analysis methods can be regarded as candidate genes for biomass heterosis in Arabidopsis thaliana with greater confidence than the results of only one study. Systembiologie Heterosis Molekulare Profildaten Integrative Analyse Systems biology Heterosis Molecular profile data Integrative analysis Life sciences
5	Integrative Analysis for Identifying Multi-Layer Modules in Precision Medicine Yazdanparast, Aida 12 1900 Indiana University-Purdue University Indianapolis (IUPUI) / Precision medicine aims to employ information from all modalities to develop a comprehensive view of disease progression and administer therapies tailored to the individual patient. A set of genomic features (gene CNVs, mutations, mRNA expressions, and protein abundances) is associated with each patient, and it is hard to explain phenotypic similarities such as gene essentiality or variability in drug response at a single genomic level. Thus, to extract biological principles it is critical to seek mutual information from multi-dimensional datasets. To address these concerns, we first conduct an integrated mRNA/protein analysis in both breast cancer cell lines and tumors, and most interestingly in the breast cancer subtypes. We identified cell lines that provide optimum heterogeneity models for studying the underlying biological processes of tumors. Our systematic observation across multi-omics data identifies distinct subgroups of cancer cells and patients. Based on this identified signal transduction between mRNA and RPPA, we developed a biclustering model, Bi-EB, to characterize key genetic alterations that are shared in both cancer cell lines and patients. We integrated two types of omics data including copy number variations, transcriptome, and proteome. Bi-EB adopts a data-driven statistical strategy, using the Expectation-Maximization (EM) algorithm to extract the foreground bicluster pattern from its background noise data in an iterative search. Using the Bi-EB algorithm, we selected translational gene sets that are characterized by highly correlated molecular profiles among RNA and proteins. To further investigate cell lines and tissue in breast cancer, we explore the relationship between genomic features and the phenotypic factors.
Using in vitro/in vivo drug screening data, we adopt the partial least squares regression method and develop a multi-modular approach to predict anticancer therapy benefits for ER-negative breast cancer patients. The joint multi-dimensional modules identified here provide new insights into the molecular mechanisms of drugs and cancer treatment. / 2021-12-28 Biomarker discovery Breast cancer Classification Integrative analysis Machine learning Precision medicine
6	Ambulanspersonalens erfarenheter av faktorer som påverkar omhändertagandet av barn upp till 12 år inom ambulanssjukvården : en litteraturöversikt / The ambulance personnel's experiences of factors that affect the care of children up to 12 years in the ambulance services : a literature review Almalah, Suraa, Blomgren, Kristian January 2021 Caring for children is described as a challenge among ambulance personnel. The challenge lies, among other things, in children's anatomy and physiology, which differ from those of adults, and in the lack of equipment adapted for children. Children are dependent on their parents, which means that both the child and the parents must be cared for at the same time, which is also a challenge for ambulance personnel. The aim was to shed light on ambulance personnel's experiences of factors that affect the care of children up to 12 years in the ambulance service. The method used was a general literature study containing qualitative scientific articles from the databases PubMed and Cinahl. A total of 14 scientific articles were included and analysed using an integrated analysis. The results yielded one main category, named promoting and hindering factors for the care of children, and four subcategories, named emotional reactions, communication and parental presence, lack of experience and education, and support from other actors and colleagues. Ambulance personnel expressed worry, anxiety and stress in connection with assignments involving children, and lack of experience and education could be associated with these feelings of uncertainty about caring for children. Communication with children was a challenge because young children cannot express themselves verbally. With the support of the parents, ambulance personnel could obtain a history of the child and its condition. The parents were also considered an important source of information.
Ambulance personnel also expressed the importance of working with a colleague they trusted, which could reduce their worry, in contrast to working with a colleague they did not know well, which could result in increased worry and stress. The conclusion was that the most common feelings among ambulance personnel were anxiety, worry and stress. More experience and training in the care of children can reduce these feelings and increase ambulance personnel's competence and confidence. / Caring for children was described as a challenge for ambulance personnel. The challenges include the fact that children's anatomy and physiology differ from those of adults and the lack of equipment adapted to children, but also that children are dependent on their parents, which sometimes means taking care of both the child and the parents at the same time. The aim was to shed light on ambulance personnel's experience of factors that affect the care of children up to 12 years in the ambulance service. The method used was a general literature study containing qualitative scientific articles from the databases PubMed and Cinahl. A total of 14 scientific articles were included and analysed based on an integrated analysis. The results yielded a main category called promoting and obstructing factors for the care of children, and four subcategories called emotional reactions, communication and parental presence, lack of experience and education, and support from other actors and colleagues. The ambulance personnel expressed concern, anxiety and stress associated with assignments involving children, and lack of experience and education could be associated with these feelings of uncertainty about caring for children. Communication with children was a challenge because young children cannot express themselves verbally. With the support of the parents, ambulance personnel could get a history of the child's condition.
The parents were also considered an important source of information. The ambulance personnel also expressed the importance of working with a colleague they trusted, which could reduce their concerns, unlike working with a colleague they did not know well, which could result in increased anxiety and stress. The conclusion was that the most common feelings among the ambulance personnel were concern, anxiety and stress. More experience and training in the care of children could reduce these feelings and increase the ambulance personnel's skills and security. Ambulance personnel Child Integrative analysis Prehospital Ambulanspersonal Barn Integrativ analys Prehospital Nursing Omvårdnad
7	Functional assessments of amino acid variation in human genomes Preeprem, Thanawadee 22 May 2014 The Human Genome Project, initiated in 1990, created an enormous amount of excitement in human genetics, a field of study that seeks answers to the understanding of human evolution, diseases and development, gene therapy, and preventive medicine. The first completion of a human genome in 2003 and the breakthroughs in sequencing technologies in the past few years deliver the promised benefits of genome studies, especially regarding the roles of genomic variability in human health. However, intensive resource requirements and the associated costs make it infeasible to experimentally verify the effect of every genetic variation. At this stage of genome studies, in silico predictions play an important role in identifying putative functional variants. The most common practice for genome variant evaluation is based on the evolutionary conservation at the mutation site. Nonetheless, sequence conservation is not an absolute predictor of deleteriousness, since the phylogenetic diversity of the aligned sequences used to construct the prediction algorithm has substantial effects on the analysis. This dissertation aims at overcoming the weaknesses of the conservation-based assumption for predicting variant effects. The dissertation describes three different integrative computational approaches to identify a subset of high-priority amino acid mutations derived from human genome data. The methods investigate variant-function relationships in three aspects of genome studies: personal genomics, genomics of epilepsy disorders, and genomics of variable drug responses. For genetic variants found in genomes of healthy individuals, an eight-level variant classification scheme is implemented to rank variants that are important towards individualized health profiles.
For candidate genetic variants of epilepsy disorders, a novel 3-dimensional structure-based assessment protocol for amino acid mutations is established to improve discrimination between neutral and causal variants at less conserved sites, and to facilitate variant prioritization for experimental validations. For genomic variants that may affect inter-individual variability in drug responses, an explicit structure-based predictor for structural disturbances is developed to efficiently evaluate unknown variants in pharmacogenes. Overall, the three integrative approaches provide an opportunity for examining the effects of genomic variants from multiple perspectives of genome studies. They also introduce an efficient way to catalog amino acid variants in large-scale genome data. Human genome variations Single nucleotide polymorphisms Missense mutations Amino acid mutations Integrative analysis Functional assessment Functional genomics Computer simulation Mutation (Biology) Amino acids
8	Classification of Glioblastoma Multiforme Patients Based on an Integrative Multi-Layer Finite Mixture Model System Campos Valenzuela, Jaime Alberto 26 November 2018 Glioblastoma multiforme (GBM) is an extremely aggressive and invasive brain cancer with a median survival of less than one year. In addition, due to its anaplastic nature, the histological classification of this cancer is not simple. These characteristics make this disease an interesting and important target for new methodologies of analysis and classification. In recent years, molecular information has been used to segregate and analyze GBM patients, but generally this methodology utilizes single-'omic' data to perform the classification, or multi-'omic' data in a sequential manner. In this project, a novel approach for the classification and analysis of patients with GBM is presented. The main objective of this work is to find clusters of patients with distinctive profiles using multi-'omic' data with a truly integrative methodology. In recent years, the TCGA consortium has made publicly available thousands of multi-'omic' samples for multiple cancer types. Thanks to this, it was possible to obtain numerous GBM samples (> 300) with data for gene and microRNA expression, CpG site methylation and copy-number variation (CNV). To achieve our objective, a mixture of linear models was built for each gene, using its expression as output and a mixture of multi-'omic' data as covariates. Each model was coupled with a lasso penalization scheme and, thanks to the mixture nature of the model, it was possible to fit multiple submodels to discover different linear relationships in the same model. This complex but interpretable method was used to train over 10,000 models. For approximately 2,400 cases, two or more submodels were obtained. Using the models and their submodels, 6 different clusters of patients were discovered.
The clusters were profiled based on clinical information and gene mutations. Through this analysis, a clear separation was found between younger patients with a higher survival rate (Clusters 1, 2 and 3) and older patients with a lower survival rate (Clusters 4, 5 and 6). Mutations in the gene IDH1 were found almost exclusively in Cluster 2; additionally, Cluster 5 presented a hypermutated profile. Finally, several genes not previously related to GBM showed a significant presence in the clusters, such as C15orf2 and CHEK2. The most significant models for each cluster were studied, with a special focus on their covariates. It was discovered that the number of shared significant models was very small and that well-known GBM-related genes, such as EGFR1 and TP53, appeared as significant covariates for plenty of models. Along with them, ubiquitin-related genes (UBC and UBD) and NRF1, which have not been linked to GBM previously, had a very significant role. This work showed the potential of using a mixture of linear models to integrate multi-'omic' data and to group patients in order to profile them and find novel markers. The resulting clusters showed unique profiles, and their significant models and covariates comprised well-known GBM-related genes and novel markers, which present the possibility of new approaches to study and attack this disease. The next step of the project is to improve several elements of the methodology to achieve a more detailed analysis of the models and covariates, in particular taking into account the regression coefficients of the submodels. info:eu-repo/classification/ddc/610 ddc:610
9	Identification de gènes impliqués dans les ataxies épisodiques par combinaison de séquençages génomique et transcriptomique Audet, Sébastien 12 1900 This pilot study aims to develop an integrative analysis method that increases the success rate of clinical diagnosis of rare genetic mutations. In addition, the identification of new genes associated with episodic ataxia (EA) and the evaluation of new prediction algorithms, for a more robust examination of variants, will follow from the investigation. Characterized by a sporadic loss of coordination of voluntary movements, EA generally manifests late, with high clinical and genetic heterogeneity, which greatly complicates obtaining a precise diagnosis. While four genes have been linked to the eight subtypes of EA, many patients remain without a molecular diagnosis because of the limitations of DNA sequencing methods. These gaps heighten the interest in implementing RNA sequencing in clinical settings, in order to obtain the functional information the approach offers. Patients with EA, without a molecular diagnosis despite thorough examination, were recruited in Montreal. Whole-genome sequencing (WGS) and RNA sequencing were performed on blood samples to identify nucleotide variants, differential expression, splicing events and microsatellite expansions. Several recent pathogenicity prediction algorithms were chosen to be tested alongside the standard algorithms. WGS data from a family trio affected by neurological pathologies were also submitted to the genomic pipeline developed for the EA cohort. Candidate variants were identified for each patient based on pathogenicity scores, the rarity of the genetic events, and the known functional and clinical information for a given altered gene.
Among the findings are nonsense mutations, missense mutations, alternative splicing and nucleotide expansions in genes associated with spinocerebellar ataxias or spastic paraplegias. In addition to being present in the sequencing datasets available for each patient, the genomic events were verified by Sanger sequencing of DNA and RNA when possible. Potential functional effects, predicted mainly from RNA-seq and suggesting abnormal mRNA expression, were also assessed by PCR amplification and traditional qPCR. To date, four of the ten patients have received or are about to receive a clinical diagnosis, and four others have excellent molecular candidates to explain an ataxic pathology. This project should enable a better-defined diagnosis, leading to better quality of life, better assessment of prognosis and better patient management. The identification of genetic modifiers in some of them should also allow better clinical characterization of the reported conditions, benefiting future symptomatic evaluations. Moreover, meta-analysis of the RNA-seq data offers the potential to discover regulators of pathogenesis common to EA. It will also promote the integrative approach for a wider range of disorders and could eventually lead to new therapeutic strategies. / This pilot study aims to develop an integrative analysis method that allows for an increased diagnosis success rate of rare genetic mutations. Moreover, identification of novel genes associated with Episodic Ataxia (EA) and evaluation of new AI-generated prediction algorithms, for a more robust variant examination, will ensue from the investigation.
Characterized by sporadic loss of voluntary movement coordination, EA typically manifests with a late onset as well as high clinical and genetic heterogeneity, setting additional hurdles to diagnosis. While four genes have been linked to the eight subtypes of EA, many patients are left without molecular diagnosis due to the limitations of individual DNA-sequencing methods, which can be mitigated by the functional overview that RNA sequencing (RNA-seq) offers. EA patients lacking molecular diagnosis despite in-depth examination were recruited in Montreal. Whole-genome sequencing (WGS) and RNA-seq were performed on blood samples to identify single nucleotide variants, differential expression, splicing events, structural variants and repeat expansions. Multiple recent pathogenicity prediction algorithms were chosen for testing alongside standard ones, in order to evaluate their performance and potential for integration into clinical pipelines. WGS data of a family trio from France, in which the father and the daughter present neurologic pathologies, were also processed through the genomic pipeline that was developed for the EA cohort in order to identify the cause of their disorder. Candidate variants were identified for each patient according to pathogenicity scores, rarity of genetic events, and known functional as well as clinical information for a given altered gene. Among the findings are truncations, missense variants, alternative splicing, and repeat expansions in genes already associated with either spinocerebellar ataxia or spastic paraplegia. In addition to being present in both datasets when available, these genomic events have been validated through Sanger sequencing of both DNA and RNA when feasible. For strong candidates where the available functional information from RNA-seq suggests abnormal mRNA expression, validation includes PCR amplification as well as traditional qPCR to support effects on transcripts.
To date, four out of ten patients have received or are on the verge of receiving a diagnosis, and four others carry excellent molecular candidates requiring further validation to explain their ataxic pathologies. This project should provide a better-defined diagnosis, leading to better quality of life, better evaluation of prognosis and better management of care for patients. Identification of genetic modifiers in some of them should also allow for a better clinical characterization of the reported conditions, benefiting future patient examinations. A meta-analysis of our patients' transcriptomic profiles could also uncover commonly affected pathways in EA development. It will also promote the integrative approach for a larger spectrum of disorders and might eventually lead to new therapeutic strategies. Génomique Transcriptomique Génétique clinique Ataxies Bio-informatique Analyse intégrative WGS RNA-seq SpliceAI ExpansionHunter Genomic Transcriptomic Clinical genetics Ataxia Bioinformatics Integrative analysis
10	Analyse intégrative de données de grande dimension appliquée à la recherche vaccinale / Integrative analysis of high-dimensional data applied to vaccine research Hejblum, Boris 06 March 2015 Gene expression data are recognized as high-dimensional and require the use of suitable statistical methods. But in the context of vaccine trials, other measurements, such as flow cytometry measurements, are also high-dimensional. Moreover, these data are often measured longitudinally. This work is built on the idea that using as much of the available information as possible, by modeling prior knowledge and by integrating all the different data available, improves the inference and interpretability of the results of high-dimensional statistical analyses. First, we present a gene set analysis method for longitudinal gene expression data. Next, we describe two integrative analyses in two vaccine studies. The first reveals under-expression of inflammation pathways in patients with a lower viral rebound following a therapeutic vaccine against HIV. The second study identifies a group of genes linked to lipid metabolism whose impact on the response to an influenza vaccine appears to be regulated by testosterone, and is therefore linked to sex. Finally, we introduce a new Dirichlet process mixture model of skew t-distributions for identifying cell populations from flow cytometry data, which are available in vaccine trials in particular. In addition, we propose a sequential approximation strategy for the posterior partition in the case of repeated measurements.
Thus, the automatic recognition of cell populations could enable both a practical advance for the daily work of immunologists and a more precise interpretation of gene expression results after taking all cell populations into account. / Gene expression data is recognized as high-dimensional data that needs specific statistical tools for its analysis. But in the context of vaccine trials, other measures, such as flow-cytometry measurements, are also high-dimensional. In addition, such measurements are often repeated over time. This work is built on the idea that using the maximum of available information, by modeling prior knowledge and integrating all data at hand, will improve the inference and the interpretation of biological results from high-dimensional data. First, we present an original methodological development, Time-course Gene Set Analysis (TcGSA), for the analysis of longitudinal gene expression data, taking into account prior biological knowledge in the form of predefined gene sets. Second, we describe two integrative analyses of two different vaccine studies. The first study reveals lower expression of inflammatory pathways consistently associated with lower viral rebound following an HIV therapeutic vaccine. The second study highlights the role of a testosterone-mediated group of genes linked to lipid metabolism in sex differences in immunological response to a flu vaccine. Finally, we introduce a new model-based clustering approach for the automated treatment of cell populations from flow-cytometry data, namely a Dirichlet process mixture of skew t-distributions, with a sequential posterior approximation strategy for dealing with repeated measurements. Hence, the automatic recognition of the cell populations could allow a practical improvement of the daily work of immunologists as well as a better interpretation of gene expression data after taking into account the frequency of all cell populations.
Analyse intégrée Analyse par groupe de gènes Bayesien non paramétrique Connaissance a priori Cytométrie en flux Dimorphisme sexuel Distribution skew t Données de grande dimension Fenêtrage automatisé Grippe Génomique Modèle de mélange Processus de Dirichlet Vaccin VIH Automated gating Dirichlet process Flow cytometry Flu Gene set analysis High-dimensional data HIV Integrative analysis Mixture model Nonparametric Bayesian Prior knowledge Sexual dimorphism Skew t-distribution Statistical genomics Vaccine
