Global ETD Search

171	Bioinformatic sequence and structural analysis for Amyloidogenicity in Prions and other proteins Gendoo, Deena January 2012 Detection of amyloidogenic peptides or domains in proteins is of paramount importance for understanding their role in amyloidosis in conformational diseases. This thesis explores different methods for the detection and prediction of amyloidogenic peptides using a variety of bioinformatic analytical methods. Bioinformatic analysis of secondary structural changes is employed to determine whether classes of structurally ambivalent peptides, mainly discordant and chameleon sequences, are efficient predictors of amyloidogenic segments. This analysis elucidates statistical relationships between discordance, chameleonism, and amyloidogenicity across a database of protein domains (SCOP), a subset of amyloid-forming proteins, and the prion family. The presented results underline the limitations of these peptides as predictors of amyloidogenicity, and raise questions about the predictive power that can be gained from secondary structure prediction methods. In another bioinformatic approach, detection of conformationally variable segments in tertiary structures of PrP globular domains has been performed using Principal Component Analysis. This technique succeeded in identifying five conformationally variable domains within PrP, and in ranking these subdomains by their ability to differentiate PrPs based on non-local structural response to pathogenic mutation and prion disease susceptibility. The presented results are corroborated by previous observations from experimental methods and molecular dynamics simulations, suggesting that this approach serves as a fast and reliable method for detection of potential amyloidogenic segments in amyloid-forming proteins. 
Finally, a structural, functional, and evolutionary bioinformatic analysis is conducted to assess the prevalence of the first experimentally verified amyloid fibril fold in nature, and whether this fold can serve as a prototype for other amyloid-forming proteins. The results indicate a limited scope of this fold in amyloid-forming proteins and across the protein universe, and have implications on future identification of amyloid-forming proteins that share this fold. Collectively, the presented thesis compares these different methods and discusses their efficacy in detection of amyloidogenic segments. / La détection de peptides ou de domaines amyloïdogéniques dans les protéines est d'une importance primordiale dans la compréhension de leur rôle dans l'amylose dans les maladies conformationnelles. Cette thèse explore différentes méthodes en vue de la détection et la prédiction des peptides amyloïdogéniques utilisant une variété de méthodes d'analyse bio-informatique. L'analyse bio-informatique des changements structurels secondaires est employé afin de déterminer si les classes des peptides structurellement ambivalentes, principalement des séquences discordantes et caméléons, sont des prédicteurs efficaces de segments amyloïdogéniques. Cette analyse élucide des relations statistiques entre la discordance, la chameleonism et l'amyloïdogénicité à travers une base de données de domaines protéiques (SCOP), un sous-ensemble de protéines formées d'amyloïdes, et de la famille prion. Les résultats présentés soulignent les limites de ces peptides en tant que prédicteurs d'amyloïdogénicité, et soulèvent des questions sur le pouvoir prédictif qui peut être récolté de méthodes de prédiction de structure secondaire. Dans une autre approche bio-informatique, la détection de segments de conformation variables dans les structures tertiaires de domaines globulaires PrP a été effectuée utilisant « Principal Component Analysis ». 
Cette technique a réussi à identifier cinq domaines de conformation variables au sein de la protéine PrP, et à classer ces sous-domaines par leur capacité à différencier les PrP fondés sur des réponses structurelles non-locales à la mutation pathogène et la susceptibilité aux maladies prion. Les résultats présentés sont corroborés par des observations antérieures à partir de méthodes expérimentales et de simulations de dynamique moléculaire, ce qui suggère que cette approche sert comme une méthode rapide et fiable pour la détection de segments amyloïdogéniques potentiels dans les protéines formées d'amyloïdes. Finalement, une analyse structurelle, fonctionnelle et évolutive bio-informatique est menée afin d'évaluer la prévalence du premier pli de fibrille amyloïde dans la nature vérifié expérimentalement, et si ce pli peut servir de prototype pour d'autres protéines formées d'amyloïdes. Les résultats indiquent une portée limitée de ce pli dans les protéines formées d'amyloïdes et à travers l'univers des protéines, et ont des répercussions sur l'identification future de protéines formées d'amyloïdes qui partagent ce pli. Collectivement, la thèse présentée compare ces différentes méthodes et discute leur efficacité dans la détection de segments amyloïdogéniques. Biology - Bioinformatics
172	Intron loss and gain in Eukaryotes Coulombe-Huntington, Jasmin January 2008 Although introns were first discovered almost 30 years ago, their evolutionary origin and function remain elusive. In this thesis, I describe a reference-based intron mapping method based on multi-species whole-genome alignments. We applied this method in two distinct studies. First we studied intron loss and gain dynamics in mammals and subsequently in Drosophila. We mapped known human introns onto the mouse, rat and dog genomes, mouse introns onto the human genome and Drosophila melanogaster introns onto 10 other fully sequenced Drosophila genomes. This genome-wide approach allowed us to assess the presence or absence of over 150,000 known human introns across four mammalian species and more than 35,000 D. melanogaster introns across 11 fruit fly species. We inferred 122 intron loss events in mammals and no intron gain events. In flies, we were able to identify 1754 intron loss events and 213 gain events. In both studies we found that lost introns tend to be extremely short and show higher than average similarity between their 5' splice-site sequence and the 3' partner splice-site sequence. We also demonstrate that losses in mammals occur preferentially in highly expressed housekeeping genes, while in Drosophila we show that lost and gained introns are flanked by longer than average exons and display quite distinct phase distributions, and that losses demonstrate significant clustering within genes. Across flies, it appears that introns that have been lost evolve faster than other introns, while they occur in slowly evolving genes. Our results in both studies strongly support the cDNA recombination mechanism of intron loss. 
The results in flies also suggest that selective pressures affect site-specific loss rates and show that intron gain has occurred within the Drosophila lineage, solidifying the “introns-middle” hypothesis and providing some hints about the gain mechanism and origin of introns. / Malgré le fait que les introns furent découverts il y a près de 30 ans, leur origine et leur fonction nous échappent encore. Au cours de cette thèse, je décrirais une méthode qui permet de projeter des introns d'une espèce de référence sur d'autres génomes, basée sur des alignements de génomes complets à plusieurs espèces. Nous avons appliqué cette méthode dans le cadre de deux études distinctes. Premièrement, nous avons étudié les pertes et les gains d'introns chez les mammifères et ensuite chez les Drosophiles. Nous avons projeté les introns humains sur le génome de la souris, du rat et du chien, les introns de la souris sur le génome humain et les introns de la Drosophile melanogaster sur les génomes de 10 autres espèces de Drosophiles complètement séquencées. Cette approche d'ordre génomique nous a permis de comparer la présence ou l'absence de plus de 150,000 introns humains dans quatre espèces de mammifères et plus de 35,000 introns de D. melanogaster dans 11 espèces de drosophiles. Nous avons détecté 122 pertes d'introns chez les mammifères mais aucun gain d'intron. Chez les mouches à fruits, nous avons identifié 1754 pertes d'introns et 213 gains d'introns. Dans les deux études, nous démontrons que les introns perdus sont extrêmement courts et démontrent une similarité relativement élevée entre le site d'épissage au début de l'intron et le site d'épissage à la fin de l'intron. Nous démontrons chez les mammifères les pertes d'introns se produisent de préférence dans des gènes hautement exprimés et de fonctions cruciales à la cellule. 
Chez les drosophiles nous démontrons que les introns perdus ou gagnés sont délimités par des exons plus longs que la moyenne, ont une distribution de phase plutôt distincte et les pertes démontrent une tendance à se retrouver en groupe à l'intérieur des gènes. Chez les mouches à fruits, il semble que les introns perdus évoluent plus rapidement que la moyenne Biology - Bioinformatics
173	Knowledge discovery from gene expression data using neural-genetic models Keedwell, Edward January 2003 No description available. 006 Bioinformatics
174	Denoising amplicon-based metagenomic data Gaspar, John M. 26 July 2014 Reducing the effects of sequencing errors and PCR artifacts has emerged as an essential component in amplicon-based metagenomic studies. Denoising algorithms have been written that can reduce error rates in mock community data, in which the true sequences are known, but they were designed to be used in studies of real communities. To evaluate the outcome of the denoising process, we developed methods that do not rely on a priori knowledge of the correct sequences, and we applied these methods to a real-world dataset. We found that the denoising algorithms had substantial negative side-effects on the sequence data. For example, in the most widely used denoising pipeline, AmpliconNoise, the algorithm that was designed to remove pyrosequencing errors changed the reads in a manner inconsistent with the known spectrum of these errors, until one of the parameters was increased substantially from its default value. With these shortcomings in mind, we developed a novel denoising program, FlowClus. FlowClus uses a systematic approach to filter and denoise reads efficiently. When denoising real datasets, FlowClus provides feedback about the process that can be used as the basis to adjust the parameters of the algorithm to suit the particular dataset. FlowClus produced a lower error rate compared to other denoising algorithms when analyzing a mock community dataset, while retaining significantly more sequence information. Among its other attributes, FlowClus can analyze longer reads being generated from current protocols and irregular flow orders. It has processed a full plate (1.5 million reads) in less than four hours; using its more efficient (but less precise) trie analysis option, this time was further reduced, to less than seven minutes. Biology, Bioinformatics
175	Integration of Cancer-Related Mutations for Pan-Cancer Analysis Wu, Tsung-Jung 13 August 2014 Years of sequence feature curation by UniProtKB/Swiss-Prot, PIR-PSD, NCBI-CDD, RefSeq and other database biocurators have led to a rich repository of information on functional sites of genes and proteins. This information along with variation-related annotation can be used to scan human short sequence reads from next-generation sequencing (NGS) pipelines for presence of non-synonymous single-nucleotide variations (nsSNVs) that affect functional sites. This and similar workflows are becoming more important because thousands of NGS data sets are being made available through projects such as The Cancer Genome Atlas (TCGA), and researchers want to evaluate their biomarkers in genomic data. BioMuta, an integrated sequence feature database, provides a framework for automated and manual curation and integration of cancer-related sequence features so that they can be used in NGS analysis pipelines. Sequence feature information in BioMuta is collected from the Catalogue of Somatic Mutations in Cancer (COSMIC), ClinVar, UniProtKB and through biocuration of information available from publications. Additionally, nsSNVs identified through automated analysis of NGS data from TCGA are also included in the database. Due to the petabytes of data and sequence information present in NGS primary databases, a High-performance Integrated Virtual Environment (HIVE) platform for storing, analyzing, computing and curating NGS data and associated metadata has been developed. Using HIVE, 31,979 nsSNVs were identified in TCGA-derived NGS data from breast cancer patients. All variations identified through this process are stored in a Curated Short Read archive, and the nsSNVs from the tumor samples are included in BioMuta. Currently, BioMuta has 26 cancer types with 13,896 small-scale and 308,986 large-scale study-derived variations. 
Integration of variation data allows identification of novel or common nsSNVs that can be prioritized in validation studies. Biology, Bioinformatics
176	Experimental design and statistical analysis in high throughput screening Murie, Carl Eric January 2014 High throughput screening (HTS) is a biotechnology that allows researchers to detect the small number of active features (e.g. small molecules, small interfering RNAs) among libraries containing up to hundreds of thousands of features. HTS assays, as with all experimental techniques, are prone to both random error resulting from the inherent variability of biological processes or experimental procedures, and systematic error which can be introduced through any number of known or unknown sources. The effect of both types of error can result in truly inactive features being labeled as active (false positives) and truly active features being labeled as inactive (false negatives). The goal of experimental design and statistical analysis is to minimize and estimate the error of an assay, although in the HTS field these methods are not always fully utilized. This thesis presents improvements in the statistical analysis and experimental design of HTS in order to improve the detection of rare biological activity. I first present a comparison of the effectiveness of normalization methods for HTS in two titration series experiments and extend the results in a third experiment with two differently designed but otherwise identical screens: compounds in replicate plates were either placed in the same well locations or were randomly assigned to different locations. Best results were obtained with a combination of appropriate normalization and randomization. Secondly, the Single Assay-wide Variance Experimental (SAVE) design is introduced whereby a small replicated subset of an entire screen is used to derive Empirical Bayes random error estimates which are applied to the remaining majority of unreplicated measurements. SAVE is shown to produce valid and informative P-values comparable to the P-values produced with multi-replicate data. 
Thirdly, the Control Plate Regression (CPR) normalization method, designed for assays such as secondary screens where there may be a majority of active features, is developed and shown to outperform current methodology. Diagnostic techniques are provided that allow researchers to predict the effectiveness and appropriateness of applying CPR. Lastly, the Statistics and dIagnostic Graphs for HTS (SIGHTS) software was developed to implement many of the techniques discussed in this thesis and is designed to be accessible to researchers with no programming experience. Combining graphical assessments, randomization procedures, normalization methods customized to the requirements of the screen, and statistical testing is shown to produce superior results to current HTS analysis techniques. / Le criblage à haut débit (CHD) est une biotechnologie qui permet l'identification d'un petit nombre de caractéristiques biologiques (petites molécules, petits ARN interférents) actifs parmi un très grand nombre de caractéristiques (jusqu'à des centaines de mille). Les expériences CHD, comme dans le cas de toute technique expérimentale, sont enclins autant aux erreurs aléatoires résultants de la variabilité inhérente des processus biologiques ou des procédures expérimentales, qu'aux erreurs systématiques qui peuvent être introduites par une multitude de sources connues ou inconnues. L'effet des deux types d'erreurs peut résulter en une identification comme actif d'activités réellement inactives (faux-positifs) et en des caractéristiques réellement actives identifiées comme étant inactives (faux-négatifs). Le but de la conception expérimentale et de l'analyse statistique est de minimiser et d'estimer l'erreur d'une expérience, bien que ces méthodes ne soient pas entièrement appliquées dans le domaine de la CHD. 
Cette thèse présente une suite de méthodes graphiques qui utilisent la correspondance entre les données et les attentes biologiques ou statistiques afin d'aider à évaluer la qualité de l'expérience et d'aider à choisir des techniques analytiques qui soient les plus appropriées. Une conception expérimentale randomisée (les caractéristiques sont assignées à différentes positions de puits sélectionnés de manière aléatoire au travers des réplicats de plaques) est présenté et comparé à une conception standard (les caractéristiques sont assignées aux mêmes positions de puits au travers des réplicats de plaques) et démontre qu'il est possible de mieux détecter les caractéristiques actives tout en réduisant les effets erronés. Une conception expérimentale est présenté où les valeurs p informatives peuvent être produites pour un essai à réplicat unique en utilisant le test statistique Modèle à Variance Aléatoire (MVA) avec un petit sous-ensemble de données répliquées à partir de l'essai à réplicat unique. Troisièmement, la méthode de normalisation "Control Plate Regression (CPR)" conçu pour des expériences de dépistage secondaire, ou il peut y avoir majorité d'éléments actifs, a été développée et démontre une meilleure performance que les méthode antérieures. Des techniques diagnostiques sont fournis pour permettre aux chercheurs de prédire l'efficacité et la pertinence de l'application de la méthode CPR. L'application combinée des évaluations graphiques d'une expérience, la conception expérimentale randomisé, les techniques de normalisation désignées pour des types de données spécifiques et les tests statistiques sont présentés comme ayant une capacité à produire des résultats de niveau supérieur aux techniques d'analyses CHD courantes. Le progiciel SIGHTS fut développé afin d'implémenter les techniques présentées dans cette thèse afin de rendre ces méthodes accessible aux chercheurs sans expertise en programmation. Biology - Bioinformatics
177	Conservation analysis of potential cis-NATs in Brassicaceae plants for crop improvement Bouchard, Johnathan January 2014 Canola fuels a multi-billion dollar industry in Canada. It is a Canadian trademarked name of specific cultivars derived from specific Brassicaceae plants. Cis-NATs are natural antisense transcripts that overlap a gene and are not translated into proteins. Instead, they silence their parent gene's expression through various mechanisms. Their role in humans is well established, but their role in plants is relatively obscure. The goal of this thesis project is to analyze the conservation of cis-NATs across 8 different Brassicaceae genera (9 species). This is useful for identifying targets for crop improvement in canola. Conservation was studied across the 9 species, then across two subgroups of 4 and 2 species, respectively; cis-NATs simultaneously exhibiting conservation in all three scenarios were selected. A total of 34 potential candidates were identified. The study also suggests that the type of a cis-NAT might affect its conservation. The presented methodology is a powerful pre-screening strategy to direct experimental efforts. It can be used with genes and other transcribed non-coding DNA. / Le canola est à la base d'une industrie canadienne de plusieurs milliards de dollars. En fait, le mot canola est un acronyme canadien incluant certaines plantes dérivées d'espèces de la famille des Brassicaceae. Les cis-NATs sont des molécules d'ARN qui ne sont pas traduites en protéines. Elles réduisent plutôt l'expression des gènes qu'elles superposent à travers différents mécanismes. Leur rôle chez les humains est bien établi, mais ce n'est pas le cas chez les plantes. Le but de cette thèse est d'identifier des cis-NATs qui sont conservés à travers 8 genres différents (9 espèces) de la famille des Brassicaceae. Cela est pratique pour identifier des candidats pouvant être utilisés pour une application agronomique. 
La conservation a été étudiée à travers les 9 espèces, puis à travers deux sous-groupes de 4 et 2 espèces, respectivement. Les cis-NATs qui démontraient une conservation à travers 9, 4, et 2 espèces simultanément ont été sélectionnés. 34 candidats ont été identifiés. Le projet de recherche suggère aussi que le type de cis-NAT peut potentiellement influencer sa conservation. La méthode présentée est une stratégie de recherche préalable et très efficace pour diriger les efforts expérimentaux. Elle peut être aussi utilisée avec des gènes ou n'importe quel autre élément génétique non codant qui est transcrit. Biology - Bioinformatics
178	Investigating non-canonical functions of gamma-tubulin by using genome scale structure-function (GSSF) analysis Nguyen, Thi Thu Thao January 2010 Gamma-tubulin is a conserved component of the microtubule-organizing center (MTOC) and functions in microtubule nucleation in vivo. Recent studies suggest that gamma-tubulin might have additional roles in microtubule organization. For example, the deletion of the DSYL domain at the acidic unstructured C-terminus of Tub4 abrogates the Kar9-dependent pathway for spindle positioning. In vivo, gamma-tubulin is modulated via phosphorylation, and the tyrosine 445 residue was found to be one of the phosphorylation sites of Tub4. In addition, the phospho-mimetic mutation (tub4-Y445D) causes defects in chromosome segregation. We hypothesize that differential phosphorylation of Tyr445 could control the non-canonical functions of Tub4. If this is the case, it is expected that phospho-mimetic and phospho-inhibiting mutants at Tyr445 would yield specific defects that report on the distinctive functions of Tub4. / In order to test this hypothesis, Genome Scale Structure Function (GSSF) analysis has been performed. This method consists of two main steps: first, high-throughput Synthetic Genetic Array (SGA) analysis and second, data clustering using a hierarchical algorithm. SGA is a powerful method to reveal genetic interacting partners of a gene of interest. We have extended the SGA method by using known or predicted separation-of-function query alleles to cross into the deletion collection, which facilitates not only the study of essential genes but also the dissection of different functional modalities of genes. SGA analysis was conducted between a phospho-inhibiting tub4 mutant (tub4-Y445F) and ~4600 deletion mutants. Next, data clustering using a hierarchical algorithm was performed on the gene interaction matrix to identify major pathways that Tub4 is involved in. 
In addition to the tub4 mutant, the GSSF analysis has been performed on conditional alleles from two different essential genes, Glc7 (glc7-E101Q) and Ame1 (ame1-4), and has revealed genetic networks which recover known regulated pathways as well as suggest new pathways in which these two genes are involved. / Here, we present the GSSF analysis of the phospho-inhibiting allele tub4-Y445F. The results revealed previously known and expected pathways of gamma-tubulin including spindle positioning, actin organization and cell cycle checkpoints and, interestingly, suggested a new role of gamma-tubulin in the DNA damage repair machinery. Preliminary data supporting the new role of gamma-tubulin in the DNA damage repair machinery are also presented, including genetic interactions with the MRX complex and HU sensitivity. / Altogether, the data outlined indicate that gamma-tubulin functions in a much more diverse network than would be expected if it were solely a MT nucleation factor. We propose that GSSF analysis on other tub4 separation-of-function mutants, such as the phospho-mimetic mutant tub4-Y445D, will reveal how gamma-tubulin coordinates its multiple regulatory functions in cells. / La γ-tubuline est un composant du centre d'organisation des microtubules (COMT) et intervient dans la nucléation des microtubules in vivo. Des études récentes suggèrent que le rôle de la γ-tubuline pourrait s'étendre au-delà de cette fonction. Ainsi, la délétion du domaine DSYL à l'extrémité C-terminale acide et non structurée de Tub4 abolit la voie de positionnement du fuseau mitotique dépendant de Kar9. In vivo, la γ-tubuline est modulée par phosphorylation et le résidu tyrosine 445 est un site de phosphorylation de Tub4. De plus, une mutation phospho-mimétique (tub4-Y445D) provoque des défauts de ségrégation des chromosomes. Nous posons l'hypothèse que la phosphorylation différentielle de Tyr445 dicte les fonctions non-canoniques de Tub4. 
Par exemple, des mutants phospho-mimétiques ou inhibants la phosphorylation au site Tyr445 produiraient des défauts de diverses fonctions de Tub4. / Pour tester cette hypothèse, nous avons entrepris une étude structure-fonction à l'échelle du génome (GSSF) où une analyse du Synthetic Genetic Array (SGA) est suivie d'un regroupement des données par un algorithme hiérarchique. Le SGA est une technique permettant de révéler des interactions génétiques entre des gènes d'intérêt. Une analyse du SGA a été conduite entre un mutant de tub4 et ~4,600 mutants de délétion. Étant l'un des rares laboratoires à utiliser des mutations conditionnelles dans des analyses de SGA, nous pouvons étudier les gènes essentiels mais aussi disséquer les différentes fonctions des gènes. Dans un second temps, le regroupement des données par un algorithme hiérarchique a été réalisé à partir d'une matrice d'interactions génétiques dans le but d'identifier les principales voies d'action de Tub4. En plus de mutants tub4, une analyse GSSF a été conduite avec des allèles conditionnels des gènes essentiels Glc7 et Ame1, glc7-E101Q and ame1-4. Les réseaux d'interactions génétiques ainsi révélés comportent des voies connues pour être régulée par ces deux génes mais aussi suggèrent de nouvelles connexions. / Nous présentons ici l'analyse GSSF de l'allèle tub4-Y445F, inhibant la phosphorylation. Les résultats confirment le rôle de la γ-tubuline dans le positionnement du fuseau mitotique, l'organisation de l'actine et les points de contrôle du cycle cellulaire. Notre étude suggère que le γ-tubuline joue un rôle dans la machinerie de réparation des dommages à l'ADN. Des résultats préliminaires tels que des interactions génétique avec le complexe MRX et de test de sensibilité à HU sont présentées pour appuyer cette nouvelle fonction. Dans leur ensemble, nos données indiquent que la γ-tubuline a un rôle plus complexe que facteur de nucléation des microtubules. 
Nous pensons que les études GSSF d'autres allèles conditionnels de tub4 tel que tub4-Y445D (phospho-mimétique) permettront de mieux comprendre la coordination de ses multiples fonctions. Biology - Bioinformatics
179	Predicting transcription factor binding sites using phylogenetic footprinting and a probabilistic framework for evolutionary turnover Parmar, Victor January 2010 Identifying genomic locations of transcription-factor binding sites (TFBS), particularly in higher eukaryotic genomes, has been an enormous challenge. Computational methods involving identification of sequence conservation between related genomes have been the most successful, since sites found in such highly conserved regions are more likely to be functional, i.e. to be bound and to regulate protein production. In this thesis, we present such a probabilistic algorithm for predicting TFBSs which also takes evolutionary turnover into account. Our algorithm is validated via simulations, and the results of its application to ChIP-chip data are presented. / L'identification des sites de fixation des facteurs de transcription (TFBS), particulièrement sur les génomes eucaryotiques plus élevés, a été un énorme défi. Les méthodes informatiques comportant l'identification de la conservation de séquence entre les génomes de différentes espèces ont eu beaucoup de succès parce que les sites trouvés dans de telles régions fortement conservées sont probablement fonctionnels (les facteurs de transcription se rajoutent sur le génome à ces sites-là et règlent la production de protéine). Dans cette thèse, nous présentons un algorithme probabiliste pour la prédiction de TFBSs qui prend en considération également le remuement évolutionnaire. Notre algorithme est validé par l'intermédiaire des simulations et les résultats de son application sur des données ChIP-chip sont présentés. Biology - Bioinformatics
180	Bioinformatics approaches to understanding the breast cancer microenvironment Pepin, Francois January 2010 Breast cancer is a complex disease that requires the acquisition of several traits in order to proliferate and spread to nearby and distant tissues. However, many combinations are possible, making it harder to determine their significance. Genome-wide approaches such as gene expression profiling have provided an unbiased and global tool to investigate those traits, allowing investigators both to separate tumors into biologically meaningful categories and to investigate their features in that context. A well-organized effort is required in order to collect and analyze the large number of samples necessary for such analyses. The Bioinformatics Integrated Application Software represents a way to facilitate both the organization of laboratory manipulation and the automation of subsequent analyses. / A large part of the complexity of breast cancer comes from the different types of cells that constitute the microenvironment and participate in diverse ways in tumor progression. Blood vessels play an important role in tumor progression, as additional vessels are necessary to support tumor growth. However, those new vessels are generally immature and often cannot efficiently provide nutrients to the tumor. This thesis shows that there exist two classes of tumor blood vessels that are associated with vessel maturity and differ in their expression of several antiangiogenic drug targets. / Numerous interactions occur between the various components of the tumor microenvironment. Using matched expression profiles of these cell types, it is possible to identify specific processes that involve several cell types, such as Th1 and Th2 immune responses. This first step will open the door to a better mapping of the interactions and signals that occur in breast cancer. 
/ Le cancer du sein est une maladie complexe qui requiert l'accumulation de plusieurs caractéristiques avant de pouvoir se multiplier et envahir les tissus rapprochés et éloignés. Plusieurs combinaisons sont par contre possibles, compliquant la tâche de déterminer leurs importances. Les techniques d'analyse sur tout le génome comme l'expression génique sont des outils globaux et non biaisés pour étudier ces caractéristiques. Elles permettent de séparer les tumeurs en groupes biologiquement significatifs et d'étudier leurs caractéristiques dans ce contexte. Un effort concerté est nécessaire pour collecter et analyser la grande quantité de tumeurs requise. Le "Bioinformatics Integrated Application Software" est un système qui permet d'organiser les manipulations de laboratoire et d'automatiser les analyses ultérieures. / Une large proportion de la complexité du cancer du sein provient des différentes espèces de cellules faisant partie du microenvironnement et participant à la progression de la tumeur. Les vaisseaux sanguins jouent un rôle important dans la progression du cancer car des vaisseaux additionnels sont nécessaires pour supporter la croissance tumorale. Ces vaisseaux sont par contre généralement immatures et ne peuvent souvent pas alimenter efficacement la tumeur. Cette thèse démontre qu'il existe deux catégories de vaisseaux sanguins tumoraux qui sont associées avec la maturité des vaisseaux et diffèrent dans leur expression de gènes cibles de plusieurs médicaments antiangiogenèses. / De nombreuses interactions se produisent entre les différentes composantes du microenvironnement tumoral. L'utilisation de profils d'expressions concordants de différentes espèces cellulaires rend possible l'identification de procédés impliquant plusieurs espèces cellulaires, incluant des réactions immunitaires de types Th1 et Th2. Cette première étape va ouvrir la porte à une meilleure connaissance des échanges de signaux dans le cancer du sein. Biology - Bioinformatics
