Global ETD Search

1	Développement de méthodes et d'algorithmes pour la caractérisation et l'annotation des transcriptomes avec les séquenceurs haut débit. / Development of methods and tools for the characterization and annotation of the transcriptomes with Next-Generation Sequencing technologies. Philippe, Nicolas 29 September 2011
Since their introduction, high-throughput sequencers have revolutionized transcriptomic studies at genome scale. They can generate millions, or even billions, of short sequences called reads. New transcriptomic approaches, such as Digital Gene Expression (DGE) and RNA-Sequencing (RNA-Seq), enable the identification, quantification, and even reconstruction of all the transcripts of a cell, including rare ones. Among these transcripts are regulatory non-coding RNAs; alternative splice variants, which code for novel proteins; and non-colinear transcripts termed chimeras (generated by either gene fusion or trans-splicing). Characterizing these transcripts is both an algorithmic and a biological challenge because of their differences in nature, their diverse roles in physiological and pathological cellular processes, and, for some, their involvement in cancer development.
In this work, we focus on algorithms and methods for the characterization and annotation of transcriptomes. First, we present a statistical study of DGE to assess the impact of sequencing errors on read analysis. From this analysis, we developed an annotation pipeline for DGE. Through this initial work, we demonstrated that a lot of information is shared between reads. This property led us to design the Gk arrays, an indexing data structure that organizes huge numbers of reads in memory, together with algorithms to query this structure quickly. Finally, building on the Gk arrays, we developed CRAC, a software tool specialized in RNA-Seq processing. By integrating its own mapping step, CRAC can distinguish biological phenomena from sequencing errors. In particular, it identifies chimeric RNAs, which are often weakly expressed in a transcriptome and are inherently difficult to detect because their fragments originate from different locations on the genome.
Transcriptome, Genome, High-throughput sequencer, RNA-Sequencing, Bioinformatics, Cancer, Transcriptomic, Genomic, Next Generation Sequencer
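The abstract above describes indexing a large collection of reads so that it can be queried quickly. As a rough illustration of that kind of query, here is a minimal Python sketch that indexes every k-mer of every read in a plain dictionary. It is a simplified stand-in, not the Gk arrays structure itself, whose layout the abstract does not describe; the function names and the toy reads are assumptions made for this example.

```python
from collections import defaultdict


def build_kmer_index(reads, k):
    """Map every k-mer to the (read_id, offset) positions where it occurs."""
    index = defaultdict(list)
    for read_id, read in enumerate(reads):
        for offset in range(len(read) - k + 1):
            index[read[offset:offset + k]].append((read_id, offset))
    return index


def reads_containing(index, kmer):
    """Return the sorted ids of the reads that contain the k-mer at least once."""
    return sorted({read_id for read_id, _ in index.get(kmer, [])})


def occurrence_count(index, kmer):
    """Return the total number of occurrences of the k-mer across all reads."""
    return len(index.get(kmer, []))


# Toy reads, invented for illustration only.
reads = ["ACGTAC", "GTACGT", "TTTTTT"]
index = build_kmer_index(reads, k=4)

print(reads_containing(index, "GTAC"))
print(occurrence_count(index, "TTTT"))
```

A dictionary of position lists is easy to read but memory-hungry, so a structure meant for millions of reads would need a far more compact layout than this sketch uses.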
2	Genetics of Glioma : Transcriptome and MiRNome Based Approches Soumya, A M January 2013 Glioma, a tumor of glial cells, is one of the common types of primary central nervous system (CNS) neoplasms. Astrocytoma is the most common of all gliomas and originates from astrocytic glial cells. Astrocytomas fall into two main categories: benign tumors, comprising grade I pilocytic astrocytoma, and malignant tumors, which diffusely infiltrate the brain parenchyma. Diffusely infiltrating astrocytomas are graded, in order of increasing malignancy, as diffuse astrocytoma (DA; grade II), anaplastic astrocytoma (AA; grade III) and glioblastoma (GBM; grade IV). Patients with grade II astrocytoma have a median survival time of 6 to 8 years after surgical intervention. Although the more aggressive grade III (AA) and grade IV (GBM) tumors are together called malignant astrocytomas, the treatment protocols and length of survival differ distinctly between these grades. The median survival time for grade III patients is 2 to 3 years, whereas patients with grade IV have a median survival of 12 to 15 months. GBMs have been further divided into primary GBM and secondary GBM on the basis of clinical and histopathological criteria. Primary GBM presents in an acute de novo manner, with no evidence of an antecedent lower-grade tumor, and accounts for more than 90% of all GBMs. In contrast, secondary GBM results from the progressive malignant transformation of a grade II or grade III astrocytoma. The current WHO grading system of astrocytomas is based on the histopathological characteristics of the underlying tumor tissue. Diagnoses by pathologists depend on specific histologic features (increased mitosis, nuclear atypia, microvascular proliferation and/or necrosis) that are associated with biologically aggressive behaviour (WHO 2007). 
Though grading based on histology is largely reproducible and well accepted, the subjectivity involved and substantial disagreement between pathologists have remained a major concern. Because of inherent sampling problems (mainly due to tumor location in the brain) and the inadequate sample size available for histological evaluation, there is a very high possibility of error in grading. Recent studies have attempted to characterize the molecular basis for the histological and prognostic differences between grade III and grade IV astrocytoma. While reports have shown grade-specific gene expression profiles, there is no molecular signature that can accurately classify grade III and grade IV astrocytoma samples. In the current work, we have identified molecular signatures for the accurate classification of grade III and grade IV astrocytoma patients by using transcriptome and miRNome data. The receptor tyrosine kinase (RTK) pathway is known to be deregulated in 88% of glioblastoma patients. The expression and activation of the receptors is reported to be deregulated by events such as amplification and activating mutations. The aberrant expression of RTKs could also be due to the deregulation of miRNAs, which, in untransformed astrocytes, regulate and fine-tune the levels of the RTKs. In the current study, we have identified that the tumor suppressor miRNA miR-219-5p regulates the RTK pathway by targeting EGFR and PDGFRα.
Part I. Transcriptome approach: Identification of a 16-gene signature for classification of malignant astrocytomas
In order to obtain a more robust molecular classifier to accurately classify grade III and grade IV astrocytoma samples, we used transcriptome data from a microarray study previously performed in our laboratory. The differential regulation of 175 genes identified from the microarray was validated in a cohort of grade III and grade IV patients by real-time qRT-PCR. 
In order to identify a signature that can classify grade III and grade IV astrocytoma samples, we used the expression data of the 175 genes to perform Prediction Analysis of Microarrays (PAM) in the training set of grade III and grade IV astrocytoma samples. PAM analysis identified the most discriminatory 16-gene expression signature for the classification of grade III and grade IV astrocytoma. Principal Component Analysis (PCA) of the 16-gene expression data from astrocytoma patient samples revealed that the expression of the 16 genes separates grade III and grade IV astrocytoma samples into two distinct clusters. In the training set, the 16-gene signature classified grade III and grade IV patients with an accuracy of 87.9%, as tested by additional analysis of cross-validated probability by PAM. The 16-gene signature obtained in the training set was validated in the test set with a diagnostic accuracy of 89%. We further validated the 16-gene signature in three independent cohorts of patient samples from publicly available databases (the GSE1993, GSE4422 and TCGA datasets), where the classification signature was validated with accuracy rates of 88%, 92% and 99% respectively. To address the discordance in grading between the 16-gene signature and histopathology, we examined, in the discordant grade III and grade IV samples, the clinical features (age and survival) and molecular markers (CDKN2A loss, EGFR amplification and p53 mutation) that differ substantially between grade III and grade IV. The grading assigned by the 16-gene signature correlated with the known clinical and molecular markers that distinguish grade III from grade IV, demonstrating the utility of the 16-gene signature in the molecular classification of grade III and grade IV. In order to identify the pathways that the 16 genes of the classification signature could regulate, we performed protein-protein interaction network analysis and subsequently pathway analysis. 
The pathways with the highest significance were the ECM (extracellular matrix) and focal adhesion pathways, which are known to be involved in the epithelial to mesenchymal transition (EMT), correlating well with the aggressive infiltration of grade IV tumors. In addition to accurately classifying the grade III and grade IV samples, the 16-gene signature also demonstrated that genes involved in epithelial-mesenchymal transition play a key role in distinguishing grade III and grade IV astrocytoma samples.
Part II. miRNome approach
MicroRNAs (miRNAs) have emerged as one of the important regulators of the interaction network that controls various cellular processes. miRNAs are short non-coding RNAs (the mature RNA being 21-22 nt long) that regulate the target mRNA by binding mostly in the 3’ UTR, bringing about either translational repression or degradation of the target. miRNAs have been shown to play key roles in cell survival, proliferation, apoptosis, migration, invasion and various other characteristic features that are altered in human cancers. miRNAs have been characterized as having oncogenic or tumor suppressor roles, and the aberrant expression of miRNAs is reported in multiple human cancer types.
Part A. Genome-wide expression profiling identifies deregulated miRNAs in malignant astrocytoma
With the aim of identifying the role of miRNAs in the development of malignant astrocytoma, we performed large-scale, genome-wide miRNA (n=756) expression profiling of 26 grade IV astrocytoma, 13 grade III astrocytoma and 7 normal brain samples. Using Significance Analysis of Microarrays (SAM), we identified several differentially regulated miRNAs between control normal brain and malignant astrocytoma; between grade III and grade IV astrocytoma; between grade III astrocytoma and grade IV secondary GBM; between the progressive pathway and the de novo pathway of GBM development; and between primary and secondary GBM. 
Importantly, using PAM, we identified the most discriminatory 23-miRNA expression signature, which distinguished grade III from grade IV astrocytoma samples with an accuracy of 90%. We re-evaluated the grading of the discordant samples by histopathology and found that one of the discordant grade III samples had areas of necrosis, so it was reclassified as grade IV GBM. Similarly, of the two discordant grade IV samples, one had an oligo component and was reclassified as grade III mixed oligoastrocytoma. Thus, after the revised grading, the prediction accuracy increased from 90% to 95%. The differential expression pattern of nine miRNAs was further validated by real-time RT-PCR in an independent set of malignant astrocytomas (n=72) and normal samples (n=7). Inhibition of two glioblastoma-upregulated miRNAs (miR-21 and miR-23a) and exogenous overexpression of two glioblastoma-downregulated miRNAs (miR-218 and miR-219-5p) resulted in reduced soft agar colony formation but showed varying effects on cell proliferation and chemosensitivity. Thus, we have identified the grade-specific expression of miRNAs in malignant astrocytoma and a miRNA expression signature that distinguishes grade III astrocytoma from grade IV glioblastoma. In addition, we have demonstrated the functional relevance of miRNA modulation, and thus the involvement and importance of miRNAs in astrocytoma development.
Part B. miR-219-5p inhibits the receptor tyrosine kinase pathway by targeting mitogenic receptor kinases in glioblastoma
The receptor tyrosine kinase (RTK) pathway, one of the important growth-promoting pathways, is known to be deregulated in 88% of patients with glioblastoma. In order to understand the role of miRNAs in regulating the RTK pathway, we undertook a screening procedure to identify the potential miRNAs that could target different members of the RTK pathway. 
From the screening study, which involved bioinformatic prediction of miRNAs and subsequent experimental validation by modulation of miRNA levels in glioma cell lines, we identified miR-219-5p as a candidate miRNA. The overexpression of miR-219-5p reduced the protein levels of both EGFR and PDGFRα. We confirmed the binding of miR-219-5p to the 3’ UTRs by using reporter plasmids. We also confirmed the specificity of the miR-219-5p binding sites in the 3’ UTR of EGFR by site-directed mutagenesis of the binding sites, which abrogated the miRNA-UTR interaction. The expression of miR-219-5p was significantly downregulated in grade III as well as in grade IV astrocytoma samples in the miRNA microarray experiment, and we further validated the downregulation in an independent cohort of grade III and grade IV astrocytoma patients by real-time qRT-PCR. The ectopic overexpression of miR-219-5p in glioma cell lines inhibited cell proliferation, colony formation, anchorage-independent growth and the migration of glioma cells. In addition, overexpression of miR-219-5p decreased the activity of the MAPK and PI3K pathways, in concordance with its ability to target EGFR and PDGFRα. To further characterize the miR-219-5p – EGFR interaction and its effect on the MAPK and PI3K pathways, we used U87 glioma cells that stably overexpress wild-type EGFR or constitutively active ΔEGFR (both lacking the 3’ UTR and thus insensitive to miR-219-5p overexpression), along with U87 parental cells. In the cell lines overexpressing EGFR lacking the 3’ UTR, miR-219-5p was unable to inhibit the MAPK and PI3K pathways or glioma cell migration, suggesting that these effects were indeed due to its ability to target EGFR. 
Further, in the glioblastoma patient cohort (TCGA dataset), we found a significant negative correlation between EGFR protein levels (both total EGFR and phospho-EGFR) and miR-219-5p levels in the glioblastoma tissue samples, suggesting that reduced miR-219-5p levels contribute to the increased protein levels of EGFR in glioblastoma. In summary, we have identified and characterized miR-219-5p as an RTK-regulating tumor suppressor miRNA in glioblastoma.
Gliomas Glioma Malignant Astrocytoma Glioblastoma miRNome Approach miRNA Receptor Tyrosine Kinase (RTK) Pathway MiRNome-MicroRNAs(MiRNAs) - Glioma MiRNAs Transcriptome, Glioma EMT Pathway miRNome Approach Molecular Biology
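The abstract above uses Prediction Analysis of Microarrays (PAM) to pick a small discriminatory gene set and classify samples. PAM is based on nearest shrunken centroids, and the following Python sketch shows that idea in simplified form: class centroids are pulled toward the overall centroid by a threshold, genes whose centroids collapse onto the overall mean drop out, and a sample is assigned to the nearest remaining centroid. The function names, the threshold value and the toy expression values are assumptions made for this example, not data or code from the thesis, and class priors are ignored for brevity.

```python
import math
from statistics import median


def fit_shrunken_centroids(X, y, delta):
    """Fit shrunken class centroids.

    X: list of samples (each a list of gene expression values).
    y: class label of each sample.
    delta: shrinkage threshold (hypothetical value chosen by the caller).
    Assumes more samples than classes and non-zero within-class variance.
    """
    n, p = len(X), len(X[0])
    classes = sorted(set(y))
    overall = [sum(row[j] for row in X) / n for j in range(p)]
    members = {c: [row for row, label in zip(X, y) if label == c] for c in classes}
    centroids = {
        c: [sum(r[j] for r in rows) / len(rows) for j in range(p)]
        for c, rows in members.items()
    }
    # Pooled within-class standard deviation for each gene.
    s = []
    for j in range(p):
        ss = sum((row[j] - centroids[label][j]) ** 2 for row, label in zip(X, y))
        s.append(math.sqrt(ss / (n - len(classes))))
    s0 = median(s)  # offset that keeps low-variance genes from dominating
    shrunken = {}
    for c in classes:
        m = math.sqrt(1.0 / len(members[c]) - 1.0 / n)
        row = []
        for j in range(p):
            scale = m * (s[j] + s0)
            d = (centroids[c][j] - overall[j]) / scale
            # Soft-threshold the standardized difference toward zero.
            d_shrunk = math.copysign(max(abs(d) - delta, 0.0), d)
            row.append(overall[j] + scale * d_shrunk)
        shrunken[c] = row
    return {"overall": overall, "centroids": shrunken, "scale": [sj + s0 for sj in s]}


def selected_genes(model):
    """Indices of genes whose shrunken centroid still differs from the overall mean."""
    overall = model["overall"]
    return [
        j for j in range(len(overall))
        if any(cent[j] != overall[j] for cent in model["centroids"].values())
    ]


def predict(model, x):
    """Assign x to the class with the nearest shrunken centroid (equal priors)."""
    def score(c):
        cent = model["centroids"][c]
        return sum(((x[j] - cent[j]) / model["scale"][j]) ** 2 for j in range(len(x)))
    return min(model["centroids"], key=score)


# Toy data, invented for illustration: gene 0 separates the grades, gene 1 does not.
X = [[1.0, 5.0], [1.2, 5.2], [0.8, 4.9], [3.0, 5.1], [3.2, 4.8], [2.8, 5.0]]
y = ["III", "III", "III", "IV", "IV", "IV"]
model = fit_shrunken_centroids(X, y, delta=1.0)

print(selected_genes(model))
print(predict(model, [1.1, 5.0]), predict(model, [2.9, 5.0]))
```

In practice the threshold is chosen by cross-validation, which is consistent with the cross-validated accuracy the abstract reports for the training set.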
3	Développement de méthodes et d'algorithmes pour la caractérisation et l'annotation des transcriptomes avec les séquenceurs haut débit Philippe, Nicolas 29 September 2011
Since their emergence, high-throughput sequencers have revolutionized the study of transcriptomes at genome scale. They make it possible to generate millions, or even billions, of sequences called reads. New transcriptomic approaches, such as Digital Gene Expression (DGE) and RNA-Sequencing (RNA-Seq), now make it possible to catalogue, quantify, and even reconstruct all the transcripts of a cell, including the rarest ones. These transcripts include regulatory non-coding RNAs; splice variants that give rise to proteins; and chimeras (produced by gene fusion or trans-splicing). Characterizing all of these transcripts is a real algorithmic challenge, and also a biological one, because some of them may be involved in many physiological and pathological cellular processes and are frequently described in cancers.
In this work, we propose algorithms and methods for the characterization and annotation of transcriptomes. First, we present a statistical study of DGE to assess the impact of sequencing errors on read analysis. Building on this analysis, we developed an annotation pipeline for DGE. This first piece of work showed that a great deal of information is shared between reads. That led us to design the Gk arrays indexing structure, which organizes a massive quantity of reads so that the structure can be queried quickly. Finally, building on the Gk arrays, we developed CRAC, a software tool specialized in RNA-Seq processing. By integrating its own mapping phase, CRAC can distinguish biological phenomena from sequencing errors. In particular, it enables the identification of chimeras, which are often very weakly expressed in a transcriptome and are inherently complex to detect because their parts are located at different places on the genome.
[SDV:CAN] Life Sciences/Cancer [SDV:BIO] Life Sciences/Biotechnology
Transcriptome, Genome, High-throughput sequencer, RNA-Sequencing, Bioinformatics, Cancer
