11 |
Design, implementation and experimental validation of a network-based model to predict mitotic microtubule regulating proteins
Khan, Faisal Farooq, January 2013
The purpose of this thesis was to study mitosis in Drosophila from a network biology perspective. The primary aim was to develop and test a network-based prediction model that could integrate data available in public databases (such as FlyBase) and use it to predict potential mitotic proteins. The design of the protein interaction network drew on a priori knowledge about the microtubule composition of the mitotic spindle and on the higher likelihood that microtubule-associated proteins (MAPs) have a putative mitotic function. The design also integrated complementary datasets, from gene expression and functional RNAi screens to cross-species conservation of MAPs, to fit a network-based model for predicting mitotic proteins. I begin with the creation of the MAP interactome, based on a MAP dataset in Drosophila. This initial network was extended by transferring homologs and interologs of MAP datasets from four other species: human, mouse, rat and Arabidopsis. These proteins were then used as seed proteins in a virtual pull-down experiment, which added indirect interactors to the network, i.e. proteins that directly bind to two or more MAPs within the network. This step completed the MAP interactome. Data from genome-wide studies in Drosophila were gathered for each node in the MAP interactome. These ‘layers’ of data were then used as features to fit a prediction model that scores each node in the network by the likelihood of its role in mitosis. The final model reached 96% accuracy under 10-fold cross-validation and was used to rank all the proteins in the MAP interactome. Analysis of the 100 highest-scoring predicted mitotic proteins identified a highly connected cluster of 33 proteins, which was subjected to experimental validation in the lab.
The first approach was an in vitro analysis, using an RNAi screen to test for spindle, chromosome or centrosome phenotypes upon gene knockdown. After two independent RNAi screens, around 80% of the proteins produced mutant mitotic phenotypes, strongly supporting the results of the MAP prediction model. The second approach was an in vivo analysis, expressing GFP-fusion constructs of selected genes from the subcluster. These were expressed in early Drosophila embryos to study their subcellular localization during interphase and mitosis. A variety of localizations were observed, ranging from chromatin and microtubules to more generic cytoplasmic localizations. These results suggested that not all predicted proteins co-localize with microtubules; they may therefore not be microtubule-associated proteins but could instead function as microtubule-associated regulator proteins. Proteomics analysis of a subset of these genes showed a large proportion of false positive interactions but also picked up new interactions between member proteins that highlighted a module within the subcluster. The RNAi hits from the in vitro analysis and the members of the module within subcluster-16 from the in vivo analysis are interesting subjects for further characterization.
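The virtual pull-down step described in this abstract (adding proteins that bind two or more seed MAPs) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the threshold parameter and the toy protein names are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def virtual_pull_down(seeds, interactions, min_seed_partners=2):
    """Return non-seed proteins that interact with at least
    `min_seed_partners` distinct seed proteins."""
    seeds = set(seeds)
    seed_partners = defaultdict(set)
    for a, b in interactions:
        # Only seed/non-seed pairs count towards the pull-down.
        if a in seeds and b not in seeds:
            seed_partners[b].add(a)
        elif b in seeds and a not in seeds:
            seed_partners[a].add(b)
    return {
        protein
        for protein, partners in seed_partners.items()
        if len(partners) >= min_seed_partners
    }


# Hypothetical toy data; the protein names are placeholders.
seeds = {"MAP1", "MAP2", "MAP3"}
interactions = [
    ("MAP1", "X"), ("MAP2", "X"),  # X binds two seeds, so it is added
    ("MAP3", "Y"),                 # Y binds one seed, so it is not added
    ("X", "Y"),                    # pair without a seed, ignored
    ("MAP1", "MAP2"),              # seed-seed pair, ignored
]
added = virtual_pull_down(seeds, interactions)
```

Counting distinct seed partners, rather than raw interaction records, keeps a protein from qualifying merely because the same seed interaction is reported twice.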
|
12 |
Design and development of a meta-database for the human protein interaction network
Gioutlakis, Aris (Γιουτλάκης, Άρης), 26 July 2013
The elucidation of the underlying relationship between an organism’s genotype and its expressed phenotype is currently one of the greatest challenges faced by the life sciences and biology in general. To achieve that, a better understanding of the inner structure and regulation mechanisms of protein-protein interaction (PPI) networks is of great importance. The first step towards that goal is the detailed and accurate reconstruction of the PPI network itself.
The scientific literature is constantly being updated with new experimental results supporting PPI evidence, which in turn are fed into primary PPI databases (PPIDBs) either by curators or by text mining algorithms. There is currently a large number of PPIDBs covering human PPIs. Since many of them have different goals, literature curation methods and database administration strategies, it is not surprising that they exhibit limited PPI overlap and incompatible terminology for PPI interactors, i.e. they describe interactors at arbitrary levels of genetic organization.
A number of meta-databases have been developed to provide integrated overviews of PPI networks while circumventing the problems inherent in primary PPI databases. Unfortunately, meta-databases have a number of issues of their own, such as: a) top-down network definition based on protein interactions instead of interactors, b) protein identifier redundancy at all levels of reference, c) the use of ad hoc normalization methods, d) infrequent updating and e) insufficient information stored.
The major goal of this thesis is the design and implementation of PICKLE (Protein Interaction Knowledge Base), a meta-database for the human PPI network created specifically to tackle the aforementioned problems. PICKLE’s novelty stems from its approach to PPI network definition, which follows a bottom-up reconstruction method based on UniProt’s reviewed complete human proteome (RCHP) definition. Five primary PPIDBs (DIP, HPRD, IntAct, MINT and BioGRID) were mined for interactions explicitly constrained by UniProt’s proteome definition. Furthermore, to tackle the issues of redundancy and inadequate normalization, a specific ontology was designed that links the RCHP set with all the other levels of genetic organization while also serving as an agile yet accurate normalization mechanism. To address the issue of updating, an automated means of data collection and integration was developed. PICKLE’s maiden release recorded 83720 direct PPIs involving 12418 UniProt IDs (out of 20225), supported by a total of 27590 publications. PICKLE, an evolving and valuable bioinformatics tool for biomedical research and red biotechnology applications, will soon be updated with a user-friendly interface and upgraded by linking it with network analysis software and various omics datasets.
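The bottom-up idea in this abstract (define the network from a fixed proteome, then admit only interactions whose two interactors belong to it) can be illustrated with a small sketch. It assumes identifiers have already been normalized to UniProt-style accessions, which in PICKLE is the job of the ontology and is not reproduced here. The function name, record layout and identifiers are hypothetical.

```python
def build_network(records, reviewed_proteome):
    """Merge interaction records from several source databases.

    `records` is an iterable of (protein_a, protein_b, publication_id,
    source_db) tuples. An interaction is kept only if both interactors are
    in `reviewed_proteome`; duplicates reported in either order are merged.
    """
    network = {}
    for a, b, publication, source in records:
        if a not in reviewed_proteome or b not in reviewed_proteome:
            continue  # interactor outside the reference proteome
        key = frozenset((a, b))  # unordered pair: (a, b) equals (b, a)
        entry = network.setdefault(
            key, {"publications": set(), "sources": set()}
        )
        entry["publications"].add(publication)
        entry["sources"].add(source)
    return network


# Hypothetical placeholder identifiers and publication labels.
reviewed_proteome = {"P1", "P2", "P3"}
records = [
    ("P1", "P2", "pub1", "DIP"),
    ("P2", "P1", "pub2", "IntAct"),  # same pair, reversed order
    ("P1", "Q9", "pub3", "MINT"),    # Q9 is not in the proteome
]
network = build_network(records, reviewed_proteome)
```

Keying on an unordered pair is one simple way to avoid counting the same interaction twice when two source databases list the interactors in opposite order.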
|
13 |
Systems biological approach to Parkinson's disease
Heil, Katharina Friedlinde, January 2018
Parkinson’s Disease (PD) is the second most common neurodegenerative disease in the Western world. It shows a high degree of genetic and phenotypic complexity, with many implicated factors and various disease manifestations but few clear causal links. Ongoing research has identified a growing number of molecular alterations linked to the disease. Dopaminergic neurons in the substantia nigra, specifically their synapses, are the key affected region in PD. This work therefore focuses on understanding the effects of the disease on the synapse, aiming to identify potential genetic triggers and synaptic PD-associated mechanisms. Currently, one of the main challenges in this area is data quality and accessibility. To study PD, publicly available data were systematically retrieved and analysed. Based on mutations and curated annotations, 418 PD-associated genes were identified. I curated an up-to-date and complete synaptic proteome map containing a total of 6,706 proteins. Region-specific datasets describing the presynapse, postsynapse and synaptosome were also delimited. These datasets were analysed for similarities and differences, including reproducibility and functional interpretation. Protein-Protein Interaction Network (PPIN) analysis was chosen to gain deeper knowledge of the specific effects of PD on the synapse. I therefore generated a customised, filtered, human-specific Protein-Protein Interaction (PPI) dataset containing 211,824 direct interactions from four public databases. Proteomics data and PPI information allowed the construction of PPINs. These were analysed to obtain a set of low-level statistics, including modularity, clustering coefficient and node degree, that describe the network’s topology mathematically. Beyond these low-level statistics, the high-level topology of the PPINs was studied. To identify functional network subgroups, different clustering algorithms were investigated.
In the context of biological networks, the underlying hypothesis is that proteins in a structural community are more likely to share common functions. I therefore attempted to identify PD-enriched communities of synaptic proteins. Once identified, they were compared with each other. Three community clusters were found to contain largely overlapping gene sets. These contain 24 PD-associated genes. Apart from the known disease-associated genes in these communities, a total of 322 genes was identified. Each of the three clusters is enriched for specific biological processes and cellular components, which include neurotransmitter secretion, positive regulation of synapse assembly, pre- and post-synaptic membrane, scaffolding proteins, neuromuscular junction development and complement activation (classical pathway), amongst others. The presented approach combined a curated set of PD-associated genes, filtered PPI information and synaptic proteomes. Various small- and large-scale analytical approaches, including PPIN topology analysis, clustering algorithms and enrichment studies, identified highly PD-affected synaptic proteins and subregions. Specific disease-associated functions confirmed known research insights and allowed me to propose a new list of potential disease-associated genes not previously linked to PD. Owing to its open design, this approach can be used to answer similar research questions about other complex diseases.
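Two of the low-level network statistics named in this abstract, node degree and the clustering coefficient, can be computed on a toy network as sketched below. The local clustering coefficient used here is the standard definition (links among a node's neighbours divided by the number of possible links among them). The edge list is made up for illustration, and the thesis does not state which software or variant it used.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-interactions."""
    adjacency = {}
    for a, b in edges:
        if a == b:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def local_clustering(adjacency, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # no pairs of neighbours to compare
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical toy network: a triangle A-B-C with a pendant node D on C.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
adjacency = build_adjacency(edges)
degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
clustering = {node: local_clustering(adjacency, node) for node in adjacency}
```

In this toy network, C has the highest degree but a lower clustering coefficient than A or B, because its neighbour D is not linked to the others.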
|
14 |
Analyzing and Modeling Large Biological Networks: Inferring Signal Transduction Pathways
Bebek, Gurkan, January 2007
No description available.
|
15 |
Analysis of Meso-scale Structures in Weighted Graphs
Sardana, Divya, January 2017
No description available.
|
16 |
Clustering algorithms and shape factor methods to discriminate among small GTPase phenotypes using DIC image analysis
Papaluca, Arturo, 10 1900
Viewed naively, evolution is a succession of duplication events and gradual mutations in the genome that lead to changes in the functions and interactions of the proteome. The family of Ras-like guanosine triphosphate hydrolases (GTPases) is a good working model for understanding this fundamental phenomenon, because this protein family contains a limited number of members that differ in function and in interactions. Overall, we want to understand how single mutations in GTPases affect cell morphology and how strongly they affect asynchronous populations.
My master's work aims to classify, in a meaningful way, different phenotypes of the yeast Saccharomyces cerevisiae by analysing several morphological criteria in strains expressing mutated and native GTPases. Our approach, based on microscopy and bioinformatic analysis of DIC (differential interference contrast microscopy) images, distinguishes the phenotypes of native cells from those of mutants. This method allowed automated detection and characterization of the mutant phenotypes associated with over-expression of constitutively active GTPases. The constitutively active GTPase mutants Cdc42 Q61L, Rho5 Q91H, Ras1 Q68L and Rsr1 G12V were analysed successfully.
Implementing different clustering algorithms makes it possible to analyse data that combine morphological measurements from native and mutant populations. Our results show that the Fuzzy C-Means algorithm partitions native and mutant cells effectively, with the different cell types classified according to several cell shape factors obtained from the DIC images. This analysis shows that the Cdc42 Q61L, Rho5 Q91H, Ras1 Q68L and Rsr1 G12V mutations induce amorphous, elongated, round and large phenotypes respectively, which are represented by distinct shape factor vectors. These distinctions are observed in different proportions (mutant morphology / native morphology) in the mutant populations.
The development of new automated methods for morphological analysis of native and mutant cells is extremely useful for studying the GTPase family and the specific residues that dictate their functions and interaction network. We can now consider producing GTPase mutants that swap their function by targeting divergent residues. The functional substitution is then detected at the morphological level using our new quantitative strategy. This type of analysis can also be applied to other protein families and contribute significantly to the field of evolutionary biology. / Evolution is a gradual process that gives rise to changes in the form of mutations that are reflected at the protein level. We propose that evolution of new pathways occurs by switching binding partners, hence creating new functions. The different functions encountered in a given family of related proteins have emerged from a common ancestor that has been duplicated and mutated to become implicated in new interactions and to gain new functions. In this study, we use native and constitutively active mutant variants of the Ras-like family of small GTPases as a working model to explore such gene duplications, followed by neo- or sub-functionalization. We chose this family because it is a defined set of proteins with well-known functions that are mediated through multiple protein-protein interactions.
The aim of this master's thesis is to classify budding yeast phenotypes using different approaches, in order to determine statistically at which level of the population these constitutively active mutations are able to affect cell morphology. Working with a subset of the Ras-like small GTPase family, we recently developed an approach to catalogue and classify these proteins based on multiple physical and chemical criteria. Using microscopy and bioinformatics methods, we characterized phenotypes associated with over-expression of the native small GTPases of the budding yeast Saccharomyces cerevisiae, showing that an established classification is not very clear.
We are interested in how point mutations in small GTPases affect cell morphology and how strongly they affect asynchronous populations. We want to establish a method to determine and quantify mutant and wild-type-like phenotypes in these populations using differential interference contrast (DIC) microscopy images only. As the first aim of this study, we hypothesize that clustering algorithms can partition mutant cells from wild-type cells based on cell shape factor measurements. To test this hypothesis, we proposed to implement different clustering algorithms to analyze datasets that combine measurements from wild-type and respective mutant populations.
We created constitutively active forms of these small GTPases and used Cdc42, Rho5, Ras1 and Rsr1 to validate our results. We observed that Cdc42 Q61L, Rho5 Q91H, Ras1 Q68L and Rsr1 G12V mutations induced characteristic amorphous, clumped/elongated, rounded and discrete large phenotypes respectively. This classification allowed us to define a phenotypical classification related to functions. Phenotype classification of the small GTPases has been confirmed using shape factor formulas accompanied with bioinformatics approaches. These approaches which involved different clustering methods allowed an automated quantitative characterization of the phenotypes of up to 7293 mutant cells.
Sequence alignment showed 46.1% identity between Cdc42 and Rho5 and 62.6% between Ras1 and Rsr1, allowing the identification of diverged residues potentially involved in specific functions and protein-protein interactions. Directed mutagenesis and substitution of these sites from one gene to another have been performed at some positions to test for specificity and involvement in morphology changes. In parallel, interactions observed for native and constitutively active mutants of Cdc42 and Rho5 will be assayed with the protein-fragment complementation assay (PCA). This will enable us to determine whether switches in function correlate strongly with switches in binding partners.
We propose to expand this approach to the whole Ras-like small GTPases family and monitor protein-protein interactions and functions at a network scale. This research will confirm whether enrichment or depletion of residues in specific sites induces a switch of function due to switching binding partners. Understanding the mechanism underlying such correlation is important to gain insight in the biological mechanisms underlying the Ras-like small GTPases and other proteins evolution. Such knowledge is of fundamental importance in biomedical and pharmaceutical fields, since Ras-like small GTPases represent important targets for therapeutic interventions and for the evolutionary biology field.
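The abstract's central hypothesis, that a clustering algorithm can separate mutant from wild-type cells using shape factor measurements, can be illustrated with a minimal fuzzy c-means sketch. Everything specific here is an assumption: the abstract does not say which shape factors were used, so circularity (4πA/P²) stands in as one common example, the data are one-dimensional, and the values are invented.

```python
import math


def circularity(area, perimeter):
    """Shape factor 4*pi*A / P**2; equals 1.0 for a perfect circle."""
    return 4.0 * math.pi * area / perimeter ** 2


def memberships_for(points, centers, m):
    """Fuzzy membership of each point in each cluster (each row sums to 1)."""
    rows = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if min(dists) == 0.0:
            # Point sits exactly on a center: assign it fully to that center.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            rows.append([v / sum(inv) for v in inv])
    return rows


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Minimal 1-D fuzzy c-means with a fixed number of iterations."""
    centers = list(centers)
    for _ in range(iterations):
        u = memberships_for(points, centers, m)
        # Each center becomes the membership-weighted mean of the points.
        centers = [
            sum(row[j] ** m * x for row, x in zip(u, points))
            / sum(row[j] ** m for row in u)
            for j in range(len(centers))
        ]
    return centers, memberships_for(points, centers, m)


# Invented circularity values: three rounder cells, three elongated cells.
points = [0.88, 0.90, 0.92, 0.48, 0.50, 0.52]
centers, memberships = fuzzy_c_means(points, centers=[0.8, 0.6])
labels = [max(range(len(row)), key=row.__getitem__) for row in memberships]
```

Unlike hard k-means, each cell keeps a graded membership in every cluster, which may suit populations where only a fraction of mutant cells show the mutant morphology.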
|
17 |
Des protéines et de leurs interactions aux principes évolutifs des systèmes biologiques / From proteins and their interactions to evolutionary principles of biological systems
Carvunis, Anne-Ruxandra, 26 January 2011
Darwin revealed to the world that living species continuously evolve. Yet the molecular mechanisms of evolution remain under intense research. Systems biology proposes that dynamic molecular networks underlie relationships between genotype, environment and phenotype, but the organization of these networks is mysterious.
Combining established concepts from evolutionary and systems biology with protein interaction mapping and the study of genome annotation methodologies, I have developed new bioinformatics approaches that partially unveiled the composition and organization of cellular systems for three eukaryotic organisms: the baker’s yeast, the nematode Caenorhabditis elegans and the plant Arabidopsis thaliana. My analyses led to insights into the evolution of biological systems. First, I propose that the translation of peptides from intergenic regions could lead to de novo birth of new protein-coding genes. Second, I show that the evolution of proteins originating from gene duplications and of their physical interaction repertoires are tightly interrelated. Lastly, I uncover signatures of the ancestral host-pathogen co-evolution in the topology of a host protein interaction network. My PhD work supports the thesis that molecular systems also evolve in a Darwinian fashion.
|
18 |
Clustering algorithms and shape factor methods to discriminate among small GTPase phenotypes using DIC image analysis
Papaluca, Arturo, 10 1900
No description available.
|
19 |
Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis
Royer, Loic, 12 December 2017
Molecular biology has entered an era of systematic and automated experimentation. High-throughput techniques have moved biology from small-scale experiments focused on specific genes and proteins to genome- and proteome-wide screens. One result of this endeavor is the compilation of complex networks of interacting proteins. Molecular biologists hope to understand life's complex molecular machines by studying these networks. This thesis addresses three open problems centered upon their analysis and quality assessment.
First, we introduce power graph analysis as a novel approach to the representation and visualization of biological networks. Power graphs are a graph-theoretic approach to lossless and compact representation of complex networks. Power graph analysis groups edges into cliques and bicliques, and nodes into a neighborhood hierarchy. We demonstrate power graph analysis on five examples and show its advantages over traditional network representations. Moreover, we evaluate the algorithm's performance on a benchmark, test its robustness to noise, and measure its empirical time complexity at O(e^1.71), which is sub-quadratic in the number of edges e.
Second, we tackle the difficult and controversial problem of data quality in protein interaction networks. We propose a novel measure for the accuracy and completeness of genome-wide protein interaction networks based on network compressibility. We validate this new measure by i) verifying the detrimental effect of false positives and false negatives, ii) showing that gold standard networks are highly compressible, iii) showing that authors' choice of confidence thresholds is consistent with high network compressibility, iv) presenting evidence that compressibility is correlated with co-expression, co-localization and shared function, and v) showing that complete and accurate networks of complex systems in other domains exhibit levels of compressibility similar to those of current high-quality interactomes.
Third, we apply power graph analysis to networks derived from text mining as well as to gene expression microarray data. In particular, we present i) the network-based analysis of genome-wide expression profiles of the neuroectodermal conversion of mesenchymal stem cells, ii) the analysis of regulatory modules in a rare mitochondrial cytopathy, Mitochondrial Encephalomyopathy, Lactic acidosis, and Stroke-like episodes (MELAS), and iii) an investigation of the biochemical causes behind the enhanced biocompatibility of tantalum compared with titanium.
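The compression idea behind power graphs, as described in this abstract, can be illustrated on a toy biclique: when every node in one group is linked to every node in another, all of those edges can be drawn as a single edge between two node groups. The sketch below only checks that condition and reports the fraction of edges saved. The function names and the "edge reduction" ratio are assumptions made for illustration and may not match the compressibility measure defined in the thesis.

```python
def is_biclique(edges, left, right):
    """True if every node in `left` is linked to every node in `right`."""
    edge_set = {frozenset(edge) for edge in edges}
    return all(frozenset((u, v)) in edge_set for u in left for v in right)


def edge_reduction(n_edges, n_power_edges):
    """Fraction of edges saved by the compact representation."""
    return (n_edges - n_power_edges) / n_edges


# Hypothetical toy network: three proteins each bind the same two partners.
left = ["a", "b", "c"]
right = ["x", "y"]
edges = [(u, v) for u in left for v in right]  # 3 * 2 = 6 edges

# If the groups form a biclique, one power edge replaces all six edges.
n_power_edges = 1 if is_biclique(edges, left, right) else len(edges)
reduction = edge_reduction(len(edges), n_power_edges)
```

A network with a missing edge between the two groups would fail the biclique check, which hints at why spurious or missing interactions might lower compressibility.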
|
20 |
TARGETED AND UNTARGETED OMICS FOR DISEASE BIOMARKERS USING LC-MS
Gorityala, Shashank, January 2018
No description available.
|