  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
301

Protein-protein interactions and metabolic pathways reconstruction of <i>Caenorhabditis elegans</i>

Akhavan Mahdavi, Mahmood 08 June 2007
Metabolic networks are the collections of all cellular activities taking place in a living cell and all the relationships among biological elements of the cell, including genes, proteins, enzymes, metabolites, and reactions. They provide a better understanding of the cellular mechanisms and phenotypic characteristics of the studied organism. To reconstruct a metabolic network, the interactions among genes and their molecular attributes, along with their functions, must be known. Using this information, proteins are distributed among pathways as sub-networks of a greater metabolic network. Proteins that carry out the various steps of a biological process operate in the same pathway.

The metabolic network of <i>Caenorhabditis elegans</i> was reconstructed based on current genomic information obtained from the KEGG database and commonly found in SWISS-PROT and WormBase. Assuming that proteins operating in a pathway are interacting proteins, a currently available protein-protein interaction map of the studied organism was assembled. This map contains all known protein-protein interactions collected from various sources up to that time. The topology of the reconstructed network was briefly studied, and the role of key enzymes in the interconnectivity of the network was analysed. The analysis showed that the shortest metabolic paths represent the most probable routes taken by the organism when endogenous sources of nutrients are available to it. Nonetheless, there are alternate paths that allow the organism to survive under extraneous variations.

Signature content information of proteins was used to reveal protein interactions, on the notion that when two proteins share one or more signatures in their primary structures, they are more likely to interact. The signature content of proteins was used to measure the extent of similarity between pairs of proteins based on a binary similarity score. Pairs of proteins with a binary similarity score greater than a threshold corresponding to a 95% confidence level were predicted as interacting proteins. The reliability of the predicted pairs was statistically analyzed. The sensitivity and specificity analysis showed that the proposed approach outperformed the maximum likelihood estimation (MLE) approach, with a 22% increase in the area under the receiver operating characteristic (ROC) curve when both were applied to the same datasets. When proteins containing one and two known signatures were removed from the protein dataset, the area under the curve (AUC) increased from 0.549 to 0.584 and 0.655, respectively. The increase in AUC indicates that proteins with one or two known signatures do not provide sufficient information to predict robust protein-protein interactions. Moreover, it demonstrates that when proteins with more known signatures are used in signature profiling methods, the overlap with experimental findings increases, resulting in a higher true positive rate and eventually a greater AUC.

Despite the accuracy of the protein-protein interaction prediction methods proposed here and elsewhere, they often predict true positive interactions along with numerous false positive interactions. A global algorithm was therefore also proposed to reduce the number of false positive predicted interacting pairs. This algorithm relies on gene ontology (GO) annotations of the proteins involved in predicted interactions. A dataset of experimentally confirmed protein pair interactions and their GO annotations was used as a training set to train keywords that were able to recover both their source interactions (training set) and predicted interactions in other datasets (test sets). These keywords, along with the cellular component annotation of proteins, were employed to set a pair of rules to be satisfied by any predicted pair of interacting proteins. When this algorithm was applied to four predicted datasets obtained using phylogenetic profiles, gene expression patterns, chance co-occurrence distribution coefficient, and maximum likelihood estimation for <i>S. cerevisiae</i> and <i>C. elegans</i>, the true positive fractions of the datasets improved 2-fold to 10-fold, depending on the computational method used to create the dataset and the information available on the organism of interest.

The predicted protein-protein interactions were incorporated into the previously reconstructed metabolic network of <i>C. elegans</i>, resulting in 1024 new interactions among 94 metabolic pathways. In each of the 1024 new interactions, one unknown protein was interacting with a known partner found in the reconstructed metabolic network. Unknown proteins were characterized based on the involvement of their known partners. Based on the binary similarity scores, the function of an uncharacterized protein in an interacting pair was defined according to its known counterpart, whose function was already specified. Incorporating the new predicted interactions produced an expanded version of the metabolic network, with a 27% increase in the number of known proteins involved in metabolism. The connectivity of proteins in the protein-protein interaction map changed from 42 to 34 due to the increase in the number of characterized proteins in the network.
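The signature-sharing idea in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the thesis's method: the abstract does not define its binary similarity score, so the Jaccard coefficient is used here as an assumed stand-in, and the protein names, signature identifiers, and the fixed threshold of 0.5 are all hypothetical (the thesis derives its threshold from a 95% confidence level).

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient of two signature sets (an assumed similarity score)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_pairs(signatures: dict, threshold: float) -> list:
    """Return (protein, protein, score) for pairs scoring above the threshold."""
    pairs = []
    for p, q in combinations(sorted(signatures), 2):
        score = jaccard(signatures[p], signatures[q])
        if score > threshold:
            pairs.append((p, q, score))
    return pairs


# Hypothetical proteins and signature identifiers, for illustration only.
signatures = {
    "protA": {"sig1", "sig2", "sig3"},
    "protB": {"sig2", "sig3"},
    "protC": {"sig9"},
}

# Hypothetical cut-off; not the 95% confidence threshold used in the thesis.
predicted = predict_pairs(signatures, threshold=0.5)
```

Here only protA and protB share signatures, so they form the single predicted pair; proteins with a single signature, such as protC, contribute little information, which is consistent with the abstract's observation about sparsely annotated proteins.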
302

Identification of protein-protein interactions in the type two secretion system of <i>Aeromonas hydrophila</i>

Zhong, Su 09 March 2009
The type II secretion system is used by many pathogenic and non-pathogenic bacteria for the extracellular secretion of enzymes and toxins. <i>Aeromonas hydrophila</i> is a Gram-negative pathogen that secretes proteins via the type II secretion system.

In the studies described here, a series of yeast two-hybrid assays was performed to identify protein-protein interactions in the type II secretion system of <i>A. hydrophila</i>. The periplasmic domains of ExeA and ExeB were assayed for interactions with the periplasmic domains of Exe A, B, C, D, K, L, M, and N. Interactions were observed for both ExeA and ExeB with the secretin ExeD in one orientation. In addition, a previously identified interaction between ExeC and ExeD was observed. To further examine and map these interactions, a series of eight two-codon insertion mutations in the amino-terminal domain of ExeD was screened against the periplasmic domains of ExeA and ExeB. As a result, the interactions were verified and mapped to subdomains of the ExeD periplasmic domain. To positively identify the region of ExeD involved in the interactions with ExeA, B, C and D, deletion mutants of ExeD were constructed based on the two-codon insertion mutation mapping of subdomains of the ExeD periplasmic domain, and yeast two-hybrid assays were carried out. The results showed that a fragment of the periplasmic domain of ExeD, spanning amino acid residues 26 to 200, was involved in the interactions with ExeA, B and C. As an independent assay for interactions between ExeAB and the secretin, His-tagged derivatives of the periplasmic domains of ExeA and ExeB were constructed, and co-purification on Ni-NTA agarose columns was used to test for interactions with untagged ExeD. These experiments confirmed the interaction between ExeA and ExeD, although there was background in the co-purification test.

These results provide support for the hypothesis that the ExeAB complex functions to organize the assembly of the secretin through interactions between both peptidoglycan and the secretin that result in its multimerization into the peptidoglycan and outer membrane layers of the envelope.
303

Analysis and Redesign of Protein-Protein Interactions: A Hotspot-Centric View

Layton, Curtis James January 2010
One of the most significant discoveries from mutational analysis of protein interfaces is that often a large percentage of interface residues negligibly perturb the binding energy upon mutation, while residues in a few critical "hotspots" drastically reduce affinity when mutated. The organization of protein interfaces into hotspots has a number of important implications. For example, small interfaces can have high affinity, and when multiple binding partners are generated to the same protein, they are predisposed to binding the same regions and often have the same hotspots. Even small molecules that bind to interfaces and disrupt protein-protein interactions (PPIs) tend to bind at hotspots. This suggests that some hotspot-forming sites on protein surfaces are <i>intrinsically</i> more apt to form protein interfaces. These observations paint a hotspot-centric picture of PPI energetics and present a question of fundamental importance that remains largely unanswered: <i>why are hotspots hot?</i>

To gain insight into the nature of hotspots, I experimentally examined the small but high-affinity interface between the synthetically evolved ankyrin repeat protein Off7 and <i>E. coli</i> maltose binding protein (MBP) by characterizing mutant variants and redesigned interfaces. To characterize many mutants, I developed two high-throughput assays for measuring protein-protein binding that integrate with existing technology for the high-throughput fabrication of genes. The first is an ELISA-based method that uses in vitro expressed protein for semi-quantitative analysis of affinity. Starting from DNA encoding the protein partners, binding data are obtained in just a few hours; no exogenous purification is required. For the second assay, I developed data-fitting methods and a thermodynamic framework for determining binding free energies from binding-induced shifts in protein thermal stability monitored with Sypro Orange.

Analysis of Off7/MBP variants using these methods reveals that conservative mutagenesis or local computational repacking is tolerated for many residues in the interface without drastic loss of affinity, except for a single essential hotspot. This hotspot contains a Tyr-His-Asp hydrogen bonding network reminiscent of a common catalytic motif. Substitution of the tyrosine with phenylalanine shows that a single hydrogen bond across the interface is critical for binding. Analysis of the protein database by structural bioinformatics shows that, although rare, this motif is present in other naturally evolved interfaces. Such a triad was found in the homodimeric interface of PH0642 from <i>Pyrococcus horikoshii</i> and is conserved among many homologues in the nitrilase superfamily, meeting one of the key criteria by which potential hotspots can be identified. This analysis supports a number of analogies between hotspot residues and catalytic residues in enzyme active sites, and raises the intriguing possibility that hotspots may be associated with other structural motifs that could be used for the identification or design of PPIs.

Dissertation
304

Prediction for the Essential Protein with the Support Vector Machine

Yang, Zih-Jie 06 September 2011
Essential proteins are critical to cellular life, but they are hard to identify. Protein-protein interaction data offer one way to reveal whether a protein is essential. Many researchers predict essential proteins using feature sets composed of topological properties of protein-protein interaction networks. However, the function of a protein is also a clue to its essentiality. In this thesis, we build SVM models for predicting essential proteins using a feature set that contains sequence properties that can influence protein function, topological properties, and protein properties. In our experiments, we downloaded Scere20070107, which contains 4873 proteins and 17166 interactions, from the DIP database. The ratio of essential to nonessential proteins is nearly 1:4, so the dataset is imbalanced. On the imbalanced dataset, the best values of F-measure, MCC, AIC and BIC of our models are 0.5197, 0.4671, 0.2428 and 0.2543, respectively. We also built a balanced dataset with a ratio of 1:1, on which the best values of F-measure, MCC, AIC and BIC of our models are 0.7742, 0.5484, 0.3603 and 0.3828, respectively. Our results are superior to all previous results across the various measures.
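For readers unfamiliar with two of the metrics reported above, the following sketch computes the F-measure and the Matthews correlation coefficient (MCC) from confusion-matrix counts using their standard definitions. The counts are hypothetical, chosen only to mimic a 1:4 class ratio; they are not taken from the thesis.

```python
import math


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall (F1)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts: 100 essential and 400 nonessential proteins (1:4).
tp, fn, fp, tn = 50, 50, 50, 350
f1 = f_measure(tp, fp, fn)
m = mcc(tp, tn, fp, fn)
```

Unlike accuracy, both metrics penalize a classifier that simply predicts the majority class, which is why they are commonly reported for imbalanced datasets such as the 1:4 one described in the abstract.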
305

Accurate and Reliable Cancer Classification Based on Pathway-Markers and Subnetwork-Markers

Su, Junjie December 2010
Finding reliable gene markers for accurate disease classification is very challenging for a number of reasons, including the small sample size of typical clinical data, high noise in gene expression measurements, and heterogeneity across patients. In fact, gene markers identified in independent studies often do not coincide with each other, suggesting that many of the predicted markers may have no biological significance and may simply be artifacts of the analyzed dataset. To find more reliable and reproducible diagnostic markers, several studies have proposed analyzing gene expression data at the level of groups of functionally related genes, such as pathways. Given a set of known pathways, these methods estimate the activity level of each pathway by summarizing the expression values of its member genes and use the pathway activities for classification. One practical problem of the pathway-based approach is the limited coverage of genes by currently known pathways. As a result, potentially important genes that play critical roles in cancer development may be excluded. In this thesis, we first propose a probabilistic model to infer pathway/subnetwork activities. We then develop a novel method for identifying reliable subnetwork markers in a human protein-protein interaction (PPI) network based on probabilistic inference of subnetwork activities. We tested the proposed methods on two independent breast cancer datasets. The proposed method can efficiently find reliable subnetwork markers that outperform gene-based and pathway-based markers in terms of discriminative power, reproducibility and classification performance. The identified subnetwork markers are highly enriched in common GO terms, and they classify breast cancer metastasis more accurately than markers found by a previous method.
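The general idea of summarizing member-gene expression into a pathway activity can be sketched as below. This is only an illustration of the simple averaging style of pathway-based methods mentioned in the abstract, assuming a mean of per-gene z-scores; it is not the probabilistic inference model proposed in the thesis, and the gene names and values are hypothetical.

```python
from statistics import mean, pstdev


def zscore(values: list) -> list:
    """Standardize one gene's expression across samples."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd if sd else 0.0 for v in values]


def pathway_activity(expression: dict, pathway_genes: list) -> list:
    """Per-sample activity: mean z-score over pathway genes with data."""
    z = {g: zscore(expression[g]) for g in pathway_genes if g in expression}
    if not z:
        raise ValueError("no pathway genes found in expression data")
    n_samples = len(next(iter(z.values())))
    return [mean(z[g][s] for g in z) for s in range(n_samples)]


# Hypothetical expression values (gene -> one value per sample).
expression = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [2.0, 4.0, 6.0],
    "geneC": [5.0, 5.0, 5.0],
}

# "geneX" has no measurements and is skipped, illustrating limited coverage.
activity = pathway_activity(expression, ["geneA", "geneB", "geneX"])
```

The resulting per-sample activity vector would then serve as a feature for a classifier in place of individual gene expression values.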
306

Prediction Of Protein-protein Interactions From Sequence Using Evolutionary Relations Of Proteins And Species

Guney, Tacettin Dogacan 01 October 2009
Prediction of protein-protein interactions is an important part of understanding the biological processes in a living cell. There are completely sequenced organisms that do not yet have experimentally verified protein-protein interaction networks. For such organisms, we cannot generally use a supervised method, where a portion of the protein-protein interaction network is used as a training set. Furthermore, for newly sequenced organisms, many other data sources that are used to identify protein-protein interaction networks, such as gene expression data and gene ontology annotations, may not be available. In this thesis work, our aim is to identify and cluster likely protein-protein interaction pairs using only protein sequences and evolutionary information. We use a protein's phylogenetic profile because the co-evolutionary pressure hypothesis suggests that proteins with similar phylogenetic profiles are likely to interact. We also divide the phylogenetic profile into smaller profiles based on evolutionary lines. These divided profiles are then used to score the similarity between all possible protein pairs. Since not all profile groups have the same number of elements, it is difficult to assess the similarity between such pairs. We show that many commonly used measures do not work well and that the end result depends greatly on the type of similarity measure used. We also introduce a novel similarity measure. The resulting dense putative interaction network contains many false-positive interactions; therefore, we apply the Markov Clustering algorithm to cluster the protein-protein interaction network and filter out the weaker edges. The end result is a set of clusters where proteins within the clusters are likely to be functionally linked and to interact. 
While this method does not perform as well as supervised methods, it has the advantage of not requiring a training set and being able to work only using sequence data and evolutionary information. So it can be used as a first step in identifying protein-protein interactions in newly-sequenced organisms.
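A rough sketch of scoring phylogenetic profiles, both whole and divided by evolutionary line, is given below. The abstract does not describe its novel similarity measure, so a Jaccard-style score is assumed here purely for illustration; the profiles, the lineage grouping, and the choice to average sub-profile scores equally are likewise hypothetical.

```python
def profile_similarity(p: list, q: list) -> float:
    """Jaccard-style score over presence/absence profiles (assumed measure)."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def grouped_similarity(p: list, q: list, groups: dict) -> float:
    """Average the score over sub-profiles, one per evolutionary line."""
    scores = [
        profile_similarity([p[i] for i in idx], [q[i] for i in idx])
        for idx in groups.values()
    ]
    return sum(scores) / len(scores)


# Hypothetical profiles: 1 = homolog present in that genome, 0 = absent.
prot1 = [1, 1, 0, 1, 0, 0]
prot2 = [1, 0, 0, 1, 0, 0]

# Hypothetical assignment of genome indices to evolutionary lines.
groups = {"lineage1": [0, 1, 2], "lineage2": [3, 4, 5]}

whole = profile_similarity(prot1, prot2)
divided = grouped_similarity(prot1, prot2, groups)
```

The whole-profile and divided-profile scores differ for the same pair (2/3 versus 0.75 here), a small illustration of the abstract's point that the result depends strongly on how similarity is measured.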
307

Multi-resolution Visualization Of Large Scale Protein Networks Enriched With Gene Ontology Annotations

Yasar, Sevgi 01 September 2009
From the perspective of computer science, genome-scale protein-protein interactions (PPIs) are interpreted as networks or graphs with thousands of nodes. PPI networks represent various types of possible interactions among the proteins or genes of a genome. PPI data are vital in protein function prediction, since the functions of cells are performed by groups of proteins interacting with each other and the main complexes of the cell are made of interacting proteins. The recent increase in protein interaction prediction techniques has made a great amount of protein-protein interaction data available for genomes. As a consequence, systematic visualization and analysis techniques have become crucial. To the best of our knowledge, no PPI visualization tool considers multi-resolution viewing of PPI networks. In this thesis, we implemented a new approach for PPI network visualization that supports multi-resolution viewing of compound graphs. We construct compound nodes and label them using gene set enrichment methods based on Gene Ontology annotations. This thesis further suggests new methods for PPI network visualization.
308

High-level Expression Of Hepatitis B Surface Antigen In Pichia Pastoris, Its Purification And Immunological Characterization

Selamoglu, Hande 01 November 2009
Hepatitis B virus (HBV), which belongs to the family Hepadnaviridae, is responsible for acute and chronic hepatitis. The vaccines presently used to immunize patients against HBV are recombinant subunit vaccines consisting of viral surface antigens (S protein). However, they are expensive, and their use is limited in poor countries. For that reason, HBV remains an important worldwide health problem. Of the 2 billion people who have been infected with HBV, more than 350 million have chronic (lifelong) infections and face an increased risk of developing cirrhosis and hepatocellular carcinoma. In this study, high-level expression of recombinant Hepatitis B surface Antigen (rHBsAg), PreS2-S, was achieved in the methylotrophic yeast Pichia pastoris. To this end, a single copy of the HBV M gene (PreS2-S) was inserted downstream of the alcohol oxidase (AOX1) promoter of the pPICZA vector. The rHBsAg protein could then be expressed intracellularly by induction with methanol. High cell density fermentation was followed by chromatographic separation to obtain pure rHBsAg. The humoral response after immunization with the purified protein was observed in mice using commercial Hepatitis B surface antigen kits. Atomic force microscopy verified that rHBsAg had been produced in the desired conformation.
309

Parallelization Of Functional Flow To Predict Protein Functions

Akkoyun, Emrah 01 February 2011
Protein-protein interaction networks provide important clues about the biological functions of proteins whose roles in the cell are unknown. These networks have been analyzed with a variety of approaches run on a single computer, in which the roles of characterized proteins are used to predict the functions of uncharacterized ones. Functional flow is an approach that takes into account network connectivity, distance effects, and network topology from both local and global views. Previous research has shown that, with these advantages, functional flow produces more accurate protein function predictions. However, the application implemented for this approach could not be applied in practice to the large and complex networks of complex species because of memory limitations. The purpose of this thesis is to provide a new application, implemented on a high-performance computing platform, that scales to large data sets. To that end, Hadoop, an open-source map/reduce environment, was installed on 18 hosts, each with eight cores. Method: the first map/reduce job distributes the protein interaction network to all worker nodes in a format that allows parallel distributed computing; the second map/reduce job generates flows for each known protein function, and the roles of unidentified proteins are predicted by accumulating all of the generated flows. Our experiments show that an application requiring high-performance computing can be decomposed across worker nodes efficiently and that performance improves as resources increase.
310

Sinec: Large Scale Signaling Network Topology Reconstruction Using Protein-protein Interactions And RNAi Data

Hashemikhabir, Seyedsasan 01 September 2012
Reconstructing the topology of a signaling network by means of RNA interference (RNAi) technology is an underdetermined problem, especially when only a single gene in the network is knocked down or observed. In addition, the exponential search space limits existing methods to small signaling networks of 10-15 genes. In this thesis, we propose integrating RNAi data with a reference physical interaction network. We formulate the problem of signaling network reconstruction as finding the minimum number of edit operations on a given reference network. The edit operations transform the reference network into a network that satisfies the RNAi observations. We show that using a reference network does not simplify the computational complexity of the problem. Therefore, we propose an approach that provides near-optimal results and scales well for reconstructing networks of up to hundreds of components. We validate the proposed method on synthetic and real datasets. Comparison with the state of the art on real signaling networks shows that the proposed methodology scales better and generates biologically significant results.
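The notion of counting edit operations between a reference network and a candidate network can be illustrated with the sketch below. It assumes that the edit operations are insertions and deletions of directed edges, which the abstract does not specify, and it only measures the cost of a given candidate; the search for a minimum-edit network that satisfies the RNAi observations is not attempted. Node names are hypothetical.

```python
def edit_operations(reference: list, candidate: list) -> dict:
    """List edge deletions and insertions that turn reference into candidate."""
    ref, cand = set(reference), set(candidate)
    return {
        "delete": sorted(ref - cand),  # edges only in the reference
        "insert": sorted(cand - ref),  # edges only in the candidate
    }


def edit_cost(reference: list, candidate: list) -> int:
    """Total number of edge insertions and deletions (assumed unit cost)."""
    ops = edit_operations(reference, candidate)
    return len(ops["delete"]) + len(ops["insert"])


# Hypothetical directed networks given as (source, target) edges.
reference = [("A", "B"), ("B", "C"), ("C", "D")]
candidate = [("A", "B"), ("B", "D"), ("C", "D")]

ops = edit_operations(reference, candidate)
cost = edit_cost(reference, candidate)
```

In this example one edge is removed and one is added, for a cost of 2; a reconstruction method of the kind described would prefer, among all networks consistent with the observations, one with the lowest such cost.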
