  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
31

Differential Roles of Tryptophan Residues in the Functional Expression of Human Anion Exchanger 1

Okawa, Yuka 15 August 2012
Anion exchanger 1 (AE1) is a 95 kDa glycoprotein that facilitates Cl⁻/HCO₃⁻ exchange across the erythrocyte plasma membrane. Seven conserved tryptophan (Trp) residues are located in the AE1 membrane domain: at the membrane interface (Trp648, Trp662, and Trp723), in transmembrane segment (TM) 4 (Trp492 and Trp496), and in hydrophilic loops (Trp831 and Trp848). All seven Trp residues were individually mutated to alanine (Ala) and phenylalanine (Phe) and transiently expressed in human embryonic kidney (HEK)-293 cells. The seven Trp residues could be grouped into three classes according to the impact of the mutations on the functional expression of AE1: class 1, normal expression; class 2, decreased expression; and class 3, expression decreased by Ala substitution. These results indicate that Trp residues play differential roles in AE1 expression depending on their location in the protein, and suggest that Trp mutants with low expression are misfolded and retained in the ER.
32

Data Mining Algorithms for Classification of Complex Biomedical Data

Lan, Liang January 2012
In my dissertation, I will present research that contributes to solving the following three open problems in biomedical informatics: (1) multi-task approaches for microarray classification; (2) multi-label classification for gene and protein function prediction from multi-source biological data; (3) spatial scan for movement data. In microarray classification, samples belong to several predefined categories (e.g., cancer vs. control tissues) and the goal is to build a predictor that classifies a new tissue sample based on its microarray measurements. When faced with small-sample, high-dimensional microarray data, most machine learning algorithms produce an overly complicated model that performs well on training data but poorly on new data. To reduce the risk of over-fitting, feature selection becomes an essential technique in microarray classification. However, standard feature selection algorithms are bound to underperform when the microarray dataset is particularly small. The best remedy is to borrow strength from external microarray datasets. In this dissertation, I will present two new multi-task feature filter methods that can improve classification performance by using external microarray data. The first method aggregates the feature selection results from multiple microarray classification tasks. The resulting multi-task feature selection can be shown to improve the quality of the selected features and lead to higher classification accuracy. The second method jointly selects a small gene set with maximal discriminative power and minimal redundancy across multiple classification tasks by solving an objective function with integer constraints. In the protein function prediction problem, gene functions are predicted from a predefined set of possible functions (e.g., the functions defined in the Gene Ontology).
Gene function prediction is a complex classification problem characterized by the following aspects: (1) a single gene may have multiple functions; (2) the functions are organized in a hierarchy; (3) the training data for each function are unbalanced (far fewer positive than negative examples); (4) class labels are missing; (5) multiple biological data sources are available, such as microarray data, genome sequence and protein-protein interactions. As participants in the 2011 Critical Assessment of Function Annotation (CAFA) challenge, our team achieved the highest AUC among 45 groups. In the competition, we gained by focusing on the fifth aspect of the problem. Thus, in this dissertation, I will discuss several schemes for integrating the prediction scores from multiple data sources and show their results. Interestingly, the experimental results show that a simple averaging integration method is competitive with other state-of-the-art data integration methods. The original spatial scan algorithm is used for detection of spatial overdensities: discovery of spatial subregions with significantly higher scores according to some density measure. This algorithm is widely used in identifying clusters of disease cases (e.g., identifying environmental risk factors for child leukemia). However, the original spatial scan algorithm only works on static spatial data. In this dissertation, I will propose one possible solution for spatial scan on movement data. / Computer and Information Science
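The "simple averaging integration" of prediction scores mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the dissertation's actual scheme, so everything below is an assumption: the source names, the (gene, function) keys, the score values, and the choice to average only over the sources that report a score for a given pair.

```python
from statistics import mean


def average_scores(scores_by_source):
    """Average each (gene, function) score over the sources that report it.

    Assumption: a source that has no score for a pair is simply skipped,
    rather than being counted as a zero.
    """
    collected = {}
    for source_scores in scores_by_source.values():
        for key, score in source_scores.items():
            collected.setdefault(key, []).append(score)
    return {key: mean(values) for key, values in collected.items()}


# Hypothetical scores in [0, 1] from three illustrative data sources.
scores_by_source = {
    "microarray": {("geneA", "GO:0003677"): 0.8, ("geneB", "GO:0003677"): 0.2},
    "sequence": {("geneA", "GO:0003677"): 0.6},
    "ppi": {("geneA", "GO:0003677"): 0.7, ("geneB", "GO:0003677"): 0.4},
}

integrated = average_scores(scores_by_source)
for (gene, function), score in sorted(integrated.items()):
    print(gene, function, round(score, 3))
```

Skipping missing scores keeps a gene from being penalized just because one data source does not cover it; whether the dissertation treats missing data this way is not stated in the abstract.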
33

Untersuchung der synaptischen Neurotransmitterfreisetzung mit kombiniert elektrophysiologischen und bildgebenden Verfahren / Investigation of the synaptic neurotransmitter release with combined electrophysiological and imaging techniques

Sigler, Albrecht 27 April 2005
No description available.
34

The relationship between orthology, protein domain architecture and protein function

Forslund, Kristoffer January 2011
Lacking experimental data, protein function is often predicted from evolutionary and protein structure theory. Under the 'domain grammar' hypothesis the function of a protein follows from the domains it encodes. Under the 'orthology conjecture', orthologs, related through species formation, are expected to be more functionally similar than paralogs, which are homologs in the same or different species descended from a gene duplication event. However, these assumptions have not thus far been systematically evaluated. To test the 'domain grammar' hypothesis, we built models for predicting function from the domain combinations present in a protein, and demonstrated that multi-domain combinations imply functions that the individual domains do not. We also developed a novel gene-tree based method for reconstructing the evolutionary histories of domain architectures, to search for cases of architectures that have arisen multiple times in parallel, and found this to be more common than previously reported. To test the 'orthology conjecture', we first benchmarked methods for homology inference under the obfuscating influence of low-complexity regions, in order to improve the InParanoid orthology inference algorithm. InParanoid was then used to test the relative conservation of functionally relevant properties between orthologs and paralogs at various evolutionary distances, including intron positions, domain architectures, and Gene Ontology functional annotations. We found an increased conservation of domain architectures in orthologs relative to paralogs, in support of the 'orthology conjecture' and the 'domain grammar' hypotheses acting in tandem. However, equivalent analysis of Gene Ontology functional conservation yielded spurious results, which may be an artifact of species-specific annotation biases in functional annotation databases. I discuss possible ways of circumventing this bias so the 'orthology conjecture' can be tested more conclusively. 
/ At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 6: Epub ahead of print.
35

Intrinsically disordered proteins in molecular recognition and structural proteomics

Oldfield, Christopher John 05 1900
Indiana University-Purdue University Indianapolis (IUPUI) / Intrinsically disordered proteins (IDPs) are abundant in nature, being more prevalent in the proteomes of eukaryotes than in those of bacteria or archaea. As introduced in Chapter I, these proteins, or portions of these proteins, lack stable equilibrium structures and instead have dynamic conformations that vary over time and population. Despite the lack of preformed structure, IDPs carry out many and varied molecular functions and participate in vital biological pathways. In particular, IDPs play important roles in cellular signaling, enabled in part by their ability to mediate molecular recognition. In Chapter II, the role of intrinsic disorder in molecular recognition is examined through two examples: p53 and 14-3-3. The p53 protein uses intrinsically disordered regions at its N- and C-termini to interact with a large number of partners, often using the same residues. The 14-3-3 protein is a structured domain that uses the same binding site to recognize multiple intrinsically disordered partners. Examination of the structural details of these interactions highlights the importance of intrinsic disorder and induced fit in molecular recognition. More generally, many intrinsically disordered regions that mediate interactions share similar features that are identifiable from protein sequence. Chapter IV reviews several models of IDP-mediated protein-protein interactions that use completely different parameterizations. Each model has its relative strengths in identifying novel interaction regions, and all suggest that IDP-mediated interactions are common in nature. In addition to their biological importance, IDPs are also practically important in the structural study of proteins. The presence of intrinsically disordered regions can inhibit crystallization and solution NMR studies of otherwise well-structured proteins.
This problem is compounded in the context of high-throughput structure determination. In Chapter III, the effect of IDPs on structure determination by X-ray crystallography is examined. Examination of existing crystal structures from the PDB shows that protein crystals are intolerant of intrinsic disorder. A retrospective analysis of Protein Structure Initiative data indicates that prediction of intrinsic disorder may be useful in the prioritization and improvement of targets for structure determination.
36

Validation of a cell line model for studying XPD protein function in Nucleotide Excision Repair

Kavuri, Naga Swathi Sree 16 May 2023
No description available.
37

Network-based inference of protein function and disease-gene association

Jaeger, Samira 23 April 2012
Proteininteraktionen sind entscheidend für zelluläre Funktion. Interaktionen reflektieren direkte funktionale Beziehungen zwischen Proteinen. Veränderungen in spezifischen Interaktionsmustern tragen zur Entstehung von Krankheiten bei. In dieser Arbeit werden funktionale und pathologische Aspekte von Proteininteraktionen analysiert, um Funktionen für bisher nicht charakterisierte Proteine vorherzusagen und Proteine mit Krankheitsphänotypen zu assoziieren. Verschiedene Methoden wurden in den letzten Jahren entwickelt, die die funktionalen Eigenschaften von Proteinen untersuchen. Dennoch bleibt ein wesentlicher Teil der Proteine, insbesondere menschliche, uncharakterisiert. Wir haben eine Methode zur Vorhersage von Proteinfunktionen entwickelt, die auf Proteininteraktionsnetzwerken verschiedener Spezies beruht. Dieser Ansatz analysiert funktionale Module, die über evolutionär konservierte Prozesse definiert werden. In diesen Modulen werden Proteinfunktionen gemeinsam über Orthologiebeziehungen und Interaktionspartner vorhergesagt. Die Integration verschiedener funktionaler Ähnlichkeiten ermöglicht die Vorhersage neuer Proteinfunktionen mit hoher Genauigkeit und Abdeckung. Die Aufklärung von Krankheitsmechanismen ist wichtig, um ihre Entstehung zu verstehen und diagnostische und therapeutische Ansätze zu entwickeln. Wir stellen einen Ansatz für die Identifizierung krankheitsrelevanter Genprodukte vor, der auf der Kombination von Proteininteraktionen, Proteinfunktionen und Netzwerkzentralitätsanalyse basiert. Gegeben einer Krankheit, werden krankheitsspezifische Netzwerke durch die Integration von direkt und indirekt interagierender Genprodukte und funktionalen Informationen generiert. Proteine in diesen Netzwerken werden anhand ihrer Zentralität sortiert. Das Einbeziehen indirekter Interaktionen verbessert die Identifizierung von Krankheitsgenen deutlich. Die Verwendung von vorhergesagten Proteinfunktionen verbessert das Ranking von krankheitsrelevanten Proteinen. 
/ Protein interactions are essential to many aspects of cellular function. On the one hand, they reflect direct functional relationships. On the other hand, alterations in protein interactions perturb natural cellular processes and contribute to diseases. In this thesis we analyze both the functional and the pathological aspects of protein interactions, to infer novel functions for uncharacterized proteins and to associate as yet uncharacterized proteins with disease phenotypes, respectively. Different experimental and computational approaches have been developed in the past to investigate the basic characteristics of proteins systematically. Yet a substantial fraction of proteins remains uncharacterized, particularly in human. We present a novel approach to predict protein function from protein interaction networks of multiple species. The key to our method is to study proteins within modules defined by evolutionarily conserved processes, combining comparative cross-species genomics with functional linkage in interaction networks. We show that integrating different evidence of functional similarity allows us to infer novel functions with high precision and very good coverage. Elucidating pathological mechanisms is important for understanding the onset of diseases and for developing diagnostic and therapeutic approaches. We introduce a network-based framework for identifying disease-related gene products by combining protein interaction data and protein function with network centrality analysis. Given a disease, we compile a disease-specific network by integrating directly and indirectly linked gene products using protein interaction and functional information. Proteins in this network are ranked based on their network centrality. We demonstrate that using indirect interactions significantly improves disease gene identification. Predicted functions, in turn, enhance the ranking of disease-relevant proteins.
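The ranking step described in this abstract (build a disease-specific network from directly and indirectly linked gene products, then rank proteins by centrality) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the centrality measure, so degree within the subnetwork is used as a stand-in; "indirect" is taken to mean two hops from a known disease protein; and the protein names and edges are hypothetical.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (a, b) interaction pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def neighborhood(graph, seeds, max_hops):
    """Return all nodes within max_hops edges of any seed (breadth-first search)."""
    dist = {s: 0 for s in seeds if s in graph}
    queue = deque(dist)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return set(dist)


def rank_by_degree(graph, nodes):
    """Rank nodes by degree inside the induced subnetwork (ties broken by name)."""
    degree = {n: len(graph[n] & nodes) for n in nodes}
    return sorted(nodes, key=lambda n: (-degree[n], n))


# Hypothetical interaction data: D1 and D2 are known disease proteins.
edges = [("D1", "P1"), ("D2", "P1"), ("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
graph = build_graph(edges)
seeds = {"D1", "D2"}

direct = neighborhood(graph, seeds, max_hops=1)    # seeds plus direct partners
indirect = neighborhood(graph, seeds, max_hops=2)  # also partners of partners
ranking = rank_by_degree(graph, indirect)
print(ranking)
```

In this toy network P1 ranks first because it links both disease proteins; extending the hop limit from one to two is what brings P2 into the candidate set at all.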
38

Χρήση ευφυών αλγοριθμικών τεχνικών για επεξεργασία πρωτεϊνικών δεδομένων / Use of intelligent algorithmic techniques for processing protein data

Θεοφιλάτος, Κωνσταντίνος 10 June 2014
H παρούσα διατριβή εκπονήθηκε στο Εργαστήριο Αναγνώρισης Προτύπων, του Τμήματος Μηχανικών Ηλεκτρονικών Υπολογιστών και Πληροφορικής του Πανεπιστημίου Πατρών. Αποτελεί μέρος της ευρύτερης ερευνητικής δραστηριότητας του Εργαστηρίου στον τομέα του σχεδιασμού και της εφαρμογής των τεχνολογιών Υπολογιστικής Νοημοσύνης στην ανάλυση βιολογικών δεδομένων. Η διδακτορική αυτή διατριβή χρηματοδοτήθηκε από το πρόγραμμα Ηράκλειτος ΙΙ. Ο τομέας της πρωτεωμικής είναι ένα σχετικά καινούργιο και γρήγορα αναπτυσσόμενο ερευνητικό πεδίο. Μια από τις μεγαλύτερες προκλήσεις στον τομέα της πρωτεωμικής είναι η αναδόμηση του πλήρους πρωτεϊνικού αλληλεπιδραστικού δικτύου μέσα στα κύτταρα. Εξαιτίας του γεγονότος, ότι οι πρωτεϊνικές αλληλεπιδράσεις παίζουν πολύ σημαντικό ρόλο στις βασικές λειτουργίες ενός κυττάρου, η ανάλυση αυτών των δικτύων μπορεί να αποκαλύψει τον ρόλο αυτών των αλληλεπιδράσεων στις ασθένειες καθώς και τον τρόπο με τον οποίο οι τελευταίες αναπτύσσονται. Παρόλα αυτά, είναι αρκετά δύσκολο να καταγραφούν και να μελετηθούν οι πρωτεϊνικές αλληλεπιδράσεις ενός οργανισμού, καθώς το πρωτέωμα διαφοροποιείται από κύτταρο σε κύτταρο και αλλάζει συνεχώς μέσα από τις βιοχημικές του αλληλεπιδράσεις με το γονιδίωμα και το περιβάλλον. Ένας οργανισμός έχει ριζικά διαφορετική πρωτεϊνική έκφραση στα διάφορα σημεία του σώματός του, σε διαφορετικά στάδια του κύκλου ζωής του και υπό διαφορετικές περιβαλλοντικές συνθήκες. Δημιουργούνται, λοιπόν, δύο πάρα πολύ σημαντικοί τομείς έρευνας, που είναι, πρώτον, η εύρεση των πραγματικών πρωτεϊνικών αλληλεπιδράσεων ενός οργανισμού που θα συνθέσουν το πρωτεϊνικό δίκτυο αλληλεπιδράσεων και, δεύτερον, η περαιτέρω ανάλυση του πρωτεϊνικού δικτύου για εξόρυξη πληροφορίας (εύρεση πρωτεϊνικών συμπλεγμάτων, καθορισμός λειτουργίας πρωτεϊνών κτλ). 
Στην παρούσα διδακτορική διατριβή παρουσιάζονται καινοτόμες αλγοριθμικές τεχνικές Υπολογιστικής Νοημοσύνης για την πρόβλεψη πρωτεϊνικών αλληλεπιδράσεων, τον υπολογισμό ενός βαθμού εμπιστοσύνης για κάθε προβλεφθείσα αλληλεπίδραση, την πρόβλεψη πρωτεϊνικών συμπλόκων από δίκτυα πρωτεϊνικών αλληλεπιδράσεων και την πρόβλεψη της λειτουργίας πρωτεϊνών. Συγκεκριμένα, στο κομμάτι της πρόβλεψης και βαθμολόγησης πρωτεϊνικών αλληλεπιδράσεων αναπτύχθηκε μια πληθώρα καινοτόμων τεχνικών ταξινόμησης. Αυτές κυμαίνονται από υβριδικούς συνδυασμούς μετα-ευρετικών μεθόδων και ταξινομητών μηχανικής μάθησης, μέχρι μεθόδους γενετικού προγραμματισμού και υβριδικές μεθοδολογίες ασαφών συστημάτων. Στο κομμάτι της πρόβλεψης πρωτεϊνικών συμπλόκων υλοποιήθηκαν δύο βασικές καινοτόμες μεθοδολογίες μη επιβλεπόμενης μάθησης, οι οποίες θεωρητικά και πειραματικά ξεπερνούν τα μειονεκτήματα των υπαρχόντων αλγορίθμων. Για τις περισσότερες από αυτές τις υλοποιηθείσες μεθοδολογίες υλοποιήθηκαν φιλικές προς τον χρήστη διεπαφές. Οι περισσότερες από αυτές τις μεθοδολογίες μπορούν να χρησιμοποιηθούν και σε άλλους τομείς. Αυτό πραγματοποιήθηκε με μεγάλη επιτυχία σε προβλήματα βιοπληροφορικής όπως η πρόβλεψη microRNA γονιδίων και mRNA στόχων τους και η μοντελοποίηση - πρόβλεψη οικονομικών χρονοσειρών. Πειραματικά, η μελέτη αρχικά επικεντρώθηκε στον οργανισμό της ζύμης (Saccharomyces cerevisiae), έτσι ώστε να αξιολογηθούν οι αλγόριθμοι, που υλοποιήθηκαν και να συγκριθούν με τις υπάρχουσες αλγοριθμικές μεθοδολογίες. Στη συνέχεια, δόθηκε ιδιαίτερη έμφαση στις πρωτεΐνες του ανθρώπινου οργανισμού. Συγκεκριμένα, οι καλύτερες αλγοριθμικές τεχνικές για την ανάλυση δεδομένων πρωτεϊνικών αλληλεπιδράσεων εφαρμόστηκαν σε ένα σύνολο δεδομένων που δημιουργήθηκε για τον ανθρώπινο οργανισμό. 
Αυτό είχε σαν αποτέλεσμα την δημιουργία ενός πλήρους, σταθμισμένου δικτύου πρωτεϊνικών αλληλεπιδράσεων για τον άνθρωπο και την εξαγωγή των πρωτεϊνικών συμπλόκων, που υπάρχουν σε αυτό καθώς και τον λειτουργικό χαρακτηρισμό πολλών αχαρακτήριστων πρωτεϊνών. Τα αποτελέσματα της ανάλυσης των δεδομένων πρωτεϊνικών αλληλεπιδράσεων για τον άνθρωπο είναι διαθέσιμα μέσω μίας διαδικτυακής βάσης γνώσης HINT-KB (http://hintkb.ceid.upatras.gr), που υλοποιήθηκε στα πλαίσια αυτής της διδακτορικής διατριβής. Σε αυτή την βάση γνώσης ενσωματώνεται, από διάφορες πηγές, ακολουθιακή, δομική και λειτουργική πληροφορία για ένα τεράστιο πλήθος ζευγών πρωτεϊνών του ανθρώπινου οργανισμού. Επίσης, οι χρήστες μπορούν να έχουν προσβαση στις προβλεφθείσες πρωτεϊνικές αλληλεπιδράσεις και στον βαθμό εμπιστοσύνης τους. Τέλος, παρέχονται εργαλεία οπτικοποίησης του δικτύου πρωτεϊνικών αλληλεπιδράσεων, αλλά και εργαλεία ανάκτησης των πρωτεϊνικών συμπλόκων που υπάρχουν σε αυτό και της λειτουργίας πρωτεϊνών και συμπλόκων. Το προβλήματα με τα οποία καταπιάνεται η παρούσα διδακτορική διατριβή έχουν σημαντικό ερευνητικό ενδιαφέρον, όπως τεκμηριώνεται και από την παρατιθέμενη στη διατριβή εκτενή βιβλιογραφία. Μάλιστα, βασικός στόχος είναι οι παρεχόμενοι αλγόριθμοι και υπολογιστικά εργαλεία να αποτελέσουν ένα οπλοστάσιο στα χέρια των βιοπληροφορικάριων για την επίτευξη της κατανόησης των κυτταρικών λειτουργιών και την χρησιμοποίηση αυτής της γνώσης για γονιδιακή θεραπεία διαφόρων πολύπλοκων πολυπαραγοντικών ασθενειών όπως ο καρκίνος. Τα σημαντικόταρα επιτεύγματα της παρούσας διατριβής μπορούν να συνοψισθούν στα ακόλουθα σημεία: • Παροχή ολοκληρωμένης υπολογιστικής διαδικασίας ανάλυσης δεδομένων πρωτεϊνικών αλληλεπιδράσεων • Σχεδιασμός και υλοποίηση ευφυών τεχνικών πρόβλεψης και βαθμολόγησης πρωτεϊνικών αλληλεπιδράσεων, που θα παρέχουν αποδοτικά και ερμηνεύσιμα μοντέλα πρόβλεψης. 
• Σχεδιασμός και υλοποίηση αποδοτικών αλγορίθμων μη επιβλεπόμενης μάθησης για την εξόρυξη πρωτεϊνικών συμπλόκων από δίκτυα πρωτεϊνικών αλληλεπιδράσεων. • Δημιουργία μιας βάσης γνώσης που θα παρέχει στην επιστημονική κοινότητα όλα τα ευρήματα της ανάλυσης των δεδομένων πρωτεϊνικών αλληλεπιδράσεων για τον ανθρώπινο οργανισμό. / The present dissertation was conducted in the Pattern Recognition Laboratory of the Department of Computer Engineering and Informatics at the University of Patras. It is part of the laboratory's wider research activity in designing, implementing and applying Computational Intelligence technologies for the analysis of biological data. The dissertation was co-financed by the research program Hrakleitos II. Proteomics is a relatively new and fast-evolving research domain. One of its great challenges is the reconstruction of the complete protein-protein interaction network within cells. Because protein-protein interactions play very important roles in basic cellular functions, the analysis of these networks can uncover the role of these interactions in diseases as well as the way diseases develop. However, this is very hard to accomplish, because the proteome differs from cell to cell and changes constantly through its biochemical interactions with the genome and the environment. An organism has radically different protein expression in different tissues, in different phases of its life and under varying environmental conditions. Two very important research areas therefore arise: first, the identification of the real protein-protein interactions within an organism, which compose its protein interaction network; second, the analysis of the protein interaction network to extract knowledge (searching for protein complexes, uncovering protein function, etc.).
In the present dissertation, novel Computational Intelligence techniques are presented for the prediction of protein-protein interactions, the calculation of a confidence score for each predicted interaction, the prediction of protein complexes and the prediction of protein function. In particular, for the task of predicting and scoring protein-protein interactions, a wide range of novel classification techniques was designed and developed. These techniques range from hybrid combinations of meta-heuristic methods and machine learning classifiers to genetic programming methods and fuzzy systems. For the task of predicting protein complexes, two novel unsupervised methods were designed and developed which theoretically and experimentally overcome the limitations of existing methodologies. For most of the designed techniques, user-friendly interfaces were developed to allow their use by other researchers. Moreover, many of the implemented techniques were successfully applied to other research domains, such as the prediction of microRNAs and their targets and the forecasting of financial time series. The experimental procedure initially focused on the well-studied yeast organism (Saccharomyces cerevisiae) to validate the performance of the proposed algorithms and compare them with existing computational methodologies. It then focused on the analysis of protein-protein interaction data from the human organism. Specifically, the best algorithmic techniques among those proposed in the present dissertation were applied to a human protein-protein interaction dataset. This resulted in the construction of a weighted protein-protein interaction network of high coverage, the extraction of human protein complexes and the functional characterization of human proteins and complexes.
The results of the analysis of human protein-protein interaction data are available in the web knowledge base HINT-KB (http://hintkb.ceid.upatras.gr), which was implemented during this dissertation. In this knowledge base, structural, functional and sequence information from various sources was incorporated for every protein pair. Moreover, HINT-KB provides access to the predicted and scored protein-protein interactions and to the predicted protein complexes and their functional characterization. The problems addressed in the present dissertation are of significant research interest, as shown by the extensive bibliography provided. The basic goal is for the provided algorithms and tools to contribute to the ultimate goal of systems biology, understanding the cellular mechanisms, and to the development of gene therapy for complex diseases such as cancer. The most important achievements of the present dissertation are summarized in the following points: • Providing an integrated computational framework for the analysis of protein-protein interaction data. • Designing and implementing intelligent techniques for predicting and scoring protein-protein interactions in an accurate and interpretable manner. • Designing and implementing effective unsupervised algorithmic techniques for extracting protein complexes and predicting their functionality. • Creating a knowledge base which will provide the scientific community with all the findings of the analysis conducted on the human protein-protein interaction data.
39

Study Of Structure, Dynamics & Self-Assembly Of Human Insulin-Like Growth Factor Binding Protein-2 By Novel NMR And Biophysical Methods

Swain, Monalisa 07 1900
My research work for PhD has focused on: (i) the development and application of new NMR methodologies to solve challenging problems in structural biology and (ii) studying important biological systems to correlate their structural and functional aspects. I have worked on diverse research projects ranging from NMR methodology development to the study of structure and dynamics of protein-based nanotubes. Chapter 1 of my thesis gives a brief introduction to bio-molecular NMR spectroscopy and the different biological systems that I have studied. In recent years, several new methods have emerged for rapid NMR data collection. One class of methods is G-matrix Fourier transform (GFT) projection NMR spectroscopy. GFT NMR spectroscopy involves phase-sensitive joint sampling of two or more chemical shifts in an indirect dimension of a multidimensional NMR experiment. Chapter 2 describes a new method based on the principle of GFT NMR for further increasing the speed of data collection. In the current implementations of the GFT method, cosine/sine modulations of all chemical shifts involved in the joint sampling are collected and stored as separate FIDs. A post-acquisition data processing step (application of the G-matrix) then separates the different inter-modulations of chemical shifts. Thus, joint sampling of K+1 spins results in 2^K combinations of chemical shifts (also representing 2^K projection angles). One limitation of this approach is that even if only a few of the 2^K components of the multiplet (or projection angles) are desired, an entire data set containing information for all 2^K shift combinations is collected. We have proposed a simple method which removes this restriction and allows one to selectively detect only the desired linear combination of chemical shifts/projection angles out of the 2^K combinations in a phase-sensitive manner.
The method involves selecting the appropriate cosine/sine modulations of chemical shifts and forming the desired linear combination by phase cycling of the radiofrequency pulses and receiver. This will benefit applications where only certain linear combinations of shifts are desired and/or sufficient. Further, the G-matrix transformation required for forming the linear combination is performed within the pulse sequence. This avoids the need for any post-acquisition data processing. Taken together, this mode of data acquisition will foster new applications in projection NMR spectroscopy for rapid resonance assignment and structure determination. Chapter 3 describes another GFT NMR-based method for rapid estimation of secondary structure in proteins. This methodology, named CSSI-PRO (Combination of Shifts for Secondary structure Identification in PROteins), involves detection of a specific linear combination of backbone ¹Hα and ¹³C′ chemical shifts in a two-dimensional (2D) NMR experiment. Such a linear combination of shifts separates residues belonging to α-helical and β-strand regions into distinct spectral regions, nearly independent of the amino acid type. This helps in the estimation of the overall secondary structure content of the protein. Comparison of the estimated secondary structure content with that obtained from the respective 3D structures and/or the Chemical Shift Index (CSI) method was carried out for 254 proteins and gives a correlation of more than 90% and an overall RMSD of 6.5%. The method has high sensitivity and data can be acquired in a few minutes.
This methodology has several applications, such as high-throughput screening of proteins in structural proteomics and monitoring conformational changes during protein folding and/or ligand-binding events. Chapter 4 (Part-A and Part-B) describes an area of my research which involves the study of structure and function in the Insulin-like Growth Factor Binding Protein (IGFBP) family. IGFBPs (six in number; IGFBP1-6) belong to the IGF system, which plays an important role in growth and development of the human body. This system comprises the following components: (i) two peptide hormones, IGF-1 and -2, (ii) type 1 and type 2 IGF receptors, (iii) six IGF-binding proteins (IGFBP; numbered 1-6) and (iv) IGFBP proteases. IGF-1 and -2 are small signalling peptides (~7.5 kDa) that act by binding to specific cell surface receptors (IGF-1R), evoking a subsequent response inside the cell. Six soluble IGF binding proteins, the IGFBPs, which range from 22 to 31 kDa in size and share overall sequence and structural homology with each other, regulate the activity of the IGFs. IGFBPs bind strongly to IGFs (KD ~ 300-700 pM), which ensures that all the circulating IGF in the bloodstream is sequestered, and inhibit the action of IGFs by blocking their access to the receptors. Proteolysis of the IGFBPs dissociates IGFs from the complex, enabling them to bind and activate the cell surface receptors. IGFBPs have recently been implicated in different cancers and HIV/AIDS. However, the nature of their interaction with the ligand (IGF-1 or IGF-2) at the molecular level is poorly understood. This is due to the difficulty of over-expressing these proteins on a large scale and in the soluble amounts required for structural studies. We have for the first time developed an efficient method for bacterial expression of full-length human IGFBP-2, a 33 kDa system, in soluble (up to 30 mg/ml) and folded form.
Using a single-step purification protocol, hIGFBP-2 was obtained with >95% purity and structurally characterized using NMR spectroscopy. The protein was found to exist as a monomer at the high concentrations required for structural studies and in a single conformation exhibiting a unique intra-molecular disulfide-bonding pattern. The protein retained full biological activity, as evident from its strong binding to IGF-1 and IGF-2 detected using surface plasmon resonance (SPR). This study represents the first high-yield expression of wild-type recombinant human IGFBP-2 in E. coli and its first structural characterization by NMR. Using different NMR methods, we are now in the process of elucidating the 3D structure of this molecule. Chapter 5 (Part-A and Part-B) describes our discovery of nanotubular structures formed by spontaneous self-assembly of a small fragment from the C-terminal domain of hIGFBP-2. The nanotubular structures are several micrometers long and have a uniform outer diameter of ~35 nm. These structures were studied extensively by NMR and other techniques such as TEM, fluorescence and circular dichroism (CD). The water-soluble nanotubes form through intermolecular disulphide bonds due to the presence of three cysteines in the polypeptide chain and exhibit enhanced tyrosine fluorescence. Based on several lines of experimental evidence, we have proposed a mechanism for the formation of the nanotubes. This was considered a breakthrough by the journal ChemComm and featured on the cover page of the journal. An article highlighting the discovery was also published in RSC news. In recent years, a number of novel polypeptide and DNA based nanotubes have been reported.
Our study reveals intrinsically fluorescent self-assembling nanotubes held together by disulphide bonds and having the following novel properties: (i) their formation/dissociation can be controlled by tuning the redox conditions, (ii) they do not require the support of any additional chemical agent for self-assembly, (iii) they have high stability due to the involvement of covalent interactions, (iv) the monomer is a small polypeptide chain which can be chemically synthesized or produced using simple recombinant methods and (v) they possess high inherent fluorescence and can thus be easily detected against a background of other proteins. In addition, the presence of an RGD motif in this polypeptide fragment offers avenues for novel biomedical applications. The RGD motif is known to be recognized by integrins. The design of such self-assembling polypeptide fragments containing an RGD motif can be utilized to enhance the efficacy of cancer therapeutics. Towards this end, we have investigated the structural basis of formation of these nanotubular structures by NMR spectroscopy and proposed their application for cancer cell imaging.
40

Protein function prediction by integrating sequence, structure and binding affinity information

Zhao, Huiying 03 February 2014
Indiana University-Purdue University Indianapolis (IUPUI) / Proteins are nano-machines that work inside every living organism. Functional disruption of one or several proteins is the cause of many diseases. However, the functions of most proteins are yet to be annotated, because inexpensive sequencing techniques have dramatically sped up the discovery of new protein sequences (265 million and counting) and experimental examination of every protein in all its possible functional categories is simply impractical. Thus, it is necessary to develop computational function-prediction tools that complement and guide experimental studies. In this study, we developed a series of predictors for highly accurate prediction of proteins with DNA-binding, RNA-binding and carbohydrate-binding capability. These predictors use a template-based technique that combines sequence and structural information with predicted binding affinity. Both sequence-based and structure-based approaches were developed. Results indicate the importance of binding affinity prediction for improving the sensitivity and precision of function prediction. Application of these methods to the human genome and structural genomics targets demonstrated their usefulness in annotating proteins of unknown function and discovering moonlighting proteins with DNA, RNA, or carbohydrate binding function. In addition, we investigated the disruption of protein functions by naturally occurring genetic variations due to insertions and deletions (INDELs). We found that protein structures are the most critical features in recognising disease-causing non-frame-shifting INDELs. The predictors for function prediction are available at http://sparks-lab.org/spot, and the predictor for classification of non-frame-shifting INDELs is available at http://sparks-lab.org/ddig.
