11 |
Protein function prediction using a complex-network synchronization method
Τσιούτσιου, Βάια 11 October 2013
Protein-protein interactions (PPIs) refer to the binding of two or more proteins to perform a biological function. In the last decade, novel high-throughput technologies for detecting these interactions have produced large-scale data sets for human and most model species. By representing these data as networks, with nodes representing proteins and edges representing the detected PPIs, useful information can be extracted for protein functional annotation and prediction, or on how to design suitable drugs, identify new cancer targets, or uncover the mechanisms that control (or regulate) the biological interactions responsible for the functioning, or malfunctioning, of a cell.
For my master's thesis, I applied a dynamical overlap method to the human PPI network. The method inspects how oscillators organize in a modular network by forming synchronization interfaces and overlapping communities. I studied the human protein interaction network (PIN) and its functional modules by considering an ensemble of phase oscillators with a topology of connections defined by the human PIN. In particular, I started with a single classification for each protein and used the dynamical overlap method to identify or predict the function(s) of the proteins in the PPI network. I then identified the proteins that were misclassified and those that were likely to be involved in more than one of the functional categories in the data (multifunctional proteins).
Inspecting the meso-scale interactions of the generated network of dynamical systems provided useful information on the micro- and macro-scale processes through which biological processes are organized in a cell. The method is thus able not only to identify misclassified proteins but also to unveil proteins that have dual functionality.
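The idea of placing phase oscillators on a network and watching them synchronize can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the oscillator model, so the standard Kuramoto coupling is used as a stand-in, and the six-node network, initial phases, and all function names are hypothetical.

```python
import cmath
import math


def simulate_kuramoto(adjacency, phases, coupling=1.0, dt=0.05, steps=2000,
                      frequencies=None):
    """Euler-integrate d(theta_i)/dt = omega_i + K * sum_j sin(theta_j - theta_i),
    where j runs over the neighbours of node i (assumed Kuramoto model)."""
    n = len(phases)
    omega = frequencies if frequencies is not None else [0.0] * n
    theta = list(phases)
    for _ in range(steps):
        # The comprehension reads only the old phases, so the update is synchronous.
        theta = [
            theta[i] + dt * (omega[i] + coupling * sum(
                math.sin(theta[j] - theta[i]) for j in adjacency[i]))
            for i in range(n)
        ]
    return theta


def order_parameter(phases):
    """Global synchronization r in [0, 1]; r = 1 means all phases coincide."""
    return abs(sum(cmath.exp(1j * t) for t in phases) / len(phases))


# Hypothetical toy network: two triangles (modules) joined by the edge 2-3.
TOY_NETWORK = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
               3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
INITIAL_PHASES = [0.0, 0.3, -0.4, 1.0, 1.2, 0.8]

final_phases = simulate_kuramoto(TOY_NETWORK, INITIAL_PHASES)
print(round(order_parameter(INITIAL_PHASES), 3),
      round(order_parameter(final_phases), 3))
```

In such a setup, nodes inside a module tend to lock phases before the modules lock with each other, so nodes at the interface (here 2 and 3) are natural candidates for membership in more than one community.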
|
12 |
Using semantic similarity measures across Gene Ontology to predict protein-protein interactions
Helgadóttir, Hanna Sigrún January 2005
Living cells are controlled by proteins and genes that interact through complex molecular pathways to achieve a specific function. Determining protein-protein interactions is therefore fundamental to understanding the cell's lifecycle and functions. The function of a protein is also largely determined by its interactions with other proteins. The amount of available protein-protein interaction data has multiplied with the emergence of large-scale technologies for detecting them, but the drawback of such technologies is the relatively high amount of noise present in the data. Because determining protein-protein interactions experimentally is time-consuming, the aim of this project is to create a computational method that predicts interactions with high sensitivity and specificity. Semantic similarity measures were applied across the Gene Ontology terms assigned to proteins in S. cerevisiae to predict protein-protein interactions. Three semantic similarity measures were tested to see which one performs best in predicting such interactions. Based on the results, a method that predicts the function of proteins in connection with connectivity was devised. The results show that semantic similarity is a useful measure for predicting protein-protein interactions.
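A semantic similarity score between two proteins can be sketched as below. The abstract does not name the three measures that were tested, so this sketch assumes Resnik's measure (the information content of the most informative common ancestor) as one widely used choice. The tiny ontology, the term probabilities, and the decision threshold are hypothetical values chosen for illustration.

```python
import math

# Hypothetical toy ontology: each term maps to its parent terms.
PARENTS = {"root": [], "A": ["root"], "B": ["root"], "A1": ["A"], "A2": ["A"]}
# Hypothetical fraction of annotated proteins carrying each term or a descendant.
TERM_PROBABILITY = {"root": 1.0, "A": 0.5, "B": 0.5, "A1": 0.25, "A2": 0.25}


def ancestors(term):
    """Return the term together with all of its ancestors in the ontology."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """Rare terms are more informative: IC = -log2 p(term)."""
    return -math.log2(TERM_PROBABILITY[term])


def resnik(term_a, term_b):
    """Information content of the most informative common ancestor."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(information_content(t) for t in common)


def protein_similarity(terms_a, terms_b):
    """Best-matching term pair between two proteins' annotation sets."""
    return max(resnik(a, b) for a in terms_a for b in terms_b)


def predict_interaction(terms_a, terms_b, threshold=1.0):
    """Predict an interaction when the similarity reaches the (assumed) threshold."""
    return protein_similarity(terms_a, terms_b) >= threshold
```

Taking the maximum over term pairs is one of several ways to lift a term-level measure to proteins; averaging over pairs is a common alternative and would give lower scores for proteins with many unrelated annotations.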
|
13 |
Redundancy-aware learning of protein structure-function relationships
Bryant, Drew 13 May 2013
The protein kinases are a large family of enzymes that play a fundamental role in propagating signals within the cell. Because of the high degree of binding site similarity shared among protein kinases, designing drug compounds with high specificity among the kinases has proven difficult. However, computational approaches to comparing the 3-dimensional geometry and physicochemical properties of key binding site residues, referred to here as substructures, have been shown to be informative of inhibitor selectivity. This thesis introduces two fundamental approaches for the comparative analysis of substructure similarity and demonstrates the importance of each method on a variety of large protein structure datasets for multiple biological applications.
The Family-wise Alignment of SubStructural Templates Framework (The FASST Framework) provides an unsupervised learning approach for identifying substructure clusterings. The substructure clusterings identified by FASST allow for the automatic evaluation of substructure variability, the identification of distinct structural conformations and the selection of anomalous outlier structures within large structure datasets. These clusterings are shown to be capable of identifying biologically meaningful structure trends among a diverse number of protein families. The FASST Live visualization and analysis platform provides multiple comparative analysis pipelines and allows the user to interactively explore the substructure clusterings computed by FASST.
The Combinatorial Clustering Of Residue Position Subsets (CCORPS) method provides a supervised learning approach for identifying structural features that are correlated with a given set of annotation labels. The ability of CCORPS to identify structural features predictive of functional divergence among families of homologous enzymes is demonstrated across 48 distinct protein families. The CCORPS method is further demonstrated to generalize to the very difficult problem of predicting protein kinase inhibitor affinity. CCORPS is demonstrated to make perfect or near-perfect predictions for the binding ability of 12 of the 38 kinase inhibitors studied, while only having overall poor predictive ability for 1 of the 38 compounds. Additionally, CCORPS is shown to identify shared structural features across phylogenetically diverse groups of kinases that are correlated with binding affinity for particular inhibitors; such instances of structural similarity among phylogenetically diverse kinases are also shown to not be rare among kinases. Finally, these function-specific structural features may serve as potential starting points for the development of highly specific kinase inhibitors.
Importantly, both The FASST Framework and CCORPS implement a redundancy-aware approach to dealing with structure overrepresentation that allows for the incorporation of all available structure data. As shown in this thesis, surprising structural variability exists even among structure datasets consisting of a single protein sequence. By incorporating the full variety of structural conformations within the analysis, the methods presented here provide a richer view of the variability of large protein structure datasets.
|
14 |
Variability of Specificity Determinants in the O-Succinylbenzoate Synthase Family
Wang, Chenxi 1986- 14 March 2013
Understanding how protein sequence, structure and function coevolve is at the core of functional genome annotation and protein engineering. The fundamental problem is to determine whether sequence variation contributes to functional differences or if it is a consequence of evolutionary divergence that is unrelated to functional specificity. To address this problem, we cannot merely analyze sequence variation between homologous proteins that have different functions. For comparison, we need to understand the factors that determine sequence variation in proteins that have the same function, such as a set of orthologous enzymes.
Here, we address this problem by analyzing the evolution of functionally important residues in the o-succinylbenzoate synthase (OSBS) family. The OSBS family consists of several hundred enzymes that catalyze a step in menaquinone (vitamin K2) synthesis. Based on phylogeny, the OSBS family can be divided into eight major subfamilies. We assayed wild-type OSBS enzyme activities. The results show that the enzymes from γ-Proteobacteria subfamily 1 and Bacteroidetes have relatively low values, the enzyme from Cyanobacteria subfamily 1 is intermediate, and the values for the proteins from the Actinobacteria and Firmicutes subfamilies are relatively high. We are using computational and experimental methods to identify functionally important amino acids in each subfamily. Our data suggest that each subfamily has a different set of functionally important residues, even though the enzymes catalyze the same reaction. These differences may have accumulated because different mutations were required in each subfamily to compensate for deleterious mutations or to adapt to changing environments. We assessed the roles of these amino acids in enzyme structure and function. Our method achieved a 70% success rate in identifying positions that play important roles in one family but not another. The residues P119 and A329 play an important role in D. psychrophila OSBS but not in T. fusca OSBS. We also observed two class-switch mutations in T. fusca, at P11 and P22. Mutants at these two positions have kinetic parameters similar to those of wild-type D. psychrophila OSBS.
|
15 |
Automatic Assignment of Protein Function with Supervised Classifiers
Jung, Jae 16 January 2010
High-throughput genome sequencing and sequence analysis technologies have created the need for automated annotation and analysis of large sets of genes. The Gene Ontology (GO) provides a common controlled vocabulary for describing gene function. However, proteins are usually annotated with GO terms through a tedious manual curation process carried out by trained professional annotators. With the wealth of genomic data now available, there is a need for accurate automated annotation methods.
The overall objective of my research is to improve our ability to automatically annotate proteins with GO terms. The first method, Automatic Annotation of Protein Functional Class (AAPFC), employs protein functional domains as features and learns an independent Support Vector Machine classifier for each GO term. This approach relies only on protein functional domains as features and demonstrates that statistical pattern recognition can outperform expert-curated mapping of protein functional domain features to protein functions. The second method, Predict of Gene Ontology (PoGO), is a meta-classification method that integrates multiple heterogeneous data sources. This method performs better than the protein domain method can achieve alone.
Apart from these two methods, several systems have been developed that employ pattern recognition to assign gene function using a variety of features, such as sequence similarity, the presence of protein functional domains, and gene expression patterns. Most of these approaches have not considered the hierarchical relationships among the terms, which take the form of a directed acyclic graph (DAG). The DAG represents the functional relationships between the GO terms, so it should be an important component of an automated annotation system. I describe a Bayesian network used as a multi-layered classifier that incorporates the relationships among GO terms found in the GO DAG. I also describe an inference algorithm for quickly assigning GO terms to unlabeled proteins. A comparative analysis of the method against previously described annotation systems shows that the method provides improved annotation accuracy when performance on individual GO terms is compared. More importantly, this method enables the assignment of significantly more GO terms to more proteins than was previously possible.
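Why the GO DAG matters for independent per-term classifiers can be illustrated with a small sketch. It is not the Bayesian network described in the abstract; it only shows the simpler, generic step of making per-term scores consistent with the hierarchy, so that a term's ancestors never score lower than the term itself. The term names, the DAG, and the scores are hypothetical.

```python
# Hypothetical toy DAG: each term maps to its parents ("leaf" has two parents).
GO_PARENTS = {
    "root": [],
    "mid_a": ["root"],
    "mid_b": ["root"],
    "leaf": ["mid_a", "mid_b"],
}


def with_ancestors(terms, parents):
    """Close a set of terms under the parent relation of the DAG."""
    closed = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(parents.get(term, ()))
    return closed


def make_consistent(scores, parents):
    """Raise every ancestor's score to at least the best score among its descendants."""
    consistent = dict(scores)
    for term, score in scores.items():
        for ancestor in with_ancestors([term], parents):
            consistent[ancestor] = max(consistent.get(ancestor, 0.0), score)
    return consistent


# Independent classifiers may disagree with the hierarchy: "leaf" scores high
# while its parent "mid_a" scores low.
raw_scores = {"leaf": 0.9, "mid_a": 0.2}
print(make_consistent(raw_scores, GO_PARENTS))
```

A classifier that models the DAG directly, as the abstract proposes, can use this structure during inference rather than repairing the scores afterwards.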
|
17 |
Engineering Cholesterol-Based Fibers for Antibody Immobilization and Cell Capture
Cohn, Celine January 2015
In 2015, the United States is expected to have nearly 600,000 deaths attributed to cancer. Of these 600,000 deaths, 90% will be a direct result of cancer metastasis, the spread of cancer throughout the body. During cancer metastasis, circulating tumor cells (CTCs) are shed from primary tumors and migrate through bodily fluids, establishing secondary cancer sites. As cancer metastasis is incredibly lethal, there is a growing emphasis on developing "liquid biopsies" that can screen peripheral blood, search for and identify CTCs. One popular method for capturing CTCs is the use of a detection platform with antibodies specifically suited to recognize and capture cancer cells. These antibodies are immobilized onto the platform and can then bind and capture cells of interest. However, current means to immobilize antibodies often leave them with drastically reduced function. The antibodies are left poorly suited for cell capture, resulting in low cell capture efficiencies. This body of work investigates the use of lipid-based fibers to immobilize proteins in a way that retains protein function, ultimately leading to increased cell capture efficiencies. The resulting increased efficiencies are thought to arise from the retained three-dimensional structure of the protein as well as having a complete coating of the material surface with antibodies that are capable of interacting with their antigens. It is possible to electrospin cholesterol-based fibers that are similar in design to the natural cell membrane, providing proteins a more natural setting during immobilization. Such fibers have been produced from cholesterol-based cholesteryl succinyl silane (CSS). These fibers have previously illustrated a keen aptitude for retaining protein function and increasing cell capture. Herein the work focuses on three key concepts. First, a model is developed to understand the immobilization mechanism used by electrospun CSS fibers. 
The antibody immobilization and cell-capturing abilities of the CSS fibers were compared to those of hydrophobic polycaprolactone (PCL) fibers and hydrophilic plasma-treated PCL fibers. Electrospun CSS fibers were found to immobilize amounts of protein equivalent to those immobilized hydrophobically. However, these proteins captured 6 times more cells, indicative of retained protein function. The second key concept was the design and fabrication of a hybridized lipid fiber. Lipid fibers provide improved protein function, but fabrication difficulties have limited their adoption. Thus, we sought to fabricate a lipid-polymer hybrid that is easily fabricated while maintaining protein function. The hybrid fiber consists of a PCL backbone with conjugated CSS. The hybrid lipid fibers showed improved protein function. In addition, higher lipid concentrations were directly correlated with higher cell capture efficiencies. The third key concept was the development of dually functionalized lipid fibers and an understanding of the resulting cell capture efficiencies. Many platforms are unable to simultaneously search for heterogeneous populations of CTCs; the ability to dually functionalize cell-capturing platforms would address this technological weakness. Studies indicated that dually functionalizing the lipid fibers did not compromise the platforms' abilities to capture the cells of interest. Such dually functionalized fibers allow a single cell-capture platform to successfully detect heterogeneous populations of CTCs. The body of work encompassed herein describes the use of lipid fibers for antibody immobilization and cell capture. Data from various projects indicate that cholesterol-based fibers produced from electrospun CSS are well suited for protein immobilization. The CSS fibers are able to immobilize amounts of protein equivalent to those of other immobilization techniques.
However, the benefit of these fibers is illustrated by the strong cell-capturing efficiencies, indicating that the immobilized proteins are able to retain their function and selectively target cells of interest. The successful immobilization of proteins and their retained function allows for the development of increasingly sensitive cancer diagnostic tools that are able to screen for CTCs early on in the cancer disease cycle.
|
18 |
Plasmonic field effects on the spectroscopic and photobiological function of the photosynthetic system of bacteriorhodopsin
Biesso, Arianna 06 March 2009
The first section of this thesis concerns the study of interactions between the intense local plasmonic field generated by nanostructures and a well-known photosynthetic protein system, bacteriorhodopsin (bR). bR is a membrane protein responsible for proton transport. Among the many intermediates formed upon photoexcitation, two of the most relevant have been studied. The intermediates under study were I460 and M412, and their decay dynamics were measured in the presence of the plasmonic field generated by exciting the surface electrons of the nanostructures with visible photons.
The decay lifetimes of both intermediates were affected when the plasmonic field was turned on, and it was verified that thermal effects were not the source of the change in dynamics.
The second part concerns the investigation of the third-order nonlinearity of a series of extended conjugated squaraine dyes in the telecommunication spectral region. Their nonlinearity is measured via Degenerate Four Wave Mixing and Z-scan as a function of the dyes' increasing conjugation length and number of squarylium groups. The dyes produced large real and imaginary values for the third-order nonlinearity in the 1300-1500 nm range, which makes them attractive for optical-limiting applications.
|
19 |
Behavioural testing and general phenotyping of mice with mutations in Eef1a2 to investigate autism, intellectual disability and epilepsy
Hope, Jilly Evelyn January 2018
Eukaryotic Elongation Factor 1A (eEF1A) plays a key role in protein synthesis by delivering aminoacylated tRNAs to the A site of the ribosome. In higher vertebrates, two isoforms of eEF1A exist called eEF1A1 and eEF1A2, with eEF1A2 being expressed in adult brain, heart and skeletal muscle. Since 2012, several different de novo heterozygous missense mutations in EEF1A2 have been identified in humans and these cause epilepsy, intellectual disability and autism. Before considering treatment options, it is vital to determine whether these mutations cause loss or gain of protein function. I performed a battery of behavioural tests using two mouse lines with heterozygous loss of function mutations in eEF1A2. The aim was to determine whether there were any behavioural phenotypes consistent with intellectual disability and/or autism. Using heterozygous wasted mice (Eef1a2+/wst), I analysed the effects of aging on behaviour and found that Eef1a2+/wst mice showed reduced marble burying activity and reduced movement in the open field test with age. In a test of social behaviour, Eef1a2+/wst mice showed a significantly reduced preference for social novelty at all ages tested. The second heterozygous null line, Del22.ex3, was generated on a pure C57BL/6J genetic background. This new line was made in order to reduce the level of variation observed in data from the wasted line, which was on a mixed genetic background. The genetic background was shown to have an influence on behaviour as the results differed between this line and the wasted line. Del22.ex3 Eef1a2+/- mice showed significantly reduced engagement in repetitive behaviours compared with wild-type littermates and normal preference for social novelty. Using CRISPR/Cas9, a mouse line with the D252H missense mutation was generated and I repeated my behavioural testing on heterozygotes from this line. 
I found no behavioural abnormalities in this line suggesting a mouse-human difference in the ability to tolerate eEF1A2 missense mutations. Previous attempts to make a line with the G70S missense mutation were unsuccessful but as a product of this experiment, it was found that mice expressing G70S eEF1A2 had a comparable phenotype to and died at the same age as complete knockouts. This suggested that the G70S protein is non-functional and cannot compensate for loss of wild-type eEF1A2. These experiments have improved our understanding of the phenotypic effects of Eef1a2 mutations in mice and have shown, for the first time, that mutations in Eef1a2 affect mouse behaviour.
|
20 |
Applying Data Science to High-throughput Protein Function Prediction
劉義瑋, Liu, Yi-Wei Unknown Date
Biological data have grown explosively since the completion of the Human Genome Project and the advent of next-generation sequencing, and protein sequences are among the gene products being discovered in large numbers. Annotating protein function through wet-lab experiments is time-consuming, so the functions of many proteins with known sequences remain unknown. Computational function prediction can help biologists formulate hypotheses and prioritize experiments, which speeds up protein function annotation. Gene Ontology (GO) is a widely used framework for unifying the representation of gene product function; it classifies functions into three domains: Biological Process, Cellular Component, and Molecular Function. Each domain is a hierarchical tree composed of labels known as GO terms. Protein function prediction can be treated as a multi-label classification problem: given a protein sequence, predict its GO terms. We propose a machine learning framework that predicts protein function from sequence homology and can also incorporate protein family information, and we design several voting mechanisms to resolve the multi-label prediction problem.
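One possible voting mechanism for this multi-label setting can be sketched as follows. The abstract does not describe its voting schemes, so this similarity-weighted vote is an assumption made for illustration: each homologous sequence votes for its own GO terms with a weight equal to its similarity to the query, and terms whose share of the total weight reaches a threshold are predicted. The similarity scores, the placeholder term labels, and the threshold are hypothetical.

```python
from collections import defaultdict


def weighted_vote(hits, threshold=0.5):
    """Predict GO terms from homologous hits.

    hits: list of (similarity, set_of_go_terms) pairs for sequences similar
    to the query. Returns {term: share_of_total_weight} for every term whose
    share is at least `threshold`.
    """
    total = sum(similarity for similarity, _ in hits)
    if total == 0:
        return {}
    votes = defaultdict(float)
    for similarity, terms in hits:
        for term in terms:
            votes[term] += similarity
    return {term: weight / total
            for term, weight in votes.items()
            if weight / total >= threshold}


# Hypothetical hits for one query protein; "GO:A" etc. are placeholder labels.
example_hits = [
    (0.9, {"GO:A", "GO:B"}),
    (0.6, {"GO:A"}),
    (0.5, {"GO:C"}),
]
print(weighted_vote(example_hits))
```

Lowering the threshold trades precision for recall: with a threshold of 0.4 the same hits would also predict the placeholder term "GO:B".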
|