1 |
Comparing Naïve Bayes Classifiers with Support Vector Machines for Predicting Protein Subcellular Location Using Text Features
Lam, Yin 07 July 2010
Proteins play many roles in the body, and the task of understanding how proteins function is very challenging. Determining a protein’s location within the cell (also referred to as the subcellular location) helps shed light on the function of that protein. Protein subcellular location can be inferred through experimental methods or predicted using computational systems. In particular, we focus on two existing computational systems, EpiLoc and HomoLoc, that use features derived from text (abstracts of technical papers) and apply a support vector machine (SVM) classifier to assign proteins to their respective locations. The prediction accuracy of both EpiLoc and HomoLoc is comparable to that of state-of-the-art protein location prediction systems. However, in addition to accuracy, other factors such as training efficiency must be considered in evaluating the quality of a location prediction system. In this thesis, we replace the SVM classifier in EpiLoc and HomoLoc with a naïve Bayes classifier and with a novel classifier which we call the Mean Weight Text classifier. The Mean Weight Text classifier and the naïve Bayes classifier are simple to implement and execute efficiently. In addition, naïve Bayes classifiers have been shown to be effective in the context of protein location prediction and are considered preferable to SVM because the process used to derive their results is easier to explain. Evaluating the performance of these classifiers on existing data sets, we find that SVM classifiers have a slightly higher accuracy than the naïve Bayes and Mean Weight Text classifiers. This slight advantage is offset by the simplicity and efficiency offered by the naïve Bayes and Mean Weight Text classifiers. Moreover, we find that the Mean Weight Text classifier has a slightly higher accuracy than the naïve Bayes classifier. / Thesis (Master, Computing) -- Queen's University, 2010-07-06 11:06:47.613
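As a rough illustration of why a naïve Bayes text classifier is simple to implement and to explain, the following is a minimal sketch of a multinomial naïve Bayes model with add-one smoothing. It is not the EpiLoc or HomoLoc pipeline and not the thesis code: the function names, the toy abstracts, and the two location labels are all hypothetical, chosen only for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels, alpha=1.0):
    """Fit a multinomial naive Bayes model on tokenised documents.

    alpha is the additive (Laplace) smoothing constant.
    """
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)

    model = {"log_prior": {}, "log_lik": {}}
    for label, n_docs in class_counts.items():
        # Class prior: fraction of training documents with this label.
        model["log_prior"][label] = math.log(n_docs / len(labels))
        # Smoothed per-class term likelihoods over the shared vocabulary.
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_lik"][label] = {
            w: math.log((word_counts[label][w] + alpha) / denom) for w in vocab
        }
    return model


def predict(model, tokens):
    """Return the label with the highest log posterior score."""
    scores = {}
    for label, log_prior in model["log_prior"].items():
        log_lik = model["log_lik"][label]
        # Terms never seen in training are ignored.
        scores[label] = log_prior + sum(log_lik[w] for w in tokens if w in log_lik)
    return max(scores, key=scores.get)


# Hypothetical toy data standing in for text features drawn from abstracts.
docs = [
    "mitochondrial matrix respiratory chain".split(),
    "nuclear transcription factor chromatin".split(),
    "mitochondrial membrane atp synthase".split(),
    "chromatin histone nuclear import".split(),
]
labels = ["mitochondrion", "nucleus", "mitochondrion", "nucleus"]

model = train_naive_bayes(docs, labels)
print(predict(model, "respiratory chain atp".split()))
```

Each prediction is a sum of per-term log probabilities, so the terms that drove a decision can be read directly from the model, which is the kind of transparency the abstract refers to.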
|
2 |
Predicting Human and Animal Protein Subcellular Location
Khavari, Sepideh 31 August 2016
No description available.
|
3 |
A computational investigation of solubility, functionality and the adaptation in subcellular compartments of proteins
Chan, Pedro January 2011
A cell is considered to be the smallest unit of life. It carries out a variety of biochemical reactions through the activities of proteins and protein enzymes. In order to perform their functions, proteins must be in their native folded state and under the correct environmental conditions. A slight change in pH or temperature can disrupt the electrostatic interactions within the protein, leading to conformational change and loss of activity. Studies have shown that solubility can be enhanced by increasing the number of charges on the protein surface, and studies of extremophiles have shown that the presence of non-polar aromatic residues could be key to thermostable proteins. Thus, charges are important in determining the function and adaptation of proteins.
Over the decades, a large amount of protein sequence and structure information relating to molecular biology has been produced. By employing algorithms and computational and statistical techniques, it is possible to analyse these data to solve biological problems. Often these investigations are based mainly on sequences, since their numbers outstrip the number of available structures. However, adding structures allows us to investigate problems such as the relationship between charges, sequence, structure and function, which is the aim of this study.
In this thesis, the relationships between proteins and function were examined using various electrostatic features derived from charges and geometric properties derived from structures. One interesting finding is that the average pH of maximum stability of proteins within a subcellular location was highly correlated with the pH of that subcellular compartment, which was due to the pKas of histidines and their locations on the proteins. We also found that the size of the largest non-charged patch on the protein surface correlates with solubility and provides a predictor with a maximum accuracy of 76%. The use of novel charge-based methods shows little improvement in distinguishing between enzymes and non-enzymes. However, the method of using real charges with a grid size of 1 angstrom has paved the way for using patterns of charges and dipoles from enzyme active sites to distinguish different enzymes. Finally, a web tool for displaying conserved residues on 3D protein structures has been made available to the public for identifying residues that may be of functional importance.
|
4 |
Atypical Solute Carriers: Identification, evolutionary conservation, structure and histology of novel membrane-bound transporters
Perland, Emelie January 2017
Solute carriers (SLCs) constitute the largest family of membrane-bound transporter proteins in humans, and they transport nutrients, ions, drugs and waste across cellular membranes via facilitative diffusion, co-transport or exchange. Several SLCs are associated with diseases, and their location in membranes and specific substrate transport make them excellent drug targets. However, as 30% of the 430 identified SLCs are still orphans, there are still numerous opportunities to explain diseases and discover potential drug targets. Among the novel proteins are 29 atypical SLCs of major facilitator superfamily (MFS) type. These share evolutionary history with the remaining SLCs, but are orphans regarding expression, structure and/or function. They are not classified into any of the existing 52 SLC families. The overall aim of this thesis was to study the atypical SLCs with a focus on their phylogenetic clustering, evolutionary conservation, structure, protein expression in mouse brains, and whether and how their gene expression was affected by changes in food intake. In Papers I-III, the focus was on specific proteins: MFSD5 and MFSD11 (Paper I), MFSD1 and MFSD3 (Paper II), and MFSD4A and MFSD9 (Paper III). They all shared neuronal expression, and their transcription levels were altered in several brain areas after subjecting mice to food deprivation or a high-fat diet. In Paper IV, the 29 atypical SLCs of MFS type were examined. They were divided into 15 families, based on phylogenetic analyses and sequence identities, to facilitate functional studies. Their sequence relationships with other SLCs were also established. Some of the proteins were found to be well conserved, with orthologues down to nematodes and insects, whereas others first emerged in vertebrates. The atypical SLCs of MFS type were predicted to have the common MFS structure, composed of 12 transmembrane segments. 
With single-cell RNA sequencing and in situ proximity ligation assay, co-expression of atypical SLCs was analysed to get a comprehensive understanding of how membrane-bound transporters interact. In conclusion, the atypical SLCs of MFS type are suggested to be novel SLC transporters, involved in maintaining nutrient homeostasis through substrate transport.
|
5 |
Functional analysis of candidate effector proteins during Sporisorium scitamineum x sugarcane interaction / Análise funcional de proteínas candidatas a efetores durante a interação Sporisorium scitamineum x cana
Silva, Natália de Sousa Teixeira e 04 February 2019
Sugarcane smut is a globally distributed disease of importance to agribusiness, since it can drastically reduce sugarcane yield. The disease is caused by the basidiomycete Sporisorium scitamineum, a biotrophic fungus that colonizes mainly sugarcane. This research group has studied the sugarcane-smut interaction extensively over the past few years, considering both pathogen attack and plant defenses. This work aimed to functionally characterize fungal candidate effector proteins associated with this pathosystem. Effectors are essential for modulating host metabolism and physiology to allow pathogen colonization. Identifying such proteins may assist in recognizing resistance genes relevant to breeding programs for resistant varieties. Candidate genes were selected in silico based on the complete genome sequence of S. scitamineum and dual transcriptomic data. Selection strategies were based on the predicted secretome and on differential expression of the genes in planta. Candidate effectors were analyzed for their expression pattern, subcellular location, and influence on basal plant defenses and plant immunity. The results showed that expression of the S. scitamineum candidate effector genes is influenced by the host genotype. Various expression patterns were observed within the set of selected genes, along with differential subcellular localization patterns and effects on plant immunity. These results will support future research on the virulence of different isolates and help decision making in plant breeding programs for smut-resistant varieties.
|
6 |
Imagerie cellulaire du stress métallique induit par le cadmium chez la micro-algue verte Chlamydomonas reinhardtii par techniques synchrotron µXRF / XAS et nanoSIMS / Cell imaging of the metallic stress induced by cadmium in the green micro-alga Chlamydomonas reinhardtii by synchrotron-based techniques (µXRF/XAS) and nanoSIMS
Penen, Florent 17 December 2015
The green micro-alga Chlamydomonas reinhardtii is commonly used as a model for studying metal stress in photosynthetic organisms. The mechanisms of tolerance to cadmium-induced stress are not yet clearly established. To determine these mechanisms, the subcellular location and in situ chemical speciation of cadmium were determined in three C. reinhardtii strains exposed to cadmium in mixotrophic conditions (CO2 + acetate): (i) a wild-type strain (wt), (ii) the cell-wall-less strain (cw15), which is deficient in cell wall, and (iii) the pcs1 strain, which overexpresses phytochelatin synthase (PCS), an ordinarily cytosolic enzyme, directly in the chloroplast. Cadmium toxicity was determined by monitoring growth and the chlorophyll and starch content of the micro-algae. Cadmium was then located at the subcellular level using three complementary techniques (subcellular fractionation, µXRF and TEM/X-EDS). In situ cadmium speciation was studied by µXAS and XAS. Finally, elemental and isotopic imaging by nanoSIMS completed the picture of elemental distribution in the cells and determined the impact of cadmium on carbon assimilation mechanisms. (i) The results show that the wt strain is the most sensitive of the three to cadmium, with decreases in growth and chlorophyll content. When wt cells show no signs of toxicity, cadmium is sequestered throughout the cell, mainly by thiolated ligands and to a lesser extent by polyphosphate granules. After exposure to high cadmium concentrations, intracellular cadmium is mainly bound to carboxylated ligands, probably induced by oxidative stress. Moreover, cadmium located in the pyrenoid blocks assimilation of inorganic carbon (CO2) in favor of assimilation of organic carbon (acetate), which is stored as starch. (ii) Overexpression of PCS in the pcs1 strain induces strong starch production around the pyrenoid and protects chlorophyll against cadmium stress. Although phytochelatin synthesis is potentially high, half of the intracellular cadmium is sequestered by polyphosphate granules and starch. (iii) The cw15 strain is the most tolerant of the three and, unlike cells with a cell wall, does not accumulate all of the available cadmium. As in the wt strain, intracellular cadmium is sequestered mainly by thiolated ligands and to a lesser extent by polyphosphate granules. The observation of polyphosphate granules excreted by the micro-algae suggests that vacuolar cadmium is excreted, inducing a constant flux of cadmium through the cell. In conclusion, cadmium sequestration by sulfur ligands, potentially thiolated polypeptides, is the main tolerance mechanism implemented by C. reinhardtii. Nevertheless, cadmium sequestration by polyphosphate granules seems to provide greater tolerance to cadmium stress.
|