Global ETD Search

1	Cytotoxicity and Proteomes Studies of AN30 on Oral Cancer Cell Jhong, Rong-Chang 14 August 2012 (has links) The effect of the natural compound AN30 on an oral cancer cell line (Ca9-22) and a normal oral cell line (HGF-1) was investigated. AN30 was purified from the native fern Thelypteris torresiana. It has been reported to induce cell cycle arrest, apoptosis, DNA damage and cell growth inhibition in lung, breast and prostate cancer cells, but its effect on oral cancer is unknown. In a preliminary test, AN30 showed high toxicity to oral cancer cells but not to normal oral cells. After treatment of oral cancer cells with AN30, caspase activation and comet assays were performed to verify apoptosis induction. 2-D electrophoresis was used to find proteins differentially expressed between the normal and cancer oral cell lines under AN30 treatment. The proteins were further identified by LC/MS and western blot. The mechanism by which these candidate proteins affect drug toxicity requires further investigation. oral cancer proteomes natural compound
2	Développement et application de méthodes bioinformatiques pour l'analyse des protéines contenant des répétitions en tandem / Development and application of bioinformatics methods for the identification and characterisation of tandem repeat in protein sequences Richard, François D. 21 October 2016 (has links) Today, the growth of protein sequencing data significantly exceeds the growth of our capacity to analyze these data. In line with this data deluge and the urgent need for new bioinformatics tools, our work deals with the development of new algorithms to better understand the sequence-structure-function relationship. Proteins contain a large portion of periodic sequences representing arrays of repeats that are directly adjacent to each other, so-called tandem repeats (TRs). TRs occur in at least 14% of all proteins. Highly divergent, they range from a single amino acid repetition to domains of 100 or more repeated residues. Numerous studies have demonstrated the fundamental functional importance of such TRs and their involvement in human diseases, especially cancers. Here we show the importance of integrating several TR detectors to get the most complete set of TRs in proteomes. We designed an appropriate pipeline and developed two specific tools: a filter to speed up the process and a new scoring module to select relevant TRs in the structured regions of proteins. In addition, we undertook a large-scale analysis of TRs in 94 proteomes. This analysis allowed us to update the previous census of TRs, showing that TRs occur in 64% of all proteins, and led to a better understanding of TRs in terms of their characteristics, composition and implication in human disease. Bioinformatics Tandem repeats Sequences Proteomes
3	Demonstration of Protein-Based Human Identification Using the Hair Shaft Proteome Parker, G.J., Leppert, T., Anex, D.S., Hilmer, J.K., Matsunami, N., Baird, L., Stevens, J., Parsawar, K., Durbin-Johnson, B.P., Rocke, D.M., Nelson, C., Fairbanks, D.J., Wilson, Andrew S., Rice, R.H., Woodward, S.R., Bothner, B., Hart, B.R., Leppert, M. 2016 July 1921 (has links) Human identification from biological material is largely dependent on the ability to characterize genetic polymorphisms in DNA. Unfortunately, DNA can degrade in the environment, sometimes below the level at which it can be amplified by PCR. Protein, however, is chemically more robust than DNA and can persist for longer periods. Protein also contains genetic variation in the form of single amino acid polymorphisms. These can be used to infer the status of non-synonymous single nucleotide polymorphism alleles. To demonstrate this, we used mass spectrometry-based shotgun proteomics to characterize hair shaft proteins in 66 European-American subjects. A total of 596 single nucleotide polymorphism alleles were correctly imputed in 32 loci from 22 genes of subjects’ DNA and directly validated using Sanger sequencing. Estimates of the probability of the resulting individual non-synonymous single nucleotide polymorphism allelic profiles in the European population, using the product rule, resulted in a maximum power of discrimination of 1 in 12,500. Imputed non-synonymous single nucleotide polymorphism profiles from European–American subjects were considerably less frequent in the African population (maximum likelihood ratio = 11,000). The converse was true for hair shafts collected from an additional 10 subjects with African ancestry, where some profiles were more frequent in the African population. Genetically variant peptides were also identified in hair shaft datasets from six archaeological skeletal remains (up to 260 years old). 
This study demonstrates that quantifiable measures of identity discrimination and biogeographic background can be obtained from detecting genetically variant peptides in hair shaft protein, including hair from bioarchaeological contexts. / The Technology Commercialization Innovation Program (Contracts #121668, #132043) of the Utah Governors Office of Commercial Development, the Scholarship Activities Peptides Proteomic databases Membrane proteins Genetic loci Alleles Proteomes Archaeology Haplotypes
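The product rule mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the loci are statistically independent, and the genotype frequencies used here are hypothetical placeholders, not values from the study; the function names are likewise illustrative.

```python
from math import prod


def profile_frequency(genotype_freqs):
    """Product rule: multiply per-locus genotype frequencies.

    Assumes the loci are independent (an assumption of this sketch).
    """
    return prod(genotype_freqs)


def random_match_odds(genotype_freqs):
    """Express the profile frequency as '1 in N'."""
    return 1.0 / profile_frequency(genotype_freqs)


# Hypothetical population frequencies of one person's genotype at four loci.
example = [0.5, 0.25, 0.1, 0.2]
print(f"profile frequency: {profile_frequency(example):.4f}")
print(f"about 1 in {random_match_odds(example):,.0f}")
```

Each additional informative locus multiplies the rarity of the profile, which is why discrimination power grows quickly as more genetically variant peptides are detected.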
4	Computational Studies on Structures and Functions of Single and Multi-domain Proteins Mehrotra, Prachi January 2017 (has links) Proteins are essential for the growth, survival and maintenance of the cell. Understanding the functional roles of proteins helps to decipher the working of macromolecular assemblies and the cellular machinery of living organisms. A thorough investigation of the link between sequence, structure and function of proteins helps in building a comprehensive understanding of complex biological systems. Proteins are composed of single or multiple domains. Analysis of proteins encoded in diverse genomes shows the ubiquitous nature of multi-domain proteins. Though the majority of eukaryotic proteins are multi-domain in nature, 3-D structures of only a small proportion of multi-domain proteins are known due to difficulties in crystallizing such proteins. While functions of individual domains are generally extensively studied, the complex interplay of functions of domains is not well understood for most multi-domain proteins. The paucity of structural and functional data affects our understanding of the evolution of structure and function of multi-domain proteins. The broad objective of this thesis is to achieve an enhanced understanding of structure and function of protein domains by computational analysis of sequence and structural data. Special attention is paid to multi-domain proteins in the first few chapters of this thesis. Classification of multi-domain proteins by implementation of an alignment-free sequence comparison method has been achieved in Chapters 2 and 3. Studies on organization, interactions and interdependence of domain-domain interactions in multi-domain proteins with respect to sequential separation between domains and N to C-terminal domain order have been described in Chapters 4 and 5. 
The functional and structural repertoire of organisms can be comprehensively studied and compared using functional and structural domain annotations. Chapters 6, 7 and 8 present proteome-wide structure and function comparisons of various pathogenic and non-pathogenic microorganisms. These comparisons help in identifying proteins implicated in virulence of the pathogen and thus in predicting putative targets for disease treatment and prevention.
Chapter 1 forms an introduction to the main subject area of this thesis. Starting with a description of protein structure and function, details of the four levels of hierarchical organization of protein structure are provided, along with the databases that document protein sequences and structures. Classification of protein domains, considered as the realm of function, structure and evolution, is described. The usefulness of classifying proteins at the domain level is highlighted in terms of providing an enhanced understanding of protein structure and function and also of their evolutionary relatedness. The details of structure, function and evolution of multi-domain proteins are also outlined in Chapter 1.
Chapter 2 aims to achieve a biologically meaningful classification scheme for multi-domain protein sequences. The overall function of a multi-domain protein is determined by the functional and structural interplay of its constituent domains. Traditional sequence-based methods utilize only the domain-level information to classify proteins. This does not take into account the contributions of accessory domains and linker regions towards the overall function of a multi-domain protein. An alignment-free protein sequence comparison tool, CLAP (CLAssification of Proteins), previously developed in this laboratory, was assessed and improved when the author joined the group. 
CLAP was developed especially to handle multi-domain protein sequences without a requirement of defining domain boundaries and sequential order of domains (domain architecture). The working principle of CLAP involves an all-against-all comparison of windows of 5-residue sequence patterns between two protein sequences. The sequences compared can be full-length, comprising all the domains in the two proteins. The outcome of these comparisons is represented as the Local Matching Scores (LMS) between protein sequences (nslab.iisc.ernet.in/clap/). It has been previously shown that CLAP runs ~7 times faster than other protein sequence comparison methods that employ alignment of sequences. In Chapter 2, CLAP-based classification has been carried out on two test datasets of proteins containing (i) the tyrosine phosphatase domain family and (ii) the SH3-domain family. The former dataset comprises both single and multi-domain proteins that sometimes consist of domain repeats of the tyrosine phosphatase domain. The latter dataset consists only of multi-domain proteins with one copy of the SH3-domain. At the domain level, the CLAP-based classification scheme resulted in a clustering similar to that obtained from an alignment-based method, ClustalW. CLAP-based clusters obtained for full-length datasets were shown to comprise proteins with similar functions and domain architectures. Hence, a protein classification scheme that is independent of domain definitions and requires only the full-length amino acid sequences as input is shown to work efficiently.
Chapter 3 explores the limitations of CLAP in large-scale protein sequence comparisons. The potential advantages of full-length protein sequence classification, combined with the availability of the alignment-free sequence comparison tool, CLAP, motivated the conceptualization of full-length sequence classification of the entire protein repertoire. 
Before undertaking this mammoth task, the working of CLAP was tested on a large dataset of 239,461 protein sequences. Chapter 3 discusses the technical details of computation, storage and retrieval of CLAP scores for a large dataset in a feasible timeframe. CLAP scores were examined for protein pairs of the same domain architecture and ~22% of these showed CLAP similarity scores of 0. This led to an investigation of the sensitivity of CLAP with respect to sequence divergence. Several test datasets of proteins belonging to the same SCOP fold were constructed and CLAP-based classification of these proteins was examined at the inter- and intra-SCOP family level. CLAP was successful in efficiently clustering evolutionarily related proteins (defined as proteins within the same SCOP superfamily) if their sequence identity was >35%. At lower sequence identities, CLAP fails to recognize any evolutionary relatedness. Another test dataset consisting of two-domain proteins with domain order swapped was constructed. Domain order swap refers to domain architectures of type AB and BA, consisting of domains A and B. A condition that the sequence identities of homologous domains were greater than 35% was imposed. CLAP could effectively cluster together proteins of the same domain architectures in this case. Thus, the sequence identity threshold of 35% at the domain level improves the accuracy of CLAP. The analysis also showed that for highly divergent sequences, the expectation of a 5-residue pattern match was likely too stringent a criterion. Thus, a modification of the 5-residue identical pattern match criterion, by also considering similar residues and gaps within matched patterns, may be required to enable CLAP-based clustering of remotely related protein sequences. Thus, this study highlights the limitations of CLAP with respect to large-scale analysis and its sensitivity to sequence divergence.
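The idea of comparing 5-residue windows without an alignment can be sketched as follows. This is an illustrative stand-in only: the scoring below (fraction of shared identical windows) is an assumption of this sketch and is not the Local Matching Score computed by CLAP, and the sequences are made up.

```python
def windows(seq, k=5):
    """Return the set of all overlapping k-residue windows in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_window_score(seq_a, seq_b, k=5):
    """Fraction of identical k-residue windows shared by two sequences.

    Normalised by the smaller of the two window sets. Hypothetical score,
    NOT the Local Matching Score used by CLAP.
    """
    wa, wb = windows(seq_a, k), windows(seq_b, k)
    if not wa or not wb:
        return 0.0  # a sequence shorter than k has no windows
    return len(wa & wb) / min(len(wa), len(wb))


# Made-up sequences: the second embeds most of the first.
print(shared_window_score("MKTAYIAKQR", "GGMKTAYIAKGG"))
# Divergent sequences share no identical 5-residue window and score 0.
print(shared_window_score("AAAAAA", "CCCCCC"))
```

Requiring identical 5-residue windows is what makes such a score drop to zero for highly divergent homologues, which matches the sensitivity limit described for sequence identities below about 35%.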
Chapters 4 and 5 discuss the computational analysis of inter-domain interactions with respect to sequential distance and domain order. Knowledge of domain composition and 3-D structures of individual domains in a multi-domain protein may not be sufficient to predict the tertiary structure of the multi-domain protein. Substantial information about the nature of domain-domain interfaces helps in prediction of the tertiary as well as the quaternary structure of a protein. Therefore, Chapter 4 explores the possible relationship between the sequential distance separating two domains in a multi-domain protein and the extent of their interaction. With increasing sequential separation between any two domains, the extent of inter-domain interactions showed a gradual decrease. The trend was more apparent when sequential separation between domains was measured in terms of the number of intervening domains. Irrespective of the linker length, extensive interactions were seen more often between contiguous domains than between non-contiguous domains. Contiguous domains show a broader range of interface area and a lower proportion of non-interacting domains (interface area: 0 Å² to ~4400 Å², 2.3% non-interacting domains) than non-contiguous domains (interface area: 0 Å² to ~2000 Å², 34.7% non-interacting domains). Additionally, as inter-protein interactions are mediated through constituent domains, rules of protein-protein interactions were applied to domain-domain interactions. Tight binding between domains is denoted as a putative permanent domain-domain interaction, and domains that may dissociate and associate through relatively weak interactions to regulate functional activity are denoted as having putative transient domain-domain interactions. An interface area threshold of 600 Å² was utilized as a binary classifier to distinguish between putative permanent and putative transient domain-domain interactions. 
Therefore, the state of interaction of a domain pair is defined as either putative permanent or putative transient. Contiguous domains showed a predominance of putative permanent inter-domain interfaces, whereas non-contiguous domains showed a prevalence of putative transient interfaces. The state of interaction of various SCOP superfamily pairs was studied across different proteins in the dataset. SCOP superfamily pairs mostly showed a conserved state of interaction, i.e. either putative permanent or putative transient in all their occurrences across different proteins. Thus, it is noted that contiguous domains interact extensively more often than non-contiguous domains and specific superfamily pairs tend to interact in a conserved manner. In conclusion, a combination of interface area and other inter-domain properties, along with experimental validation, will help strengthen the binary classification scheme of putative permanent and transient domain-domain interactions.
Chapter 5 provides structural analysis of domain pairs occurring in different sequential domain orders in multi-domain proteins. The function and regulation of a multi-domain protein are predominantly determined by the domain-domain interactions. These in turn are influenced by the sequential order of domains in a protein. With domains defined using evolutionary and structural relatedness (SCOP superfamily), their conservation of structure and function was studied across domain order reversal. A domain order reversal indicates different sequential orders of the concerned domains, which may be identified in proteins of the same or different domain compositions. Domain order reversals of domains A and B can be indicated in a protein pair consisting of the domain architectures xAxBx and xBxAx, where x indicates 0 or more domains. A total of 161 pairs of domain order reversals were identified in 77 pairs of PDB entries. 
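The xAxBx versus xBxAx definition of a domain order reversal can be made concrete with a small sketch. It assumes each protein is given as an N-to-C list of domain labels; the labels, function names and handling of repeated domains are assumptions of this illustration, not the procedure used in the thesis.

```python
from itertools import combinations


def ordered_pairs(architecture):
    """All (earlier, later) pairs of distinct domains in N-to-C order.

    Any number of intervening domains is allowed, matching the 'x' in xAxBx.
    """
    return {(a, b) for a, b in combinations(architecture, 2) if a != b}


def order_reversals(arch_1, arch_2):
    """Domain pairs seen as A...B in one protein and B...A in the other."""
    p1, p2 = ordered_pairs(arch_1), ordered_pairs(arch_2)
    return {frozenset((a, b)) for (a, b) in p1 if (b, a) in p2}


# Hypothetical architectures: A precedes B in the first, B precedes A in the second.
print(order_reversals(["A", "X", "B"], ["B", "A"]))
# Same relative order in both proteins, so no reversal is reported.
print(order_reversals(["A", "B"], ["A", "C", "B"]))
```

Returning unordered pairs keeps each reversal counted once regardless of which protein is listed first.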
For most of the comparisons between proteins with different domain composition and architecture, large differences in the relative spatial orientation of domains were observed. Although preservation of the state of interaction was observed for ~75% of the comparisons, none of the inter-domain interfaces of domains in different order displayed high interface similarity. These domain order reversals in multi-domain proteins are contributed by a limited set of 15 SCOP superfamilies. The majority of the superfamilies undergoing order reversal function either as transporters or as regulatory domains, and very few are enzymes. A higher proportion of domain order reversals was observed in domains separated by 0 or 1 domains than in those separated by more than 1 domain. A thorough analysis of various structural features of domains undergoing order reversal indicates that only one order of domains is strongly preferred over all possible orders. This may be due either to evolutionary selection of one of the orders and its conservation throughout generations, or to the fact that domain order reversals rarely conserve the interface between the domains.
Further studies (Chapters 6 to 8) utilize the available computational techniques for structural and functional annotation of proteins encoded in a few bacterial genomes. Based on these annotations, proteome-wide structure and function comparisons were performed between two sets of pathogenic and non-pathogenic bacteria. The first study compares the pathogenic Mycobacterium tuberculosis to the closely related non-pathogenic organism Mycobacterium smegmatis. The second study primarily identified biologically feasible host-pathogen interactions between the human host and the pathogen Leptospira interrogans, and also compared the interactions of the pathogenic Leptospira interrogans and of the saprophytic Leptospira biflexa with the human host. 
Chapter 6 describes the function and structure annotation of proteins encoded in the genome of M. smegmatis MC2-155. M. smegmatis is a widely used model organism for understanding the pathophysiology of M. tuberculosis, the primary causative agent of tuberculosis in humans. The M. smegmatis and M. tuberculosis species of the mycobacterial genus share several features, such as a similar cell-wall architecture, the ability to oxidise carbon monoxide aerobically and a large number of homologues. These features render M. smegmatis particularly useful in identifying critical cellular pathways of M. tuberculosis to inhibit its growth in the human host. In spite of the similarities between M. smegmatis and M. tuberculosis, there are stark differences between the two due to their diverse niches and lifestyles. While there are innumerable studies reporting the structure, function and interaction properties of M. tuberculosis proteins, there is a lack of high-quality annotation of M. smegmatis proteins. This makes understanding the biology of M. smegmatis extremely important for investigating its competence as a model organism for M. tuberculosis. With the implementation of available sequence and structural profile-based search procedures, functional and structural characterization could be achieved for ~92% of the M. smegmatis proteome. Structural and functional domain definitions were obtained for a total of 5695 of 6717 proteins in M. smegmatis. Residue coverage >70% was achieved for 4567 proteins, which constitute ~68% of the proteome. Domain-unassigned regions longer than 30 residues were assessed for their potential to be associated with a domain. For the 1022 proteins with no recognizable domains, putative structural and functional information was inferred for 328 proteins by the use of distant relationship detection and fold recognition methods. Although 916 sequences of the 1022 proteins with no recognizable domains were found to be specific to M. 
smegmatis species, 98 of these are specific to its MC2-155 strain. Of the 1828 M. smegmatis proteins classified as conserved hypothetical proteins, 1038 proteins were successfully characterized. A total of 33 Domains of Unknown Function (DUFs) occurring in M. smegmatis could be associated with structural domains. A high representation of the tetR and GntR families of transcription regulators was noted in the functional repertoire of the M. smegmatis proteome. As M. smegmatis is a soil-dwelling bacterium, transcriptional regulators are crucial for helping it to adapt to and survive environmental stress. Similarly, the ABC transporter and MFS domain families are highly represented in the M. smegmatis proteome. These are important in enabling the bacterium to take up carbohydrates from diverse environmental sources. A lower number of virulence-associated proteins was identified in M. smegmatis, which is consistent with its non-pathogenicity. Thus, a detailed functional and structural annotation of the M. smegmatis proteome was achieved in Chapter 6.
Chapter 7 delineates the similarities and differences in the structure and function of proteins encoded in the genomes of the pathogenic M. tuberculosis and the non-pathogenic M. smegmatis. The protocol employed in Chapter 6 to achieve the proteome-wide structure and function annotation of M. smegmatis was also applied to the M. tuberculosis proteome in Chapter 7. The number of proteins encoded by the genome of M. smegmatis strain MC2-155 (6717 proteins) is higher than that of M. tuberculosis strain H37Rv (4018 proteins). A total of 2720 high-confidence orthologues sharing ≥30% sequence identity were identified in M. tuberculosis with respect to M. smegmatis. Based on the orthologue information, specific functional clusters, essential proteins, metabolic pathways, transporters and toxin-antitoxin systems of M. tuberculosis were inspected for conservation in M. smegmatis. 
Among the several categories analysed, 53 metabolic pathways, 44 membrane transporter proteins belonging to the secondary transporter and ATP-dependent transporter classes, 73 toxin-antitoxin systems, 23 M. tuberculosis-specific targets, 10 broad-spectrum targets and 34 targets implicated in persistence of M. tuberculosis had no detectable orthologues in M. smegmatis. Several of the MFS superfamily transporters act as drug efflux pumps and are hence associated with drug resistance in M. tuberculosis. The relative abundances of MFS and ABC superfamily transporters are higher in M. smegmatis than in M. tuberculosis. As these transporters are involved in carbohydrate uptake, their higher representation in M. smegmatis than in M. tuberculosis highlights the lack of proficiency of M. tuberculosis in assimilating diverse carbon sources. In the case of porins, MspA-like and OmpA-like porins are selectively present in either M. smegmatis or M. tuberculosis. These differences help to elucidate protein clusters for which M. smegmatis may not be the best model organism to study M. tuberculosis proteins.
At the domain level, the ATP-binding domain of ABC transporters, the tetracycline transcriptional regulator (tetR) domain family, the major facilitator superfamily (MFS) domain family, the AMP-binding domain family and the enoyl-CoA hydrolase domain family are highly represented in both the M. smegmatis and M. tuberculosis proteomes. These domains play an essential role in the carbohydrate uptake systems and drug-efflux pumps, among other diverse functions in mycobacteria. There are several differentially represented domain families in M. tuberculosis and M. smegmatis. For example, the pentapeptide-repeat, PE, PPE and PIN domains, although abundantly present in M. tuberculosis, are very rare in M. smegmatis. Therefore, such uniquely or differentially represented functional and structural domains in M. tuberculosis as compared to M. smegmatis may be linked to pathogenicity or adaptation of M. 
tuberculosis in the host. Hence, major differences between M. tuberculosis and M. smegmatis were identified, not only in terms of domain populations but also in terms of domain combinations. Thus, Chapter 7 highlights the similarities and differences between the M. smegmatis and M. tuberculosis proteomes in terms of structure and function. These differences provide an understanding of the selective utilization of M. smegmatis as a model organism to study M. tuberculosis.
In Chapter 8, computational tools have been employed to predict biologically feasible host-pathogen interactions between the human host and the pathogenic Leptospira interrogans. Sensitive profile-based search procedures were used to specifically identify practical drug targets in the genome of Leptospira interrogans, the causative agent of the globally widespread zoonotic disease leptospirosis. Traditionally, the genus Leptospira is classified into two species complexes: the pathogenic L. interrogans and the non-pathogenic saprophyte L. biflexa. The pathogen gains entry into the human host through direct or indirect contact with fluids of infected animals. Several ambiguities exist in the understanding of L. interrogans pathogenesis. An integration of multiple computational approaches, guided by experimentally derived protein-protein interactions, was utilized for recognition of host-pathogen protein-protein interactions. The initial step involved the identification of similarities of host and L. interrogans proteins with crystal structures of experimentally known transient protein-protein complexes. Further, conservation of interfacial nature was used to obtain high-confidence predictions for putative host-pathogen protein-protein interactions. These predictions were subjected to further selection based on subcellular localization of proteins of the human host and L. interrogans, and tissue-specific expression profiles of the host proteins. A total of 49 protein-protein interactions mediated by 24 L. 
interrogans proteins and 17 host proteins were identified, and these may be subjected to further experimental investigations to assess their in vivo relevance. The functional relevance of similarities and differences between the pathogenic and non-pathogenic leptospires in terms of interactions with the host has also been explored. For this, protein-protein interactions between the human host and the non-pathogenic saprophyte L. biflexa were also predicted. Nearly 39 leptospiral-host interactions were recognized to be similar across both the pathogen and the saprophyte in the context of processes that influence the host. The overlapping leptospiral-host interactions of L. interrogans and L. biflexa proteins with the human host proteins are primarily associated with establishment of entry into the human host. These include adhesion of the leptospiral proteins to host cells, survival in the host environment (such as iron acquisition) and binding to components of the extracellular matrix and plasma. The disjoint sets of leptospiral-host interactions are species-specific interactions and, more importantly, are indicative of the establishment of infection by L. interrogans in the human host and of immune clearance of L. biflexa by the human host. With respect to L. interrogans, these specific interactions include interference with the blood coagulation cascade and dissemination to target organs by means of disruption of cell junction assembly. On the other hand, species-specific interactions of L. biflexa proteins include those with components of the host immune system.
In spite of the limited availability of experimental evidence, these predictions help in identifying functionally relevant interactions between host and pathogen by integrating multiple lines of evidence. Thus, inferences from computational prediction of host-pathogen interactions act as guidelines for experimental studies investigating the in vivo relevance of these predicted protein-protein interactions. 
This will further help in developing effective measures for treatment and disease prevention. In summary, Chapters 2 and 3 describe the implementation, advantages and limitations of the alignment-free full-length sequence comparison method, CLAP. Chapters 4 and 5 are dedicated to understanding the domain-domain interactions in multi-domain protein sequences and structures. In Chapters 6, 7 and 8, the computational analyses of the mycobacterial and leptospiral species provided an enhanced understanding of the functional repertoire of these bacteria. These studies were undertaken by utilizing the biological sequence data available in public databases and by implementing powerful homology-detection techniques. The supplemental data associated with the chapters is provided on a compact disc attached to this thesis. Proteins - Building Blocks Protein Sequences Protein Domain Hidden Markov Models (HMM) Multi-domain Proteins Mycobacterium smegmatis MC2-155 Mycobacterium tuberculosis Proteomes Leptospira Interrogans Leptospira Biflexa Proteomes Leptospira Biflexa Genomes Mycobacterium tuberculosis H37Rv Mathematics
5	Bioinformatics mining of the dark matter proteome for cancer targets discovery Unknown Date (has links) Mining the human genome for therapeutic target discovery promises novel outcomes. Over half of the proteins in the human genome, however, remain uncharacterized. These proteins offer potential for the discovery of new targets for diverse diseases. Additional targets for cancer diagnosis and therapy are urgently needed to help move away from the cytotoxic era to a targeted therapy approach. Bioinformatics and proteomics approaches can be used to characterize novel sequences in the genome database to infer putative function. The hypothesis that the amino acid motifs and protein domains of the uncharacterized proteins can be used as a starting point to predict the putative function of these proteins provided the framework for the research discussed in this dissertation. / Includes bibliography. / Thesis (M.S.)--Florida Atlantic University, 2015. / FAU Electronic Theses and Dissertations Collection Bioinformatics Cancer -- Genetic aspects Drug development -- Data processing Genomics Medical informatics Proteomes -- Data processing Tumors -- Immunological aspects
6	Transcriptional profiling of shell calcification in bivalves Yarra, Tejaswi January 2018 Mollusc shells are unique adaptations that serve to protect the organisms that make them, and are a defining feature of the phylum. However, the molecular underpinnings of shell-forming processes are still largely unexplored. To further understand mollusc shell formation, I studied three bivalve species in this project: the blue mussel Mytilus edulis, the Pacific oyster Crassostrea gigas, and the king scallop Pecten maximus. While previous analyses of the shell proteomes showed species specificity, transcriptomes of the mantle tissues revealed more commonalities. To reconcile these differences, I studied differential gene expression in shell damage-repair experiments and during the formation of the first larval shell, to produce a comprehensive overview of shell formation processes. Expression data showed large biological variability between individuals, requiring matched-pair experimental designs to detect differential gene expression during shell repair. Loci differentially expressed during shell repair and in the larvae encoded shell matrix proteins, transmembrane transporters, and novel transcripts. A large number of shell matrix proteins, encoded in differentially expressed loci, were common to all three species during shell formation, indicating that shell-forming proteins may be more widely shared between species than previously thought. Differential expression of transmembrane transporters during shell repair indicated that the animals may be regulating bicarbonate ions during shell formation. Finally, the experiments revealed novel transcripts, with no annotation in public datasets, that may be involved in shell formation.
7	Estudo do proteoma e imunoproteoma salivar do carrapato de bovinos, Rhipicephalus (Boophilus) microplus, para identificação e caracterização de antígenos silenciosos / Study of salivary proteome and immunoproteome of cattle tick, Rhipicephalus (Boophilus) microplus, for identification and characterization of silent antigens Garcia, Gustavo Rocha 29 April 2013 Infestation with Rhipicephalus microplus, the cattle tick, causes huge economic losses to the livestock industry. Ticks are developing resistance to acaricides, which also leave residues in meat and milk. Anti-tick vaccines are a sustainable alternative for controlling infestations, but those currently available have only partial and transient effects, so new vaccine antigens need to be identified. To this end, this work exploits the fact that cattle show contrasting, heritable, breed-specific infestation phenotypes. Furthermore, the host's level of immunity affects gene transcription in the tick salivary glands, the organ that produces the proteins that mediate parasitism. The working hypothesis is that different levels of host anti-tick immunity also affect the salivary composition of the parasite. In ticks feeding on resistant hosts, proteins that are crucial to parasitism may therefore be absent from or deficient in the saliva, so that the ticks fail to complete their blood meal. Neutralization of these same proteins by humoral immunity could have the same effect, which makes them good vaccine antigens.
The aim of the study was therefore to identify new vaccine antigens in the saliva of females and in the salivary glands of nymphs, males and females of ticks fed on resistant and susceptible hosts, as well as in unfed larvae hatched from eggs of females fed on these same hosts. To this end, we used next-generation sequencing (RNA-Seq, 454) and proteomic approaches, namely differential in-gel analysis (DIGE) and Western blots (immunoproteome) followed by mass spectrometry sequencing, together with Multidimensional Protein Identification Technology (MudPIT), to describe the proteome of the salivary glands and saliva of females. The transcriptomic analysis yielded 1,999,086 reads, from which 11,676 coding sequences (CDS) were identified and classified; many of these (3,600 CDS) contain a predicted signal peptide indicative of secretion, so their products may be present in saliva and play an important role in blood feeding. MudPIT identified 321 different salivary proteins, in addition to 126 proteins found by DIGE and 266 proteins in the immunoproteomes. Many of these proteins can be considered potential antigens because they are associated with blood feeding and parasitism, such as proteases, nucleases, protease inhibitors, antimicrobial peptides and attachment proteins, among others, including proteins not yet characterized. Most of the genes encoding these proteins are more highly expressed in ticks fed on susceptible hosts, especially in male ticks. Moreover, many of these proteins are not recognized by bovine sera, including sera from infested hosts, although sera from infested, tick-resistant cattle account for most of the reactivities.
Taken together, the results suggest that, at the protein level, the composition of saliva is also affected by the host's level of immunity, and that it varies with the tick life cycle. We conclude that the research strategies employed were adequate to identify a set of R. microplus salivary antigens that represent target proteins for multicomponent anti-tick vaccines. Rhipicephalus microplus tick immunoproteome MudPIT and DIGE tick salivary proteomes
8	Analysis of the subsequence composition of biosequences Cunial, Fabio 07 May 2012 Measuring the amount of information and of shared information in biological strings, as well as relating information to structure, function and evolution, are fundamental computational problems in the post-genomic era. Classical analyses of the information content of biosequences are grounded in Shannon's statistical telecommunication theory, while the recent focus is on suitable specializations of the notions introduced by Kolmogorov, Chaitin and Solomonoff, based on data compression and compositional redundancy. Symmetrically, classical estimates of mutual information based on string editing are currently being supplanted by compositional methods hinged on the distribution of controlled substructures. Current compositional analyses and comparisons of biological strings are almost exclusively limited to short sequences of contiguous solid characters. Comparatively little is known about longer and sparser components, both from the point of view of their effectiveness in measuring information and in separating biological strings from random strings, and from the point of view of their ability to classify and to reconstruct phylogenies. Yet, sparse structures are suspected to capture long-range correlations and, at short range, they are known to encode signatures and motifs that characterize molecular families. In this thesis, we introduce and study compositional measures based on the repertoire of distinct subsequences of any length, but constrained to occur with a predefined maximum gap between consecutive symbols. Such measures highlight previously unknown laws that relate subsequence abundance to string length and to the allowed gap, across a range of structurally and functionally diverse polypeptides. 
Measures on subsequences are capable of separating only few amino acid strings from their random permutations, but they reveal that random permutations themselves amass along previously undetected, linear loci. This is perhaps the first time in which the vocabulary of all distinct subsequences of a set of structurally and functionally diverse polypeptides is systematically counted and analyzed. Another objective of this thesis is measuring the quality of phylogenies based on the composition of sparse structures. Specifically, we use a set of repetitive gapped patterns, called motifs, whose length and sparsity have never been considered before. We find that extremely sparse motifs in mitochondrial proteomes support phylogenies of comparable quality to state-of-the-art string-based algorithms. Moving from maximal motifs -- motifs that cannot be made more specific without losing support -- to a set of generators with decreasing size and redundancy, generally degrades classification, suggesting that redundancy itself is a key factor for the efficient reconstruction of phylogenies. This is perhaps the first time in which the composition of all motifs of a proteome is systematically used in phylogeny reconstruction on a large scale. Extracting all maximal motifs, or even their compact generators, is infeasible for entire genomes. In the last part of this thesis, we study the robustness of measures of similarity built around the dictionary of LZW -- the variant of the LZ78 compression algorithm proposed by Welch -- and of some of its recently introduced gapped variants. These algorithms use a very small vocabulary, they perform linearly in the input strings, and they can be made even faster than LZ77 in practice. We find that dissimilarity measures based on maximal strings in the dictionary of LZW support phylogenies that are comparable to state-of-the-art methods on test proteomes. 
Introducing a controlled proportion of gaps does not degrade classification, and allows up to 20% of each input proteome to be discarded during comparison. Subsequences Compositional complexity Phylogeny reconstruction Alignment-free sequence comparison Sparse motifs LZW LZWA Variance computation Protein domains Proteomes Phylogeny Polypeptides Molecular biology Algorithms
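The LZW-based comparison mentioned in the abstract of entry 8 can be illustrated with a minimal sketch. It builds the set of phrases that LZW adds to its dictionary while parsing each string in a single linear pass, then compares two strings through those sets. The function names `lzw_dictionary` and `lzw_dissimilarity` are hypothetical, and the Jaccard distance between dictionaries is an assumption chosen for illustration; the abstract does not specify the thesis's actual measure, and the gapped variants are not covered here.

```python
def lzw_dictionary(text: str) -> set:
    """Return the multi-symbol phrases LZW adds to its dictionary for `text`."""
    known = set(text)      # LZW starts from all single symbols of the input
    phrases = set()        # phrases learned during parsing
    current = ""
    for symbol in text:
        candidate = current + symbol
        if candidate in known:
            # Keep extending the current phrase while it is already known.
            current = candidate
        else:
            # Learn the new phrase and restart from the current symbol.
            known.add(candidate)
            phrases.add(candidate)
            current = symbol
    return phrases


def lzw_dissimilarity(a: str, b: str) -> float:
    """Jaccard distance between LZW dictionaries (illustrative assumption)."""
    dict_a, dict_b = lzw_dictionary(a), lzw_dictionary(b)
    union = dict_a | dict_b
    if not union:
        return 0.0         # neither string produced any phrase
    return 1.0 - len(dict_a & dict_b) / len(union)


if __name__ == "__main__":
    # Two toy amino acid strings differing in one residue.
    print(lzw_dissimilarity("MKVLAAGIVG", "MKVLSAGIVG"))
```

A pairwise matrix of such values over a set of proteomes could then be passed to any standard distance-based tree-building method; that step is outside this sketch.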
9	Mécanismes moléculaires impliqués dans la régulation post-traductionnelle du système de sécrétion du type VI chez Pseudomonas aeruginosa / Molecular mechanisms involved in the post-translational regulation of type VI secretion system in Pseudomonas aeruginosa Casabona, Maria Guillermina 13 May 2013 The Gram-negative bacterium Pseudomonas aeruginosa is an opportunistic human pathogen that can cause severe chronic infections and death, particularly in cystic fibrosis (CF) patients. One of its three Type VI Secretion Systems (T6SS), the H1-T6SS, has been shown to be active during chronic infections in CF patients. P. aeruginosa uses the H1-T6SS to inject bacteriolytic toxins directly into the periplasm of other Gram-negative bacteria, which suggests that this nanomachine could be crucial for its competitiveness in polymicrobial niches such as an infected lung. This trans-envelope nanomachine is post-translationally regulated by a eukaryotic-like phosphorylation pathway comprising a kinase, PpkA, and a phosphatase, PppA, which together modulate the phosphorylation level of the protein Fha1. In this work, TagT, TagS, TagR and TagQ, Pseudomonas-specific T6SS proteins encoded in the same operon as PpkA, PppA and Fha1, were analysed functionally and biochemically. We found that these four proteins act upstream of the PpkA/PppA phosphorylation checkpoint and are indispensable for the activation of the H1-T6SS. They were also needed for the intra- and inter-species fitness mediated by the H1-T6SS. We discovered that TagR, previously known as a periplasmic protein, associates with the outer membrane (OM) of P. aeruginosa in a TagQ-dependent manner; TagQ is a lipoprotein anchored in the inner leaflet of the OM, facing the periplasm. TagT and TagS form a membrane-bound ABC transporter with ATPase activity.
The association of TagR with the OM was discovered by shotgun mass spectrometry analyses aimed at characterising the OM and the inner membrane (IM) of P. aeruginosa. The resulting IM and OM sub-proteomes are also presented, and their analysis allowed a model of H1-T6SS assembly within the envelope to be proposed. This work identified more than 1,700 proteins, among them a novel envelope-associated multi-protein complex that includes MagD, a protein homologous to human macroglobulin. The studies concerning this protein and its partners in P. aeruginosa are also presented in this manuscript. Pseudomonas aeruginosa T6SS Posttranslational regulation Shotgun proteomics IM-OM sub-proteomes
