Global ETD Search

81	A knowledgebase of stress reponsive gene regulatory elements in arabidopsis Thaliana Adam, Muhammed Saleem January 2011 Stress-responsive genes play a key role in shaping the manner in which plants process and respond to environmental stress. Their gene products are linked to DNA transcription and its consequent translation into a response product. However, whilst these genes play a significant role in manufacturing responses to stressful stimuli, transcription factors coordinate access to these genes, specifically by accessing a gene's promoter region, which houses transcription factor binding sites. Here transcriptional elements play a key role in mediating responses to environmental stress, where each transcription factor binding site may constitute a potential response to a stress signal. Arabidopsis thaliana, a model organism, can be used to identify the mechanism by which transcription factors shape a plant's survival in a stressful environment. Whilst there are numerous plant stress research groups, globally there is a shortage of publicly available stress-responsive gene databases. In addition, a number of previous databases such as the Generation Challenge Programme's comparative plant stress-responsive gene catalogue, Stresslink and DRASTIC have become defunct whilst others have stagnated. There is currently a single Arabidopsis thaliana stress response database, STIFDB, which was launched in 2008 and only covers abiotic stresses as handled by major abiotic stress-responsive transcription factor families. Its data were sourced from microarray expression databases; it contains numerous omissions and erroneous entries and has not been updated since its inception. The Dragon Arabidopsis Stress Transcription Factor database (DASTF) was developed in response to the current lack of stress response gene resources. A total of 2333 entries were downloaded from SWISSPROT, manually curated and imported into DASTF.
The entries represent 424 transcription factor families. Each entry has a corresponding SWISSPROT, ENTREZ GENBANK and TAIR accession number. The 5' untranslated regions (UTRs) of 417 families were scanned against TRANSFAC's binding site catalogue to identify binding sites. The relational database consists of two tables, a transcription factor table and a transcription factor family table, called DASTF_TF and TF_Family respectively. Using a two-tier client-server architecture, a webserver was built with PHP, APACHE and MYSQL, and the data were loaded into these tables with a PYTHON script. Of the DASTF entries, 60 correspond to biotic stress, 167 to abiotic stress and 2106 to biotic and/or abiotic stress. Users can search the database using text, family, chromosome and stress type search options. Online tools have been integrated into the DASTF database, such as HMMER, CLUSTALW, BLAST and HYDROCALCULATOR. Users can upload sequences and use HMMER to identify which transcription factor family their sequences belong to. The website can be accessed at http://apps.sanbi.ac.za/dastf/ and two updates per year are envisaged.
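The two-table layout described in entry 81 can be sketched roughly as follows. This is only an illustration: it uses SQLite from the Python standard library rather than the MySQL server named in the abstract, only the table names DASTF_TF and TF_Family come from the abstract, and every column name, sample row and the `search` helper are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with the two tables named in the abstract."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE TF_Family (
            family_id   INTEGER PRIMARY KEY,   -- hypothetical column
            family_name TEXT NOT NULL UNIQUE   -- hypothetical column
        );
        CREATE TABLE DASTF_TF (
            tf_id       INTEGER PRIMARY KEY,   -- hypothetical column
            gene_name   TEXT NOT NULL,         -- hypothetical column
            chromosome  INTEGER,               -- hypothetical column
            stress_type TEXT CHECK (stress_type IN ('biotic', 'abiotic', 'both')),
            family_id   INTEGER REFERENCES TF_Family(family_id)
        );
        """
    )
    # Placeholder rows, not real DASTF data.
    conn.executemany(
        "INSERT INTO TF_Family VALUES (?, ?)",
        [(1, "FAMILY_A"), (2, "FAMILY_B")],
    )
    conn.executemany(
        "INSERT INTO DASTF_TF VALUES (?, ?, ?, ?, ?)",
        [
            (1, "GENE_1", 1, "abiotic", 1),
            (2, "GENE_2", 3, "biotic", 1),
            (3, "GENE_3", 5, "both", 2),
        ],
    )
    return conn


def search(conn, stress_type=None, family=None):
    """Return gene names matching optional stress-type and family filters."""
    sql = (
        "SELECT t.gene_name FROM DASTF_TF AS t "
        "JOIN TF_Family AS f ON t.family_id = f.family_id WHERE 1 = 1"
    )
    params = []
    if stress_type is not None:
        sql += " AND t.stress_type = ?"
        params.append(stress_type)
    if family is not None:
        sql += " AND f.family_name = ?"
        params.append(family)
    sql += " ORDER BY t.gene_name"
    return [row[0] for row in conn.execute(sql, params)]
```

Keeping family names in their own table means a family-level search is a single join, which matches the family and stress-type search options the abstract lists.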
83	Large-Scale Structural Analysis of Protein-ligand Interactions : Exploring New Paradigms in Anti-Tubercular Drug Discovery Anand, Praveen January 2015 Biological processes are governed through specific interactions of macromolecules. The three-dimensional structural information of the macromolecules is necessary to understand the basis of molecular recognition. A large number of protein structures have been determined at high resolution using experimental techniques such as X-ray crystallography, NMR and electron microscopy, and made publicly available through the Protein Data Bank. In recent years, comprehending function by studying a large number of related proteins has proved very fruitful for understanding their biological role and gaining mechanistic insights into molecular recognition. The availability of large-scale structural data has made the task of predicting protein function from three-dimensional structure feasible. Structural bioinformatics, a branch of bioinformatics, has evolved into a separate discipline to rationalize and classify the information present in three-dimensional structures and derive meaningful biological insights. This has provided a better understanding of biological processes at a higher resolution in several cases. Most structural bioinformatics approaches so far have focused on fold-level analysis of proteins and their relationship to sequences. It has long been recognized that sequence-fold and fold-function relationships are highly complex. Information on one aspect cannot be readily extrapolated to the other. To a significant extent, this can be overcome by understanding similarities in proteins by comparing their binding site structures.
In this thesis, the primary focus is on analyzing the small-molecule ligand binding sites in protein structures, as most biological processes, ranging from enzyme catalysis to complex signaling cascades, are mediated through protein-ligand interactions. Moreover, given that the precise geometry and the chemical properties of the residues at the ligand binding sites dictate the molecular recognition capabilities, focusing on these sites at the structural level is likely to yield more direct insights into protein function. The study of binding sites at the structural level poses several problems, mainly because the residues at the site may be sequentially discontinuous but spatially proximal. Further, the order of the binding site residues in the primary sequence in most cases has no significance for ligand binding. Compounding these difficulties are additional factors such as non-uniform contribution to binding from different residues, and size variations in binding sites even across closely related proteins. As a result, methods available to study ligand-binding sites in proteins, especially on a large scale, are limited, warranting exploration of new approaches. In the present work, new methods and tools have been developed to address some of these challenges in binding site analysis. First, a novel tool for site-based function annotation of protein structures, called PocketAnnotate, was developed (http://proline.biochem.iisc.ernet.in/pocketannotate/). PocketAnnotate detects the putative binding sites in a given protein structure and compares them to known binding sites in PDB to derive functional annotation in terms of ligand association. Since the tool derives functional annotation at the level of binding sites, it has an advantage over other methods that solely utilize fold or sequence information.
This becomes even more important for cases where there is no detectable homology with entries in existing databases, as PocketAnnotate does not depend on evolution-based information for annotation. Second, a web-accessible tool for in silico alanine scanning mutations of binding site residues, called ABS-Scan, has been developed (http://proline.biochem.iisc.ernet.in/abscan/). This tool helps in assessing the contribution of the individual binding site residues in the protein towards ligand recognition. All residues in a binding site are systematically mutated, one at a time, to alanine, and the ability of the corresponding mutant to bind a given ligand is analyzed. The contribution of each residue towards ligand binding is calculated through a ΔΔG value derived by comparing the binding affinity to that of the wild-type protein-ligand complex. Third, a database called Protein-Ligand Interaction Clusters (PLIC) has been developed to identify and analyze similarity across binding sites in PDB, and is provided in the form of a web-accessible database (http://proline.biochem.iisc.ernet/PLIC). Protein-ligand interactions are primarily explored using three different computational approaches: (i) binding site characteristics including pocket shape, nature of residues and interaction profiles with different kinds of chemical probes, (ii) atomic contacts between protein and ligands, and (iii) binding energetics involved in interactions, derived from scoring functions developed for docking. The information on variations in these features derived from different computational tools is also included in the database to enable the characterization of the binding sites. As a case study to demonstrate the usefulness of these tools, they have been applied to decipher the complexity of S-adenosyl methionine (SAM) interactions with proteins. Around 1,213 binding sites of SAM or SAM-like compounds could be extracted from the PLIC database.
The SAM or SAM-like compounds were observed to interact with ∼18 different protein-fold types. The variations in different protein-ligand contacts across fold types were analyzed. The fold-specific interaction properties and the contribution of individual residues towards SAM binding are identified. The tools developed and example analyses using them are described in Chapter 2. Chapter 3 describes a large-scale pocketome analysis from structural complexes in PDB, in an effort to characterize the known pocket space of protein-ligand interactions. Tools developed as described in Chapter 2 are used for this. A set of 84,846 binding sites compiled from PDB has been comprehensively analyzed with the objective of obtaining (a) a classification of binding sites, (b) sequence-fold-site relationships among proteins, (c) a minimal set of physicochemical attributes sufficient to explain ligand recognition specificity and (d) site-type specific signatures in terms of physicochemical features. A new method to describe binding sites was developed in the form of BScIds such that the structural fold information is well captured. Binding sites and similarities among them were abstracted in the form of networks where each node represents a binding site and an edge between two nodes represents significant similarity between the sites at the structural level. Pocketome networks were constructed from the large-scale information on protein-ligand interactions in the PLIC database. The large pocketome network was then studied to derive relationships between protein folds and the chemical entities they interact with. A classification of the binding pockets was achieved by analyzing the pocketome network using graph theoretical approaches combined with clustering methods. In total, 10,858 clusters were identified from the network, each indicating a site-type. Thus, it can be said that there are about 10,858 site-types.
Classification of ligand associations into specific site-types helps greatly in resolving the complex relationships by yielding specific site-type ligand associations. The observed classification was further probed to understand the basis of ligand recognition by representing the pockets through feature vectors. These features capture a wide range of physicochemical properties that can be used to derive site-type specific signatures and explore the pocket-space of protein-ligand interactions. A principal component analysis of these features reveals that binding site feature space is continuous in the entire PDB and minor changes in specific features can give rise to significant differences in ligand specificity, consequently defining their distinct functional roles. The weights were also derived for these features through the use of different information theoretic approaches to explain the multiple-specificity of protein-ligand interactions. Analysis of binding sites arising from contribution of residues from different protein fold-types revealed increasing diversity of physicochemical properties at the site, supporting the hypothesis that combination of folds could give rise to new binding sites. Given that a finer appreciation of the molecular mechanisms within the cell is possible only with the structural information, the next objective was to explore if a structural view of an entire proteome can be obtained and if a pocketome could be constructed and analyzed. With this in mind, the causative agent of tuberculosis - Mycobacterium tuberculosis (Mtb) was chosen. Mtb is also being studied in the laboratory from a systems biology perspective, which enabled exploration of how systems and the structural perspectives could be combined and applied for drug discovery. Chapters 4 to 6 describe this effort. 
The genome sequence of Mycobacterium tuberculosis (Mtb) H37Rv indicates the presence of ∼4,000 protein coding genes, of which experimentally determined structures are available for ∼300 proteins. Further, advances in homology modeling methods have made it feasible to obtain structural models for many more proteins in the proteome. Chapter 4 describes the efforts to obtain the Mtb structural proteome, through which three-dimensional structures were derived for ∼70% of the proteins in the genome. Functional annotation of each protein was derived based on fold-based functional assignments, binding-site comparisons and consequent ligand associations. PocketAnnotate, a site-based function annotation pipeline described in Chapter 2, was utilized for this purpose. Besides these, the annotation covers detection of various sequence and sub-structural motifs and quaternary structure predictions based on the corresponding templates. The study provides a unique opportunity to obtain a global perspective of the fold distribution in the genome. The annotation indicates that cellular metabolism can be achieved with only 219 unique folds. New insights about the folds that predominate in the genome, as well as the fold combinations that make up multi-domain proteins, are also obtained. In all, 1,728 binding pockets have been associated with ligands through binding site identification and sub-structure similarity analyses, yielding a list of ligands that can participate in various biochemical events in the mycobacterial cell. A web-accessible database, MtbStructuralproteome, has been developed to make the data and the analyses available to the community (http://proline.physics.iisc.ernet.in/Tbstructuralannotation). The resource, being one of the first to be based on structure-based functional annotations at a genome scale, is expected to be useful for a better understanding of tuberculosis and for application in drug discovery.
The reported annotation pipeline is fairly generic and can be applied to other genomes as well. Chapter 5 describes the characterization of the Mtb pocketome. For the structural models of the Mtb proteome described in Chapter 4, a genome-scale binding site prediction exercise was carried out using three different computational methods, from which consensus predictions were then obtained. The three methods were independent and were based on geometry, inter-molecular energies with probes, and sequence conservation in evolutionarily related proteins, respectively. In all, 13,858 consensus binding pockets were predicted in 2,877 proteins. The pocket space within Mtb was then explored through systematic all-pair comparisons of binding sites. The number of site-types within Mtb was found to be 6,584, as compared to the ∼400 structural folds and 1,831 unique sequence families. This reveals that the pocket space is larger than the sequence or fold space, suggesting that variations at the site level contribute significantly to the functional repertoire of the organism. By comparing the pockets with the PDB sites enclosing known ligands, around 6,906 binding sites were observed to exhibit significant similarity, across the entire pocket, to at least one known binding site in PDB. In all, 1,213 metabolites could be mapped onto 665 enzymes covering most of the metabolic pathways. The identified ligands serve as a predicted metabolome for unit abundances of the proteins. A list of proteins containing unique pockets is also identified. The binding pockets, the similarities they share within Mtb and the ligands mapped onto them are all made available in a web-accessible database at http://proline.biochem.iisc.ernet.in/mtbpocketome/. The availability of structural information on the pocketome at a genome scale opens up several opportunities in drug discovery. It can be directly applied to understanding the mechanism of drug action and to predicting adverse effects and the pharmacodynamics of a drug.
Moreover, it enables exploration of new ideas in drug discovery. Polypharmacology is a new concept that aims at modulating multiple drug targets through a single chemical entity. Currently, there are no established approaches to either select appropriate target sets or design polypharmacological drugs. In this study, a structural-proteomics approach is explored to first characterize the pocketome and then utilize it to identify similar binding sites. The knowledge of similarity relationships between the binding sites within the genome can be used in identifying possible polypharmacological drug targets. A pocket-similarity-based clustering of binding site residues resulted in the identification of binding site sets, each having a theoretical potential to interact with a common ligand. A polypharmacological index was formulated to rank targets by incorporating a measure of druggability and similarity to other pockets within the proteome. By comparing with known drug binding sites from databases such as DrugBank, the study has yielded a ready shortlist that includes sets of promising drug targets with polypharmacological possibilities, and at the same time has identified possible drug candidates, either directly for repurposing or at the least as significant lead clues that can be used to design new drug molecules against the entire group of proteins in each set. This analysis presents a rational approach to identify targets with polypharmacological potential, clues about lead compounds and a list of candidates for drug repurposing. This thesis demonstrates the feasibility of utilizing structural bioinformatics approaches at a genome scale. The tools developed for analyzing large-scale data on protein-ligand interactions could be applied to characterize the pocket space of protein-ligand interactions. The network theory approaches applied in this work make large-scale data tractable and enable binding-site typing.
The binding site analysis at a genome scale for Mtb is the first of its kind and has provided novel insights into the pocket space. It also provided an opportunity to rationalize polypharmacological target selection and explore drugs for repurposing in TB. In the larger context, structural modelling of a proteome, mapping the small-molecule binding space in it and understanding the determinants of small-molecule recognition form a major step in defining a proteome at higher resolution. This in turn will serve as a valuable input towards the emerging field of structural-systems biology, which seeks to understand biological models at a systems level without compromising on the resolution of the study. Protein-ligand Interactions Anti-tubercular Drug Discovery Protein Structure Mycobacterium tuberculosis Protein–ligand Complex Proteome PDB Pocketome Mycobacterium tuberculosis FtsZ Helicobacter pylori DprA Binding Site Analyses PocketAnnotate Mycobacterium tuberculosis Pocketome Biochemistry
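The pocketome network described in entry 83 (one node per binding site, an edge wherever two sites are significantly similar, and clusters read as site-types) can be sketched as below. This is a minimal illustration under stated assumptions: the abstract does not name the clustering method, so connected components found with a union-find structure stand in for it, and the site identifiers, similarity scores and threshold are all hypothetical.

```python
def cluster_sites(site_ids, similarities, threshold):
    """Group binding sites into site-types.

    site_ids     -- iterable of site identifiers (the network nodes)
    similarities -- dict mapping (site_a, site_b) pairs to a similarity score
    threshold    -- minimum score for two sites to share an edge (assumed cutoff)

    Stand-in method: connected components of the thresholded similarity graph.
    """
    parent = {site: site for site in site_ids}

    def find(site):
        # Follow parent links to the component root, compressing the path.
        while parent[site] != site:
            parent[site] = parent[parent[site]]
            site = parent[site]
        return site

    # Add an edge (merge two components) for every sufficiently similar pair.
    for (site_a, site_b), score in similarities.items():
        if score >= threshold:
            root_a, root_b = find(site_a), find(site_b)
            if root_a != root_b:
                parent[root_b] = root_a

    # Collect the members of each component.
    clusters = {}
    for site in site_ids:
        clusters.setdefault(find(site), []).append(site)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical example: five sites and four pairwise scores.
demo_sites = ["s1", "s2", "s3", "s4", "s5"]
demo_scores = {
    ("s1", "s2"): 0.90,
    ("s2", "s3"): 0.80,
    ("s3", "s4"): 0.20,
    ("s4", "s5"): 0.75,
}
demo_clusters = cluster_sites(demo_sites, demo_scores, threshold=0.7)
```

With a cutoff of 0.7 the weak s3-s4 link is dropped, so the five sites fall into two site-types. Connected components are the coarsest possible grouping; a real analysis would likely use a stricter community-detection step.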
84	Transcriptional regulatory networks in the mouse hippocampus MacPherson, Cameron Ross January 2007 Magister Scientiae - MSc / Neurological diseases are socially disabling and often fatal. To efficiently combat these diseases, a deep understanding of the cellular processes, gene functions and anatomy involved is required. However, differential regulation of genes across anatomy is not sufficiently well understood. This study utilized large-scale gene expression data to define the regulatory networks of genes expressed in the hippocampus, with which multiple disease pathologies may be associated. Specific aims were to identify key regulatory transcription factors (TFs) responsible for observed gene expression patterns, reconstruct transcription regulatory networks, and prioritize likely TFs responsible for anatomically restricted gene expression. Most of the analysis was restricted to the CA3 sub-region of Ammon’s horn within the hippocampus. We identified 155 core genes expressed throughout the CA3 sub-region and predicted the corresponding TF binding site (TFBS) distributions. Our analysis shows plausible transcription regulatory networks for twelve clusters of co-expressed genes. We demonstrate the validity of the predictions by re-clustering genes based on TFBS distributions, and found that genes tend to be correctly assigned to groups of previously identified co-expressed genes, with a sensitivity of 67.74% and a positive predictive value of 100%. Taken together, this study represents one of the first to merge anatomical architecture, expression profiles and transcription regulatory potential on such a large scale in hippocampal sub-anatomy. / South Africa Brain / Neuro-anatomy Hippocampus Ammon's horn Neurodegenerative disease Alzheimer's disease Gene expression Transcription regulation Regulatory potential Transcription factor Promoter Transcription factor binding site Transcription regulatory network Allen Brain Atlas
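The two validation figures quoted in entry 84 follow the standard definitions of sensitivity, TP / (TP + FN), and positive predictive value, TP / (TP + FP). The sketch below shows the arithmetic. The abstract does not report the underlying counts, so the counts here (21 true positives, 10 false negatives, 0 false positives) are an assumption, chosen only because they reproduce the reported percentages.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of genuinely co-expressed genes that were assigned correctly."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos, false_pos):
    """Fraction of assignments that were correct."""
    return true_pos / (true_pos + false_pos)


# Assumed counts (not given in the abstract), picked to match the reported values.
assumed_tp, assumed_fn, assumed_fp = 21, 10, 0

sens_percent = round(100 * sensitivity(assumed_tp, assumed_fn), 2)
ppv_percent = round(100 * positive_predictive_value(assumed_tp, assumed_fp), 2)
```

A positive predictive value of 100% alongside a lower sensitivity means no gene was placed in a wrong group, while roughly a third of the genes were not recovered.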
85	Structural bioinformatics tools for the comparison and classification of protein interactions Garma, L. D. (Leonardo D.) 08 August 2017 Abstract Most proteins carry out their functions through interactions with other molecules. Thus, proteins taking part in similar interactions are likely to carry out related functions. One way to determine whether two proteins take part in similar interactions is by quantifying the likeness of their structures. This work focuses on the development of methods for the comparison of protein-protein and protein-ligand interactions, as well as their application to structure-based classification schemes. A method based on the MultiMer-align (or MM-align) program was developed and used to compare all known dimeric protein complexes. The results of the comparison demonstrate that the method improves over MM-align in a significant number of cases. The data were employed to classify the complexes, resulting in 1,761 different protein-protein interaction types. Through a statistical model, the number of existing protein-protein interaction types in nature was estimated at around 4,000. The model allowed the establishment of a relationship between the number of quaternary families (sequence-based groups of protein-protein complexes) and quaternary folds (structure-based groups). The interactions between proteins and small organic ligands were studied using sequence-independent methodologies. A new method was introduced to test three similarity metrics. The best of these metrics was subsequently employed, together with five other existing methodologies, to conduct an all-to-all comparison of all the known protein-FAD (Flavin-Adenine Dinucleotide) complexes. The results demonstrate that the new methodology best captures the similarities between complexes in terms of protein-ligand contacts. Based on the all-to-all comparison, the protein-FAD complexes were subsequently separated into 237 groups.
In the majority of cases, the classification divided the complexes according to their annotated function. Using a graph-based description of the FAD-binding sites, each group could be further characterized and uniquely described. The study demonstrates that the newly developed methods are superior to the existing ones. The results indicate that both the known protein-protein and the protein-FAD interactions can be classified into a reduced number of types and that in general terms these classifications are consistent with the proteins' functions. / Summary Most protein function takes place in interaction with other molecules. Proteins that take part in similar interactions probably function in the same way. The likelihood that two proteins occur in similar interaction settings can be determined by examining their structural similarity. This doctoral thesis deals with the development of methods for comparing protein-protein and protein-ligand interactions, and with their application in structure-based classification schemes. Known dimeric protein complexes were studied with a new method based on the MultiMer-align (MM-align) program. The results of the comparison show that the new method performed better than MM-align in a significant share of cases. The results were also used to classify the complexes, yielding 1,761 different types of protein-protein interaction. Using a statistical model, the number of protein-protein interaction types occurring in nature was estimated at about 4,000. The statistical model also made it possible to compare the numbers of protein complexes grouped by sequence ("quaternary families") and by structure ("quaternary folds"). Interactions between proteins and small organic ligands were studied with sequence-independent methods. A new method was used to test three different similarity metrics.
The best of these was used, together with five other known methods, to compare all known protein-FAD (flavin adenine dinucleotide) complexes. In terms of protein-ligand contacts, the new method described the similarity of the complexes better than the other methods. Using the results of the comparison, the protein-FAD complexes were further classified into 237 groups. In most cases the classification scheme succeeded in dividing the complexes into groups according to their function. The groups could be defined unambiguously by a graph-based description of the FAD binding site. The thesis shows that the methods developed in it are better than previously used methods. The results show that both protein-protein and protein-FAD interactions can be classified into a limited number of interaction types and that the classification is generally consistent with protein function. binding site comparison flavin adenine dinucleotide (FAD) protein-protein interactions structural alignment structural bioinformatics structural similarity
86	Computational Methods for Inferring Transcription Factor Binding Sites Morozov, Vyacheslav January 2012 Position weight matrices (PWMs) have become a tool of choice for the identification of transcription factor binding sites in DNA sequences. PWMs are compiled from experimentally verified and aligned binding sequences. PWMs are then used to computationally discover novel putative binding sites for a given protein. DNA-binding proteins often show degeneracy in their binding requirements; the overall binding specificity of many proteins is unknown and remains an active area of research. Although PWMs are more reliable predictors than consensus string matching, they generally result in a high number of false positive hits. A previous study introduced a novel method of PWM training that uses the known motifs to sample additional putative binding sites from a proximal promoter area. The core idea was further developed, implemented and tested in this thesis with a large-scale application. Improved mono- and dinucleotide PWMs were computed for Drosophila melanogaster. The Matthews correlation coefficient was used as an optimization criterion in the PWM refinement algorithm. The new PWMs take account of non-uniform background nucleotide distributions on the promoters and consider a larger number of new binding sites during the refinement steps. The optimization included the PWM motif length, the position on the promoter, the threshold value and the binding site location. The predictions obtained were compared for mono- and dinucleotide PWM versions with initial matrices and with conventional tools. The optimized PWMs predicted new binding sites with better accuracy than conventional PWMs. machine learning transcriptional regulatory sites transcription factor protein binding PWM PSSM DNA sequence optimization statistics binding site prediction computational methods Matthews correlation Drosophila melanogaster binding motif weight matrix DNA sequence analysis
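The ingredients named in entry 86 (a PWM built from aligned sites, a non-uniform background, a score threshold and the Matthews correlation coefficient) can be sketched as follows. This is a generic textbook mononucleotide log-odds PWM, not the thesis's refinement algorithm; the aligned sites, background frequencies, pseudocount, test sequence and threshold are all hypothetical.

```python
import math


def build_pwm(sites, background, pseudocount=1.0):
    """Build a log-odds PWM (one dict per position) from equal-length aligned sites."""
    pwm = []
    for pos in range(len(sites[0])):
        column = {}
        for base in "ACGT":
            count = sum(1 for site in sites if site[pos] == base)
            freq = (count + pseudocount) / (len(sites) + 4 * pseudocount)
            # Log-odds against a (possibly non-uniform) background frequency.
            column[base] = math.log2(freq / background[base])
        pwm.append(column)
    return pwm


def scan(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at or above the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, score))
    return hits


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical aligned sites and background, for illustration only.
demo_sites = ["TATAAA", "TATAAT", "TATATA", "TACAAA"]
demo_background = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
demo_pwm = build_pwm(demo_sites, demo_background)
demo_hits = scan(demo_pwm, "GGCGTATAAAGGC", threshold=4.0)
```

In a refinement loop of the kind the abstract describes, one would vary parameters such as motif length and threshold and keep the setting that maximizes `matthews_corrcoef` on sites with known labels.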
87	Cartographie des interfaces protéine-protéine et recherche de cavités droguables / Cartography of protein-protein interfaces and research of drugable cavities Da Silva, Franck 23 September 2016 Protein-protein interfaces are at the heart of many physiological mechanisms in living organisms. Characterizing them at the molecular level is therefore crucial for the search for new drug candidates. We propose here new methods for analysing protein-protein interfaces with pharmaceutical aims. Our automated protocol detects interfaces within Protein Data Bank structures in order to define interaction zones with pharmacological potential, druggable cavities, ligands present at the interface, and pharmacophores deduced directly from the cavities. Our method makes it possible to survey the information available on protein-protein interfaces and to predict new potential targets for drug-candidate molecules. / Protein-protein interfaces are involved in many physiological mechanisms of living cells. Their characterization at the molecular level is therefore crucial in drug discovery. We propose here new methods for the analysis of protein-protein interfaces of pharmaceutical interest. Our automated protocol detects the biologically relevant interfaces within the Protein Data Bank structures, druggable cavities, ligands present at the interface and pharmacophores derived directly from the cavities. Our method provides a state-of-the-art overview of all available structural information about protein-protein interfaces and predicts potential new targets for drug candidates. Database Bioinformatics Computational biology Cavities Cheminformatics Computational chemistry Classifiers Protein-protein interface Pharmacophore Protein Binding site 541.2 572.8
88	Strukturní a funkční charakterizace inhibice flavivirové methyltransferasy / Structural and functional characterization of a flaviviral methyltransferase Kúdelová, Veronika January 2021 (has links) Recently, non-cellular viral agents have become the focus of a large number of scientific groups. A prominent and widespread group of these viruses is the flaviviruses, which include, for example, Zika virus, Dengue fever virus, tick-borne encephalitis virus and West Nile virus. There is considerable diversity among these viruses; however, highly conserved proteins can be found throughout this viral genus. The largest and most conserved protein encoded by flaviviruses is the nonstructural NS5 protein. Its N-terminal domain bears the methyltransferase (MTase) activity. By methylating the viral genome, it allows the virus to initiate translation and at the same time masks the genome from the host's immune system. By blocking the active site of this enzyme with a small molecule, viral infection could be stopped not only in one flavivirus but, due to the high conservation of MTases, in all other flaviviruses. This diploma thesis deals with the aforementioned MTase domain of the NS5 protein, specifically that of the West Nile virus (WNV). After an insert encoding the WNV MTase domain was designed, amplified and ligated into the vector, the MTase domain was prepared by recombinant expression followed by purification. Subsequently, complexes of the protein with small molecules (MTase ligands) were formed, in...
89	Multi-Scale Host-Aware Modeling for Analysis and Tuning of Synthetic Gene Circuits for Bioproduction Santos Navarro, Fernando Nóbel 20 June 2022 (has links) This Thesis was devoted to the multi-scale host-aware analysis and tuning of synthetic gene circuits for bioproduction. The main objectives were: 1. The development of a reduced-size host-aware model for simulation and analysis purposes. 2. The development of a software toolbox for modeling and simulation, oriented to synthetic biology. 3. The implementation of a multi-scale model that considers the scales relevant to bioproduction (bioreactor, cell, and synthetic circuit). 4. The host-aware analysis of the antithetic controller, as an example of the application of the developed tools. 5. The development and experimental validation of robust control laws for continuous bioreactors. The work presented in this Thesis covers the three scales of the bioproduction process. The first scale is the bioreactor: this scale considers the macroscopic substrate and biomass dynamics and how these dynamics connect to the internal state of the cells. The second scale is the host cell: this scale considers the internal dynamics of the cell and the competition for limited shared resources for protein expression. The third scale is the synthetic genetic circuit: this scale considers the dynamics of expressing exogenous synthetic circuits and the burden they induce on the host cell. Finally, as a "fourth" scale, part of the Thesis was devoted to developing software tools for modeling and simulation. This document is divided into seven chapters. Chapter 1 is an overall introduction to the Thesis work and its justification; it also presents a visual map of the Thesis and lists the main contributions. Chapter 2 shows the development of the host-aware model (Chapters 4 and 5 make use of this model for their simulations). Chapter 3 presents OneModel, a software tool developed in the Thesis that facilitates modeling and simulation for synthetic biology and, in particular, the use of the host-aware model. Chapter 4 uses the host-aware model to assemble the multi-scale model considering the bioreactor and analyzes the titer, productivity (rate), and yield in expressing an exogenous protein. Chapter 5 analyzes a more complex circuit, the recently proposed and highly cited antithetic biomolecular controller, using the host-aware model. Chapter 6 shows the design of nonlinear control strategies that allow robust control of the biomass concentration in a continuous bioreactor. Chapter 7 summarizes and presents the main conclusions of the Thesis. Appendix A shows the theoretical development of the host-aware model. This Thesis emphasizes the importance of studying cell burden in biological systems, since these effects are very noticeable and generate interactions between seemingly unconnected circuits. The Thesis provides tools to model, simulate and design synthetic genetic circuits taking these burden effects into account, and allows the development of models that connect phenomena in synthetic genetic circuits, ranging from the intracellular dynamics of gene expression to the macroscopic dynamics of the population of cells inside the bioreactor. / This research was funded by MCIN/AEI/10.13039/501100011033 grant number PID2020-117271RB-C21, and MINECO/AEI, EU grant number DPI2017-82896-C2-1-R. The author was recipient of the grant “Programa para la Formación de Personal Investigador (FPI) de la Universitat Politècnica de València - Subprograma 1 (PAID-01-2017)”. The author was also a grantee of the predoctoral stay “Ayudas para Movilidad de Estudiantes de Doctorado de la Universitat Politècnica de València 2019”. The Control Theory and Systems Biology Lab of ETH Zürich is acknowledged for hosting the author during the predoctoral stay and for their valuable collaboration in sharing knowledge. / Santos Navarro, FN. (2022). Multi-Scale Host-Aware Modeling for Analysis and Tuning of Synthetic Gene Circuits for Bioproduction [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/183473 / TESIS / Premios Extraordinarios de tesis doctorales Promoters Dynamic regulation Simulation Modeling Gene expression Bioreactor Cell loading Synthetic biology Ribosome Binding Site (RBS) INGENIERIA DE SISTEMAS Y AUTOMATICA
90	Úloha variabilních řetězců na rozhraní podjednotek ve formování ATP-vazebné kapsy a funkci P2X4 receptoru / Role of variable chains at the interface between subunits in forming ATP-binding pocket and function of P2X4 receptor Tvrdoňová, Vendula January 2014 (has links) Crystallization of the zebrafish P2X4 receptor in both open and closed states revealed conformational differences in the ectodomain structures, including the dorsal fin and left flipper domains. The role of these domains in forming the ATP-binding pocket and in receptor function was investigated by alanine scanning mutagenesis of the R203-L214 (dorsal fin) and D280-N293 (left flipper) sequences of the rat P2X4 receptor and by examining the responsiveness to ATP and the orthosteric analog agonists 2-(methylthio)adenosine 5'-triphosphate, adenosine 5'-(γ-thio)triphosphate, 2'(3')-O-(4-benzoylbenzoyl)adenosine 5'-triphosphate, and α,β-methyleneadenosine 5'-triphosphate. ATP potency/efficacy was reduced in 15 out of 26 alanine mutants. The R203A, N204A, and N293A mutants were essentially non-functional, but receptor function was restored by ivermectin, an allosteric modulator. The I205A, T210A, L214A, P290A, G291A, and Y292A mutants exhibited significant changes in the responsiveness to orthosteric analog agonists. In contrast, the responsiveness of the L206A, N208A, D280A, T281A, R282A, and H286A mutants to analog agonists was comparable to that of the wild-type receptor. These experiments, together with homology modeling, indicate that residues of the first group located in the upper part of...
