1 |
Discovery and characterization of polyamine analogues as inhibitors of the Plasmodium falciparum polyamine pathway using cheminformatics
De Bruin, Jurgens Jacobus January 2009
Thesis (M.Sc. (Biochemistry))--University of Pretoria, 2008. / Includes summary.
|
2 |
Development of tools to provide prioritisation and guidance in the development of chemical probes and small molecule leads
Bradley, Anthony Richard January 2015
Experimental methodological developments in measuring protein-ligand interactions for small molecule (<900 Da) drug discovery have led to an influx of data. In some instances this data overload has been overwhelming and can complicate rather than inform decision-making during drug discovery. The focus of this thesis is thus to develop novel methods to contextualise and extract useful information that helps medicinal and computational chemists make sense of available data and improve the productivity of drug discovery. Specifically, I have developed computational tools to structure and analyse protein-ligand interaction data. I generated 48 novel ligand-bound protein structures for the bromodomain of BAZ2B and the JmjC domain of KDM4 using experimental fragment soaking. This quantity of structures required time-consuming and subjective analysis. A 3D interactive visualisation tool, WONKA, was therefore developed. WONKA displays interesting and unusual features (e.g. residue motions) within ensembles of protein-ligand structures and allows for sharing of observations between scientists. WONKA does not consider protein-ligand activity data, so I developed OOMMPPAA. OOMMPPAA is an interactive 3D visualisation tool that incorporates protein-ligand activity data with protein-ligand structural data using 3D matched molecular pairs. OOMMPPAA highlights nuanced structure-activity relationships (SAR) and summarises available protein-ligand activity data in the protein context. WONKA and OOMMPPAA form a data model and platform to analyse structural and activity data. The extensibility and utility of this data model are demonstrated by the development of two further tools. The first, GLOOP, suggests ligand modifications from large datasets, and provides quantification of the importance of putatively important moieties. The second, LLOOMMPPAA, designs synthetically tractable molecules that explore a diverse range of protein-ligand interactions. LLOOMMPPAA has been shown experimentally to provide useful SAR. The tools described in this thesis provide novel analyses of and a framework for investigating protein-ligand interaction data.
|
3 |
Application of Data Pipelining Technology in Cheminformatics and Bioinformatics
Mao, Linyong 12 1900
Submitted to the faculty of the University Graduate School in partial fulfillment of the requirements for the degree Master of Sciences in the School of Informatics Indiana University December 2002 / Data pipelining is the processing, analysis, and mining of large volumes of data through a branching network of computational steps. A data pipelining system consists of a collection of modular computational components and a network for streaming data between them. By defining a logical path for data through a network of computational components and configuring each component accordingly, a user can create a protocol to perform virtually any desired function with data and extract knowledge from them. A set of data pipelines was constructed to explore the relationship between the biodegradability and structural properties of halogenated aliphatic compounds in a data set in which each compound has one degradation rate and nine structure-derived properties. After training, the data pipeline was able to predict the degradation rates of new compounds relatively accurately. A second set of data pipelines was generated to cluster new DNA sequences. The data pipelining technology was applied to identify a core sequence to represent a DNA cluster and to construct the 95% confidence distance interval for the cluster. The results show that 74% of the DNA sequences were correctly clustered and that there was no false clustering.
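As a rough illustration of the pipelining idea described in this abstract (modular components joined by a stream of records), a minimal Python sketch follows. It is not taken from the thesis: the component names (`read_records`, `keep_valid`, `add_log_rate`, `run_pipeline`) and the toy records are hypothetical, and the sketch chains components linearly rather than through the branching network the abstract mentions.

```python
import math
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, float]
Component = Callable[[Iterator[Record]], Iterator[Record]]


def read_records(rows: Iterable[Record]) -> Iterator[Record]:
    """Source component: stream copies of the input records one at a time."""
    for row in rows:
        yield dict(row)


def keep_valid(records: Iterator[Record]) -> Iterator[Record]:
    """Filter component: drop records that lack a positive 'rate' value."""
    for rec in records:
        if rec.get("rate", 0.0) > 0.0:
            yield rec


def add_log_rate(records: Iterator[Record]) -> Iterator[Record]:
    """Transform component: attach the base-10 logarithm of 'rate'."""
    for rec in records:
        rec["log_rate"] = math.log10(rec["rate"])
        yield rec


def run_pipeline(rows: Iterable[Record], components: List[Component]) -> List[Record]:
    """Connect the components in order and pull every record through them."""
    stream = read_records(rows)
    for component in components:
        stream = component(stream)
    return list(stream)


if __name__ == "__main__":
    # Hypothetical toy data: one degradation rate per compound.
    data = [{"rate": 10.0}, {"rate": 0.0}, {"rate": 100.0}]
    for rec in run_pipeline(data, [keep_valid, add_log_rate]):
        print(rec)
```

Because each component only consumes and yields records, components can be reordered or swapped without touching the others, which is the property the abstract attributes to a pipelining system.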
|
4 |
Explorace chemického prostoru za pomoci scaffold hoppingu / Scaffold hopping-based exploration of chemical space
Mikeš, Marek January 2014
This work is based on the Molpher software project, a client-server application that aids exploration of the chemical space between two input molecules. The aim of this master's thesis was to modify the current version of the program to support the scaffold hopping technique. This technique represents a molecule in a simplified form, called a scaffold. First, several levels of granularity had to be defined, along with morphing operators for each level. The server was also modified with respect to parallelization. An experimental exploration of chemical space with and without the new feature is also part of this work.
|
5 |
Application of multivariate statistics and machine learning to phenotypic imaging and chemical high-content data
Wildenhain, Jan January 2016
Image-based high-content screens (HCS) hold tremendous promise for cell-based phenotypic screens. Challenges related to HCS include not only storage and management of data, but also critical analysis of the complex image-based data. I implemented a data storage and screen management framework and developed approaches for data analysis of a number of high-content microscopy screen formats. I visualized and analysed pilot screens to develop a robust multi-parametric assay for the identification of genes involved in DNA damage repair in HeLa cells. Further, I developed and implemented new approaches for image processing and screen data normalization. My analyses revealed that the ubiquitin ligase RNF8 plays a central role in the DNA-damage response and that a related ubiquitin ligase, RNF168, causes the cellular and developmental phenotypes characteristic of RIDDLE syndrome. My approaches also uncovered a role for the MMS22L-TONSL complex in DSB repair and its role in the recombination-dependent repair of stalled or collapsed replication forks. The discovery of novel bioactive molecules is a challenge because the fraction of active candidate molecules is usually small and confounded by noise in experimental readouts. Cheminformatics can improve robustness of chemical high-throughput screens and functional genomics data sets by taking structure-activity relationships into account. I applied statistics, machine learning and cheminformatics to different data sets to discern novel bioactive compounds. I showed that phenothiazines and apomorphines are regulators of cell differentiation in murine embryonic stem cells. Further, I pioneered computational methods for the identification of structural features that influence the degradation and retention of compounds in the nematode C. elegans. I used chemoinformatics to assemble a comprehensive screening library of previously approved drugs for redeployment in new bioassays. A combination of chemical genetic interactions, cheminformatics and machine learning allowed me to predict novel synergistic antifungal small molecule combinations from sensitized screens with the drug library. In another study on the biological effects of commonly prescribed psychoactive compounds, I discovered a strong link between lipophilicity and bioactivity of compounds in yeast and unexpected off-target effects that could account for unwanted side effects in humans. I also investigated structure-activity relationships and assessed the chemical diversity of a compound collection that was used to probe chemical-genetic interactions in yeast. Finally, I have made these methods and tools available to the scientific community, including an open source software package called MolClass that allows researchers to make predictions about bioactivity of small molecules based on their chemical structure.
|
6 |
Computational approaches to predicting drug induced toxicity
Marchese Robinson, Richard Liam January 2013
Novel approaches and models for predicting drug-induced toxicity in silico are presented. Typically, these were based on Quantitative Structure-Activity Relationships (QSAR). The following endpoints were modelled: mutagenicity, carcinogenicity, inhibition of the hERG ion channel and the associated arrhythmia - Torsades de Pointes. A consensus model was developed based on Derek for Windows™ and Toxtree and used to filter compounds as part of a collaborative effort resulting in the identification of potential starting points for anti-tuberculosis drugs. Based on the careful selection of data from the literature, binary classifiers were generated for the identification of potent hERG inhibitors. These were found to perform competitively with, or better than, those computational approaches previously presented in the literature. Some of these models were generated using Winnow, in conjunction with a novel proposal for encoding molecular structures as required by this algorithm. The Winnow models were found to perform comparably to models generated using the Support Vector Machine and Random Forest algorithms. These studies also emphasised the variability in results which may be obtained when applying the same approaches to different train/test combinations. Novel approaches to combining chemical information with Ultrafast Shape Recognition (USR) descriptors are introduced: Atom Type USR (ATUSR) and a combination between a proposed Atom Type Fingerprint (ATFP) and USR (USR-ATFP). These were applied to the task of predicting protein-ligand interactions - including the prediction of hERG inhibition. Whilst, for some of the datasets considered, either ATUSR or USR-ATFP was found to perform marginally better than all other descriptor sets to which they were compared, most differences were statistically insignificant. Further work is warranted to determine the advantages which ATUSR and USR-ATFP might offer with respect to established descriptor sets. The first attempts to construct QSAR models for Torsades de Pointes using predicted cardiac ion channel inhibitory potencies as descriptors are presented, along with the first evaluation of experimentally determined inhibitory potencies as an alternative, or complement to, standard descriptors. No (clear) evidence was found that 'predicted' ('experimental') 'IC-descriptors' improve performance. However, their value may lie in the greater interpretability they could confer upon the models. Building upon the work presented in the preceding chapters, this thesis ends with specific proposals for future research directions.
|
7 |
Ligand-based Methods for Data Management and Modelling
Alvarsson, Jonathan January 2015
Drug discovery is a complicated and expensive process in the billion-dollar range. One way of making the drug development process more efficient is better information handling, modelling and visualisation. The majority of today's drugs are small molecules, which interact with drug targets to cause an effect. Since the 1980s large numbers of compounds have been systematically tested by robots in so-called high-throughput screening. Ligand-based drug discovery is based on modelling drug molecules. In the field known as Quantitative Structure–Activity Relationship (QSAR), molecules are described by molecular descriptors which are used for building mathematical models. Based on these models, molecular properties can be predicted, and using the molecular descriptors, molecules can be compared for, e.g., similarity. Bioclipse is a workbench for the life sciences which provides ligand-based tools through a point-and-click interface. The aims of this thesis were to research and develop new or improved ligand-based methods and open source software, and to work towards making these tools available for users through the Bioclipse workbench. To this end, a series of molecular signature studies was done and various Bioclipse plugins were developed. An introduction to the field is provided in the thesis summary, which is followed by five research papers. Paper I describes the Bioclipse 2 software and the Bioclipse scripting language. In Paper II the laboratory information system Brunn for supporting work with dose-response studies on microtiter plates is described. In Paper III the creation of a molecular fingerprint based on the molecular signature descriptor is presented and the new fingerprints are evaluated for target prediction and found to perform on par with industry-standard commercial molecular fingerprints. In Paper IV the effect of different parameter choices when using the signature fingerprint together with support vector machines (SVM) using the radial basis function (RBF) kernel is explored and reasonable default values are found. In Paper V the performance of SVM-based QSAR using large datasets with the molecular signature descriptor is studied, and a QSAR model based on 1.2 million substances is created and made available from the Bioclipse workbench.
|
8 |
Hierarchická vizualizace chemického prostoru / Hierarchical visualization of the chemical space
Velkoborský, Jakub January 2016
The purpose of this thesis was to design and implement a hierarchical approach to visualization of the chemical space. Such visualization is a challenging yet important topic used in diverse fields ranging from material engineering to drug design. Especially in drug design, modern methods of high-throughput screening generate large amounts of data that would benefit from hierarchical analysis. One possible approach to hierarchical classification of molecules is a structure-based classification based on molecular scaffolds. Scaffolds are widely used by medicinal chemists to group molecules of similar properties. A few scaffold-based hierarchical visualization methods have been proposed. However, to the best of our knowledge, there exists no tool that provides a scaffold-based hierarchical visualization of molecular data sets against the background of known chemical space. In this thesis, such a tool was created. First, a scaffold tree hierarchy based on ring topologies was designed. Next, this hierarchy was used to analyze the frequency of scaffolds extracted from molecules in the PubChem Compound database. Subsequently, the PubChem Compound scaffold frequency data was used as a background for visualization of molecular data sets. The visualization is performed by a client-server application implemented as a part of...
|
9 |
Cheminformatics for genome-scale metabolic reconstructions
May, John W. January 2015
Genome-scale metabolic reconstructions are an important resource in the study of metabolism. They provide both a system and component level view of the biochemical transformations of metabolites. As more reconstructions have been created it remains a challenge to integrate and reason about their contents. This thesis focuses on the development of computational methods to allow on-demand comparison and alignment of metabolic reconstructions. A novel method is introduced that utilises chemical structure representations to identify equivalent metabolites between reconstructions. Using a graph theoretic representation allows the identification of, and reasoning about, metabolites that have a non-exact match. A key advantage is that the method uses the contents of reconstructions directly and does not rely on the creation or use of a common reference. To annotate reconstructions with chemical structure representations an interactive desktop application is introduced. The application assists in the creation and curation of metabolic information using manual, semi-automated, and automated methods. Chemical structure representations can be retrieved, drawn, or generated to allow precise metabolite annotation. In processing chemical information, efficient and optimised algorithms are required. Several areas are addressed and implementations have been contributed to the Chemistry Development Kit. Rings are a fundamental property of chemical structures; therefore, multiple ring definitions and fast algorithms are explored. Conversion and standardisation between structure representations present a challenge. Efficient algorithms to determine aromaticity, assign a Kekulé form, and generate tautomers are detailed. Many enzymes are selective and specific to stereochemistry. Methods for the identification, depiction, comparison, and description of stereochemistry are described.
|
10 |
Molecular similarity and xenobiotic metabolism
Adams, Samuel E. January 2010
MetaPrint2D, a new software tool implementing a data-mining approach for predicting sites of xenobiotic metabolism has been developed. The algorithm is based on a statistical analysis of the occurrences of atom centred circular fingerprints in both substrates and metabolites. This approach has undergone extensive evaluation and been shown to be of comparable accuracy to current best-in-class tools, but is able to make much faster predictions, for the first time enabling chemists to explore the effects of structural modifications on a compound’s metabolism in a highly responsive and interactive manner. MetaPrint2D is able to assign a confidence score to the predictions it generates, based on the availability of relevant data and the degree to which a compound is modelled by the algorithm. In the course of the evaluation of MetaPrint2D a novel metric for assessing the performance of site of metabolism predictions has been introduced. This overcomes the bias introduced by molecule size and the number of sites of metabolism inherent to the most commonly reported metrics used to evaluate site of metabolism predictions. This data mining approach to site of metabolism prediction has been augmented by a set of reaction type definitions to produce MetaPrint2D-React, enabling prediction of the types of transformations a compound is likely to undergo and the metabolites that are formed. This approach has been evaluated against both historical data and metabolic schemes reported in a number of recently published studies. Results suggest that the ability of this method to predict metabolic transformations is highly dependent on the relevance of the training set data to the query compounds. MetaPrint2D has been released as an open source software library, and both MetaPrint2D and MetaPrint2D-React are available for chemists to use through the Unilever Centre for Molecular Science Informatics website.
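The abstract describes the algorithm only as a statistical analysis of how often atom-centred fingerprints occur in substrates and metabolites. One simple way such occurrence counts could be turned into a per-environment score is sketched below in Python. This is an assumption for illustration, not the MetaPrint2D implementation: the function name `occurrence_ratios`, the string labels standing in for fingerprints, and the ratio itself are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def occurrence_ratios(
    substrate_envs: Iterable[str],
    reaction_centre_envs: Iterable[str],
) -> Dict[str, float]:
    """Score each atom environment by how often it is a site of metabolism.

    substrate_envs: one label per atom environment seen in any substrate.
    reaction_centre_envs: one label per environment recorded as a reaction centre.
    Returns, for every environment seen in substrates, the fraction of its
    occurrences that were reaction centres (an assumed, illustrative statistic).
    """
    substrate_counts = Counter(substrate_envs)
    centre_counts = Counter(reaction_centre_envs)
    return {
        env: centre_counts[env] / count  # Counter returns 0 for unseen labels
        for env, count in substrate_counts.items()
    }


if __name__ == "__main__":
    # Hypothetical environment labels; real fingerprints would be far richer.
    substrates = ["aromatic_C", "aromatic_C", "aromatic_C", "aromatic_C",
                  "amide_N", "amide_N"]
    centres = ["aromatic_C", "amide_N", "amide_N"]
    print(occurrence_ratios(substrates, centres))
```

A lookup of precomputed counts like this is cheap at prediction time, which is consistent with the fast, interactive use the abstract emphasises, though the actual scoring used by the tool may differ.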
|