Global ETD Search

1	Ligand-based Methods for Data Management and Modelling. Alvarsson, Jonathan. January 2015.
Drug discovery is a complicated and expensive process in the billion dollar range. One way of making the drug development process more efficient is better information handling, modelling and visualisation. The majority of today's drugs are small molecules, which interact with drug targets to cause an effect. Since the 1980s, large numbers of compounds have been systematically tested by robots in so-called high-throughput screening. Ligand-based drug discovery is based on modelling drug molecules. In the field known as Quantitative Structure–Activity Relationship (QSAR), molecules are described by molecular descriptors, which are used for building mathematical models. Based on these models, molecular properties can be predicted, and using the molecular descriptors, molecules can be compared for, e.g., similarity. Bioclipse is a workbench for the life sciences which provides ligand-based tools through a point-and-click interface. The aims of this thesis were to research and develop new or improved ligand-based methods and open source software, and to work towards making these tools available to users through the Bioclipse workbench. To this end, a series of molecular signature studies was done and various Bioclipse plugins were developed. An introduction to the field is provided in the thesis summary, which is followed by five research papers. Paper I describes the Bioclipse 2 software and the Bioclipse scripting language. Paper II describes Brunn, a laboratory information system for supporting work with dose-response studies on microtiter plates. Paper III presents the creation of a molecular fingerprint based on the molecular signature descriptor; the new fingerprints are evaluated for target prediction and found to perform on par with industry-standard commercial molecular fingerprints.
Paper IV explores the effect of different parameter choices when using the signature fingerprint together with support vector machines (SVM) with the radial basis function (RBF) kernel, and finds reasonable default values. Paper V studies the performance of SVM-based QSAR on large datasets with the molecular signature descriptor; a QSAR model based on 1.2 million substances is created and made available from the Bioclipse workbench.
Keywords: QSAR; ligand-based drug discovery; Bioclipse; information system; cheminformatics; bioinformatics
2	SWI-Prolog as a Semantic Web Tool for semantic querying in Bioclipse: Integration and performance benchmarking. Lampa, Samuel. January 2010.
The huge amounts of data produced by high-throughput techniques in the life sciences, and the need to integrate heterogeneous data from disparate sources in new fields such as Systems Biology and translational drug development, require better approaches to data integration. The semantic web is anticipated to provide solutions through new formats for knowledge representation and management. Software libraries for semantic web formats are becoming mature, but there exist multiple tools based on fundamentally different technologies. SWI-Prolog, a tool with semantic web support, was integrated into the Bioclipse bio- and cheminformatics workbench and evaluated in terms of performance against the non-Prolog-based semantic web tools in Bioclipse, Jena and Pellet, for querying a data set consisting mostly of numerical NMR shift values in the semantic web format RDF. The integration has given access to the convenience of the Prolog language for working with semantic data and defining data management workflows in Bioclipse. The comparison shows that SWI-Prolog outperforms Jena and Pellet on this specific dataset, and suggests that Prolog-based tools merit further evaluation.
Keywords: Semantic Web; Prolog; Bioclipse; RDF; SPARQL; NMR shift; Eclipse; Java; Bioinformatics
3	Bioclipse: Integration of Data and Software in the Life Sciences. Spjuth, Ola. January 2009.
New high-throughput experimental techniques have turned the life sciences into a data-intensive field. Scientists are faced with new types of problems, such as managing voluminous sources of information, integrating heterogeneous data, and applying the proper analysis algorithms, all in order to reach reliable conclusions. These challenges call for an infrastructure of algorithms and technologies to supply researchers with the tools and methods necessary to maximize the usefulness of the data. eScience has emerged as a promising technology to take on these challenges, and denotes integrated science carried out in highly distributed network environments, or science that makes use of large data sets and requires high-performance computing resources. In this thesis I present standards, exchange formats, algorithms, and software implementations for empowering researchers in the life sciences with the tools of eScience. The work is centered around Bioclipse, an extensible workbench developed within the framework of this thesis, which provides users with instruments for carrying out integrated research and hides technical details behind simple graphical interfaces. Bioclipse is a Rich Client that takes full advantage of the many offerings of eScience, such as networked databases and online services. The benefits of mixing local and remote software in a unifying platform are demonstrated with an integrated approach for predicting metabolic sites in chemical structures. To overcome the limitations of the commonly used technologies for interacting with networked services, I also present a new technology using the XMPP protocol. This enables service discovery and asynchronous communication between the client and server, which is ideal for long-running analyses.
To maximize the usefulness of the available data, there is a need for standards, ontologies, and exchange formats that define what information should be captured and how it should be structured and exchanged. A novel format for exchanging QSAR data sets in a fully interoperable and reproducible form is presented, together with an implementation in Bioclipse that takes advantage of eScience components during the setup process. Bioclipse has been well received by the scientific community, has attracted a large group of international users and developers, and has been awarded three international prizes for its innovative character. With continued development, the project has a good chance of becoming an important component in a sustainable infrastructure for the life sciences.
Keywords: Bioclipse; integration; life sciences; bioinformatics; cheminformatics; chemoinformatics; Eclipse; rich client; XMPP; QSAR-ML; web service; standard; ontology
