
Substructural analysis techniques for structure-property correlation within computerised chemical information systems

The work described in this thesis involves a novel method of substructural analysis, with potential application to structure-property correlation and information retrieval within computerised chemical information systems. A review is given of the development of the concept of chemical structure and its representation, its application in computerised chemical information systems, and methods for correlating structure with molecular properties. A method is presented for the derivation, by computer program, of structural features representing the whole structure from Wiswesser Line Notation (WLN). These features are then used as variables in statistical analysis procedures; in this work, multiple regression analysis and cluster analysis are used. This procedure allows a rapid, convenient and thorough analysis of large data sets. The type of structural feature used may be easily varied, allowing investigation of factors such as ring substitution patterns, group interactions, and three-dimensional structure. The method is applicable to sets of diverse or structurally related compounds. Statistical tests of the results enable quantitative testing of hypotheses. Multiple regression analysis allows a direct, quantitative correlation between structure and molecular property, and subsequent property prediction. It is applied to sets of aliphatic, alicyclic, aromatic, and heterocyclic compounds, including sets of highly diverse structures. Properties examined include biological effects, toxicity, pK, thermochemical properties, boiling point, solubility, and partition coefficient. Some of these properties are highly dependent upon electronic and steric effects, and hence upon the relative position of substituents and on three-dimensional structure. Highly significant correlations are obtained in all cases, and the potential for property prediction is demonstrated. Cluster analysis is applied to several sets of structures. Intuitively sensible classifications are obtained, and the potential for both property prediction and information retrieval is discussed. Since these techniques involve the widely used WLN, relatively simple COBOL programs, and standard statistical packages, they should be applicable within operational environments.
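
The general idea of using structural features as regression variables can be illustrated with a minimal sketch. This is not the thesis's program (which was written in COBOL and used standard statistical packages), and it does not parse WLN: the notation strings are toy inputs, the single-character "fragments" are a crude stand-in for the derived structural features, and the property values are synthetic, generated from assumed contributions so that the fit can be checked. All names (`fragment_counts`, `fit_least_squares`, `true_coefs`) are hypothetical.

```python
from collections import Counter

def fragment_counts(notation, fragments):
    """Count how often each single-character symbol occurs in the string."""
    counts = Counter(notation)
    return [counts[f] for f in fragments]

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_least_squares(rows, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)

# Toy inputs: strings of symbols, not claimed to be valid WLN.
fragments = ["Q", "Z", "R", "V"]
notations = ["QR", "ZR", "QVR", "ZV", "QQR", "RR", "QZ", "V"]

# Design matrix: an intercept column followed by the fragment counts.
rows = [[1] + fragment_counts(n, fragments) for n in notations]

# Synthetic property values built from assumed contributions
# (intercept, Q, Z, R, V); these are not measured data.
true_coefs = [1.0, 0.5, -0.25, 2.0, -1.0]
y = [sum(c * x for c, x in zip(true_coefs, row)) for row in rows]

# The regression should recover the assumed contributions.
coefs = fit_least_squares(rows, y)

# Predict the property of an unseen toy structure from its fragment counts.
new_row = [1] + fragment_counts("ZVR", fragments)
predicted = sum(c * x for c, x in zip(coefs, new_row))
```

Because the synthetic values are exactly linear in the fragment counts, the fitted coefficients match the assumed ones up to rounding. With real property data the fit would be approximate, and its significance would have to be assessed with statistical tests of the kind the abstract mentions.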

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:449232
Date: January 1978
Creators: Bawden, David
Publisher: University of Sheffield
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: http://etheses.whiterose.ac.uk/3038/
