Crystallization properties of molecular materials : prediction and rule extraction by machine learning

Crystallization is an increasingly important process in a variety of applications, from drug development to single-crystal X-ray diffraction structure determination. However, while there is a good deal of research into the prediction of molecular crystal structure, the factors that make a molecule crystallizable have so far remained poorly understood. The aim of this project was to answer the seemingly straightforward question: can we predict how easily a molecule will crystallize? The Cambridge Structural Database contains almost a million examples of materials from the scientific literature that have crystallized. Models for predicting the crystallization propensity of organic molecular materials were developed by training machine learning algorithms on carefully curated sets of molecules, extracted from a database of commercially available molecules, which are either observed or not observed to crystallize. The models were validated computationally and experimentally, while feature extraction methods and high-resolution powder diffraction studies were used to understand the molecular and structural features that determine the ease of crystallization. This led to the development of a new molecular descriptor which encodes information about the conformational flexibility of a molecule. The best models gave error rates of less than 5% for both cross-validation data and previously unseen test data, demonstrating that crystallization propensity can be predicted with a high degree of accuracy. Molecular size, flexibility and nitrogen atom environments were found to be the most influential factors in determining the ease of crystallization, while microstructural features determined by powder diffraction showed almost no correlation with the model predictions. Further predictions on co-crystals show scope for extending the methodology to other relevant applications.
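As a rough illustration of what a cross-validated error rate for a "crystallizes / does not crystallize" classifier means, the following minimal sketch may help. It is not the thesis's method: the nearest-centroid classifier is a simple stand-in for the machine learning algorithms mentioned in the abstract, the two features (a size-like and a flexibility-like number) and all numeric values are invented toy data, and every function name is hypothetical.

```python
import random


def make_toy_data(n_per_class, seed=0):
    """Generate invented two-feature data (size-like, flexibility-like).

    Label 1 = "observed to crystallize", label 0 = "not observed".
    The means and spreads are arbitrary assumptions, not real chemistry.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_per_class):
        X.append([rng.gauss(20.0, 4.0), rng.gauss(2.0, 1.5)])
        y.append(1)
        X.append([rng.gauss(35.0, 4.0), rng.gauss(8.0, 1.5)])
        y.append(0)
    return X, y


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def predict(x, centroids):
    """Return the label whose centroid is nearest to x (squared Euclidean)."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def cv_error_rate(X, y, k=5, seed=0):
    """Fraction of samples misclassified when each fold is held out once."""
    errors = 0
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        # Fit on the training folds only: one centroid per class label.
        centroids = {
            lab: centroid([X[i] for i in train if y[i] == lab])
            for lab in {y[i] for i in train}
        }
        errors += sum(predict(X[i], centroids) != y[i] for i in fold)
    return errors / len(X)


X, y = make_toy_data(50, seed=1)
print(f"5-fold cross-validation error rate: {cv_error_rate(X, y):.3f}")
```

The point of holding out each fold in turn is that every molecule is scored by a model that never saw it during fitting, which is the sense in which a cross-validation error rate estimates performance on new data.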

Identifier: oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:740849
Date: January 2017
Creators: Wicker, Jerome
Contributors: Cooper, Richard ; David, Bill
Publisher: University of Oxford
Source Sets: Ethos UK
Detected Language: English
Type: Electronic Thesis or Dissertation
Source: https://ora.ox.ac.uk/objects/uuid:34beef4e-e499-4248-8fa6-7e8d8344f02c