Return to search

Machine Learning Models in Fullerene/Metallofullerene Chromatography Studies

Machine learning methods are now extensively applied in various scientific research areas to make models. Unlike regular models, machine learning based models use a data-driven approach. Machine learning algorithms can learn knowledge that are hard to be recognized, from available data. The data-driven approaches enhance the role of algorithms and computers and then accelerate the computation using alternative views. In this thesis, we explore the possibility of applying machine learning models in the prediction of chromatographic retention behaviors. Chromatographic separation is a key technique for the discovery and analysis of fullerenes. In previous studies, differential equation models have achieved great success in predictions of chromatographic retentions. However, most of the differential equation models require experimental measurements or theoretical computations for many parameters, which are not easy to obtain. Fullerenes/metallofullerenes are rigid and spherical molecules with only carbon atoms, which makes the predictions of chromatographic retention behaviors as well as other properties much simpler than other flexible molecules that have more variations on conformations. In this thesis, I propose the polarizability of a fullerene molecule is able to be estimated directly from the structures. Structural motifs are used to simplify the model and the models with motifs provide satisfying predictions. The data set contains 31947 isomers and their polarizability data and is split into a training set with 90% data points and a complementary testing set. In addition, a second testing set of large fullerene isomers is also prepared and it is used to testing whether a model can be trained by small fullerenes and then gives ideal predictions on large fullerenes. / Machine learning models are capable to be applied in a wide range of areas, such as scientific research. In this thesis, machine learning models are applied to predict chromatography behaviors of fullerenes based on the molecular structures. Chromatography is a common technique for mixture separations, and the separation is because of the difference of interactions between molecules and a stationary phase. In real experiments, a mixture usually contains a large family of different compounds and it requires lots of work and resources to figure out the target compound. Therefore, models are extremely import for studies of chromatography. Traditional models are built based on physics rules, and involves several parameters. The physics parameters are measured by experiments or theoretically computed. However, both of them are time consuming and not easy to be conducted. For fullerenes, in my previous studies, it has been shown that the chromatography model can be simplified and only one parameter, polarizability, is required. A machine learning approach is introduced to enhance the model by predicting the molecular polarizabilities of fullerenes based on structures. The structure of a fullerene is represented by several local structures. Several types of machine learning models are built and tested on our data set and the result shows neural network gives the best predictions.

Identiferoai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/93737
Date08 August 2019
CreatorsLiu, Xiaoyang
ContributorsComputer Science and Application, Cao, Young, Dorn, Harry C., Heath, Lenwood S.
PublisherVirginia Tech
Source SetsVirginia Tech Theses and Dissertation
Languageen_US
Detected LanguageEnglish
TypeThesis
FormatETD, application/pdf
RightsCreative Commons Attribution-NonCommercial-NoDerivatives 4.0 International, http://creativecommons.org/licenses/by-nc-nd/4.0/

Page generated in 0.0013 seconds