1.
Transformer-based Model for Molecular Property Prediction with Self-Supervised Transfer Learning. Lin, Lyu. January 2020.
Molecular property prediction has a vast range of applications in the chemical industry, and a powerful molecular property prediction model can accelerate both experiments and production processes. The idea behind this degree project is to use transfer learning to predict molecular properties. The project is divided into two parts. The first part builds and pre-trains the model. The model, constructed purely from attention-based Transformer layers, is pre-trained on a Masked Edge Recovery task with large-scale unlabeled data. The performance of the pre-trained model is then tested on different molecular property prediction tasks to verify the effectiveness of transfer learning. The results show that, after self-supervised pre-training, the model exhibits excellent generalization capability: it can be fine-tuned in a short time and performs well on downstream tasks. The effectiveness of transfer learning is also reflected in the experiments. The pre-trained model not only shortens the task-specific training time but also achieves better performance and avoids the overfitting caused by the scarcity of labeled training data for molecular property prediction.
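The abstract does not give implementation details, but the pre-training objective can be illustrated with a small sketch. Below is one plausible form of a Masked Edge Recovery task in PyTorch: a Transformer encoder embeds atom tokens and a pairwise head predicts the bond type of masked edges. The module names, layer sizes, atom/bond vocabularies and toy tensors are all assumptions made for illustration, not the thesis's actual model.

```python
# A minimal, hypothetical sketch of Masked Edge Recovery pre-training with a
# Transformer encoder (PyTorch). Sizes, vocabularies and the pairwise
# edge-classification head are assumptions; the thesis does not specify them.
import torch
import torch.nn as nn

NUM_ATOM_TYPES = 120   # assumed atom-type vocabulary (plus padding)
NUM_BOND_TYPES = 5     # assumed: none, single, double, triple, aromatic
D_MODEL = 128

class MaskedEdgeRecoveryModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.atom_emb = nn.Embedding(NUM_ATOM_TYPES, D_MODEL)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=D_MODEL, nhead=4, dim_feedforward=256, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=4)
        # Pairwise head: concatenated atom embeddings -> bond-type logits.
        self.edge_head = nn.Linear(2 * D_MODEL, NUM_BOND_TYPES)

    def forward(self, atom_ids, edge_index):
        # atom_ids: (batch, n_atoms); edge_index: (batch, n_edges, 2)
        h = self.encoder(self.atom_emb(atom_ids))             # (batch, n_atoms, d)
        src = torch.gather(h, 1, edge_index[..., 0:1].expand(-1, -1, D_MODEL))
        dst = torch.gather(h, 1, edge_index[..., 1:2].expand(-1, -1, D_MODEL))
        return self.edge_head(torch.cat([src, dst], dim=-1))  # bond-type logits

# Toy pre-training step on random tensors, standing in for masked molecular graphs.
model = MaskedEdgeRecoveryModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
atoms = torch.randint(1, NUM_ATOM_TYPES, (8, 30))    # 8 molecules, 30 atoms each
edges = torch.randint(0, 30, (8, 10, 2))             # 10 masked edges per molecule
targets = torch.randint(0, NUM_BOND_TYPES, (8, 10))  # true bond types of masked edges
loss = nn.functional.cross_entropy(
    model(atoms, edges).reshape(-1, NUM_BOND_TYPES), targets.reshape(-1))
loss.backward()
opt.step()
```

In an actual pre-training run the random tensors would be replaced by atom sequences and masked edges extracted from unlabeled molecular graphs, and the encoder weights would later be reused for the downstream property prediction tasks.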
2.
Sublimation temperature prediction of OLED materials: using machine learning. Norinder, Niklas. January 2023.
Organic light-emitting diodes (OLEDs) have long been regarded as the future of display technology. Looking back, display technology has moved from cathode-ray tube (CRT) displays to liquid crystal displays (LCDs). Whereas CRT displays were bulky and had fairly high power consumption, LCDs were thinner, lighter and consumed less energy. This technological shift made it possible to create smaller and more portable screens, aiding the development of personal electronics. Currently, however, LCDs' place at the top of the display hierarchy is being challenged by OLED displays, which provide higher pixel density and overall higher performance. OLED displays consist of thin layers of organic semiconductors and are instrumental in the development of folding displays, small displays for virtual and augmented reality applications, and energy-efficient displays. In the manufacture of OLED displays, the organic semiconducting material is vaporized and adhered to a thin film through vapor deposition techniques. One way to aid the development of organic electroluminescent (OEL) materials and OLEDs is through in silico analysis of sublimation temperatures using machine learning. This master's thesis addresses that problem, aiming to create a deeper understanding of OEL materials through sublimation temperature prediction using ensemble learning (light gradient-boosting machine) and deep learning (convolutional neural network) methods. Analysis of experimental OEL data shows that the sublimation temperatures of OLED materials can be predicted with machine learning regression using molecular descriptors, with an R² score of ~0.86, a Mean Absolute Error of ~13 °C, a Mean Absolute Percentage Error of ~3.1%, and a Normalized Mean Absolute Error of ~0.56.
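As a rough illustration of descriptor-based regression of this kind, the sketch below computes a handful of RDKit descriptors from SMILES strings and fits a LightGBM regressor, reporting MAE and R². The descriptor set, hyperparameters and the toy SMILES/temperature pairs are assumptions; the thesis's experimental OEL dataset and full feature set are not reproduced here.

```python
# A minimal, hypothetical sketch of descriptor-based sublimation-temperature
# regression with LightGBM. Descriptors, hyperparameters and data are placeholders.
from rdkit import Chem
from rdkit.Chem import Descriptors
from lightgbm import LGBMRegressor
from sklearn.metrics import mean_absolute_error, r2_score

def featurize(smiles):
    """Compute a few RDKit descriptors for one molecule given as SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    return [
        Descriptors.MolWt(mol),
        Descriptors.MolLogP(mol),
        Descriptors.TPSA(mol),
        Descriptors.NumRotatableBonds(mol),
    ]

# Placeholder data: a real model would be trained on experimental sublimation
# temperatures of OEL materials, not these made-up values.
smiles = ["c1ccccc1", "c1ccc2ccccc2c1", "c1ccc2c(c1)ccc1ccccc12", "CCO", "CCCCCC"]
temps_c = [80.0, 150.0, 210.0, 40.0, 60.0]  # hypothetical targets in degrees C

X = [featurize(s) for s in smiles]
model = LGBMRegressor(n_estimators=200, learning_rate=0.05, min_child_samples=2)
model.fit(X, temps_c)

pred = model.predict(X)
print("MAE:", mean_absolute_error(temps_c, pred), "R2:", r2_score(temps_c, pred))
```

With real data the evaluation would of course be done on a held-out test split rather than on the training molecules as in this toy example.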
3.
Pre-training Molecular Transformers Through Reaction Prediction. Broberg, Johan. January 2022.
Molecular property prediction can improve many processes in the molecular chemistry industry. One important application is the development of new drugs, where molecular property prediction can decrease both the cost and the time of finding new drugs. The current trend is to use graph neural networks or transformers, which tend to need moderate and large amounts of data, respectively, to perform well. Because molecular property data are scarce, it is of great interest to find an effective method to transfer learning from other, more data-abundant problems. In this thesis I present an approach that pre-trains transformer encoders on reaction prediction in order to improve performance on downstream molecular property prediction tasks. I have built a model based on the full transformer architecture but modified it for the purpose of pre-training the encoder. Model performance, and specifically the effect of pre-training, is tested by predicting lipophilicity, HIV inhibition and hERG channel blocking using both pre-trained models and models without any pre-training. The results show a tendency towards improved performance on all molecular property prediction tasks with the suggested pre-training, but this tendency is not statistically significant. The main limitation of the evaluation is the small number of simulations that could be run due to computational constraints.
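As one way to picture the fine-tuning stage described above, the sketch below reuses a Transformer encoder (standing in for one pre-trained on reaction prediction) and trains a small regression head on a downstream property such as lipophilicity. The tokenization, sizes, checkpoint name and toy data are assumptions made for illustration; the reaction-prediction pre-training itself is not reproduced.

```python
# A minimal, hypothetical sketch of fine-tuning a pre-trained Transformer
# encoder on a molecular property (PyTorch). All names and sizes are assumed.
import torch
import torch.nn as nn

VOCAB_SIZE = 600   # assumed SMILES-token vocabulary
D_MODEL = 256

class PropertyRegressor(nn.Module):
    def __init__(self, encoder):
        super().__init__()
        self.token_emb = nn.Embedding(VOCAB_SIZE, D_MODEL)
        self.encoder = encoder             # ideally pre-trained on reaction prediction
        self.head = nn.Linear(D_MODEL, 1)  # regression head, e.g. lipophilicity

    def forward(self, token_ids):
        h = self.encoder(self.token_emb(token_ids))   # (batch, seq, d_model)
        return self.head(h.mean(dim=1)).squeeze(-1)   # mean-pool, then regress

# In practice the encoder weights would be loaded from the pre-training run,
# e.g. encoder.load_state_dict(torch.load("pretrained_encoder.pt")) -- a
# hypothetical checkpoint name, used here only for illustration.
layer = nn.TransformerEncoderLayer(D_MODEL, nhead=8, dim_feedforward=512,
                                   batch_first=True)
pretrained_encoder = nn.TransformerEncoder(layer, num_layers=6)

model = PropertyRegressor(pretrained_encoder)
opt = torch.optim.Adam(model.parameters(), lr=1e-5)  # small LR for fine-tuning

tokens = torch.randint(1, VOCAB_SIZE, (16, 64))  # toy batch of tokenized SMILES
logp = torch.randn(16)                           # toy lipophilicity targets
loss = nn.functional.mse_loss(model(tokens), logp)
loss.backward()
opt.step()
```

The same encoder could be reused with a classification head for the HIV inhibition and hERG channel blocking tasks; only the head and the loss function would change.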