Global ETD Search

1	Development and Application of Machine Learning Methods to Selected Problems of Theoretical Solid State Physics
Hoock, Benedikt Andreas
16 August 2022

In recent years, machine learning methods have proven to be a useful tool for predicting simulated material properties. They can replace costly calculations based on density functional theory, provide a better understanding of known materials, or even help to discover new ones. An essential role is played by the descriptor, a set of material parameters that should be as interpretable as possible. This PhD thesis presents an approach for finding descriptors for periodic multi-component systems whose properties are also influenced by the exact atomic configuration. Primary features of one-atom, two-atom, and tetrahedron clusters are averaged over the supercell and then combined by simple algebraic operations. Compressed sensing is used to identify a suitable descriptor from among all candidate features. Furthermore, we develop cross-validation-based model selection strategies that may lead to more robust and ideally better-generalizing descriptors. Additionally, we study several error measures that characterize the quality of the descriptors with respect to accuracy, complexity of their formulas, and the capturing of configuration effects. This general methodology was implemented in a partially parallelized Python program.

As concrete learning tasks, models for the lattice constant and the energy of mixing of group-IV ternary compounds in the zincblende structure are learned, reaching accuracies of 0.02 Å and 0.02 eV, respectively. We explain the practical steps of data acquisition, analysis, and cleaning for both the target properties and the primary features, and continue with extensive analyses and the parametrization of the methodology on this test case. As an additional application, we predict lattice constants and band gaps of octet binary compounds. The presented descriptors are assessed quantitatively with the error measures and, finally, their physical meaning is discussed.

Keywords: Machine Learning; LASSO; SISSO; lattice constant; energy of mixing; group-IV ternary compounds; symbolic regression; descriptor; computational materials science; 530 Physics (ddc:530)
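
The descriptor search outlined in this abstract (combine primary features algebraically, then pick the candidate that best models the target) can be illustrated with a minimal sketch. It is not the thesis software: the compressed-sensing step (LASSO/SISSO) is replaced here by an exhaustive search for the single best one-dimensional descriptor under a least-squares line fit, and the feature names `r_A` and `r_B`, the toy values, and all function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def build_candidates(primary):
    """Combine primary features pairwise by simple algebraic operations.

    `primary` maps a feature name to a list with one value per material.
    """
    candidates = dict(primary)  # primary features are candidates themselves
    for (na, a), (nb, b) in combinations(primary.items(), 2):
        candidates[f"({na}+{nb})"] = [x + y for x, y in zip(a, b)]
        candidates[f"({na}-{nb})"] = [x - y for x, y in zip(a, b)]
        candidates[f"({na}*{nb})"] = [x * y for x, y in zip(a, b)]
        if all(y != 0 for y in b):  # skip ratios that would divide by zero
            candidates[f"({na}/{nb})"] = [x / y for x, y in zip(a, b)]
    return candidates


def fit_line(x, y):
    """Least-squares fit y ~ slope * x + intercept.

    Returns (slope, intercept, rmse), or None for a constant feature.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        return None
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [slope * xi + intercept - yi for xi, yi in zip(x, y)]
    rmse = sqrt(sum(r * r for r in residuals) / n)
    return slope, intercept, rmse


def select_descriptor(candidates, target):
    """Return (name, (slope, intercept, rmse)) of the lowest-RMSE candidate."""
    best = None
    for name, values in candidates.items():
        fit = fit_line(values, target)
        if fit is None:
            continue
        if best is None or fit[2] < best[1][2]:
            best = (name, fit)
    return best


# Toy data (hypothetical): the target is built as 0.5 * r_A * r_B + 1.
primary = {"r_A": [1.0, 2.0, 3.0, 4.0, 5.0], "r_B": [2.0, 1.0, 4.0, 3.0, 1.0]}
target = [0.5 * a * b + 1.0 for a, b in zip(primary["r_A"], primary["r_B"])]
name, (slope, intercept, rmse) = select_descriptor(build_candidates(primary), target)
print(name, slope, intercept, rmse)
```

An exhaustive search like this only scales to small candidate sets. Sparse regression methods such as those named in the keywords are designed for the much larger feature spaces that repeated algebraic combination produces.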
2	Multi-fidelity Machine Learning for Perovskite Band Gap Predictions
Panayotis Thalis Manganaris (16384500)
16 June 2023

A wide range of optoelectronic applications demand semiconductors optimized for purpose. My research focused on data-driven identification of ABX3 halide perovskite compositions for optimum photovoltaic absorption in solar cells. I trained machine learning models on previously reported datasets of halide perovskite band gaps based on first-principles computations performed at different fidelities. Using these models, I identified mixtures of candidate constituents at the A, B, or X sites of the perovskite supercell. These mixtures exploit the way mixed perovskite band gaps deviate from the linear interpolation predicted by Vegard's law of mixing, yielding a selection of stable perovskites with band gaps in the ideal range of 1 to 2 eV for absorption of the visible light spectrum. The models predict the perovskite band gap using the composition and inherent elemental properties as descriptors. This enables accurate, high-fidelity prediction and screening of the much larger chemical space from which the data samples were drawn.

I used a recently published density functional theory (DFT) dataset of more than 1300 perovskite band gaps from four different levels of theory, combined with an experimental perovskite band gap dataset of about 100 points, to train random forest regression (RFR), Gaussian process regression (GPR), and Sure Independence Screening and Sparsifying Operator (SISSO) regression models, with data fidelity added as one-hot encoded features. I found that RFR yields the best model, with a band gap root mean square error of 0.12 eV on the total dataset and 0.15 eV on the experimental points. SISSO provided compound features and functions for direct prediction of the band gap, but its errors were larger than those of RFR and GPR. Additional insights from Pearson correlation and Shapley additive explanation (SHAP) analysis of the learned descriptors suggest that the RFR models performed best because of (a) their focus on identifying and capturing relevant feature interactions and (b) their flexibility to represent nonlinear relationships between such interactions and the band gap. The best model was deployed to predict the experimental band gap of 37785 hypothetical compounds. Based on this, I identified 1251 stable compounds with band gaps predicted to lie between 1 and 2 eV at experimental accuracy, narrowing the candidates to about 3% of the screened compositions.

Keywords: Compound semiconductors; Organic semiconductors; Data engineering and data science; halide perovskites; band gap; feature extraction and representation; SISSO; random forest regression analysis; Gaussian Process Regression Analysis; SHapley Additive exPlanations (SHAP); combinatorial datasets; data augmentation method; Lead Free Perovskite Solar Cells; Density Functional Theory (DFT); multi-fidelity data; multi-task learning (MTL)
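
Two ideas from this abstract lend themselves to a short illustration: appending data fidelity to the composition features as a one-hot vector, and the Vegard's-law linear interpolation that serves as the baseline from which mixed band gaps deviate. The sketch below is an assumption-laden toy, not the author's pipeline. The fidelity labels are hypothetical placeholders (the abstract only states that there are four levels of theory plus experiment), and the function names and numeric values are invented for illustration.

```python
# Hypothetical fidelity labels: four levels of theory plus experiment.
FIDELITIES = ["theory_1", "theory_2", "theory_3", "theory_4", "experiment"]


def one_hot(label, categories=FIDELITIES):
    """Encode a fidelity label as a one-hot vector over `categories`."""
    if label not in categories:
        raise ValueError(f"unknown fidelity label: {label!r}")
    return [1.0 if label == c else 0.0 for c in categories]


def feature_row(composition, fidelity):
    """Concatenate composition fractions with the one-hot fidelity vector."""
    return list(composition) + one_hot(fidelity)


def vegard_band_gap(fractions, end_member_gaps):
    """Linear interpolation of end-member band gaps: sum_i x_i * E_i."""
    if len(fractions) != len(end_member_gaps):
        raise ValueError("fractions and end_member_gaps must have equal length")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("mixing fractions must sum to 1")
    return sum(x * e for x, e in zip(fractions, end_member_gaps))


def in_target_window(gap_ev, low=1.0, high=2.0):
    """True if a band gap (in eV) lies in the 1 to 2 eV screening window."""
    return low <= gap_ev <= high


# Toy usage with invented numbers: a 25/75 mixture labeled as experimental.
row = feature_row([0.25, 0.75], "experiment")
baseline = vegard_band_gap([0.25, 0.75], [1.2, 2.4])
print(row, baseline, in_target_window(baseline))
```

With the fidelity encoded this way, a single regressor can be trained on all data sources at once and then queried with the experimental flag set, which is the prediction mode the abstract describes for screening.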
