  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
91

Generative Adversarial Networks for Image-to-Image Translation on Street View and MR Images

Karlsson, Simon, Welander, Per January 2018
Generative Adversarial Networks (GANs) are a deep learning method developed for synthesizing data. One application is image-to-image translation, which could prove valuable when training deep neural networks for image classification tasks. Two areas where deep learning methods are used are automotive vision systems and medical imaging. Automotive vision systems are expected to handle a broad range of scenarios, which demands highly diverse training data. The scenarios in the medical field are fewer, but there the problem is that collecting training data is difficult, time-consuming and expensive. This thesis evaluates different GAN models by comparing synthetic MR images produced by the models against ground truth images. A perceptual study is also performed by an expert in the field. The study shows that the implemented GAN models can synthesize visually realistic MR images. It is also shown that models producing more visually realistic synthetic images do not necessarily achieve better results in quantitative error measurements when compared to ground truth data. Along with the investigations on medical images, the thesis explores the possibilities of generating synthetic street view images of different resolutions and under different light and weather conditions. Different GAN models have been compared, implemented with our own adjustments, and evaluated. The results show that it is possible to create visually realistic images for different translations and image resolutions.
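The abstract does not say which quantitative error measurements were used. As a minimal sketch, assuming mean absolute error and peak signal-to-noise ratio (two common choices, picked here only for illustration), a synthetic image could be compared with its ground truth as follows. The tiny 2 x 2 "images" and the function names are hypothetical.

```python
import math

def mean_absolute_error(synthetic, truth):
    """Mean absolute pixel difference between two equally sized 2D images."""
    diffs = [abs(s - t)
             for row_s, row_t in zip(synthetic, truth)
             for s, t in zip(row_s, row_t)]
    return sum(diffs) / len(diffs)

def psnr(synthetic, truth, max_value=1.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the ground truth."""
    squared = [(s - t) ** 2
               for row_s, row_t in zip(synthetic, truth)
               for s, t in zip(row_s, row_t)]
    mse = sum(squared) / len(squared)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)

# Hypothetical pixel intensities in [0, 1], not data from the thesis.
truth = [[0.0, 0.5], [0.5, 1.0]]
synthetic = [[0.1, 0.5], [0.4, 1.0]]
print(mean_absolute_error(synthetic, truth))
print(psnr(synthetic, truth))
```

Pixel-wise scores like these only measure distance to one reference image, which is one reason they can disagree with a perceptual study such as the one the abstract describes.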
92

On the Keyword Extraction and Bias Analysis, Graph-based Exploration and Data Augmentation for Abusive Language Detection in Low-Resource Settings

Peña Sarracén, Gretel Liz de la 07 April 2024
Thesis by compendium of publications / [EN] Abusive language detection is a task that has become increasingly important in the modern digital age, where communication takes place via various online platforms. The increase in online interactions has led to an increase in the occurrence of abusive language. Addressing such content is crucial to maintaining a safe and inclusive online environment. However, this task faces several challenges that make it a complex and ongoing area of research and development. In particular, detecting abusive language in environments with sparse data poses an additional challenge, since the development of accurate automated systems often requires large annotated datasets. In this thesis we investigate different aspects of abusive language detection, paying particular attention to environments with limited data. First, we study the bias toward abusive keywords in models trained for abusive language detection. To this end, we propose two methods for extracting potentially abusive keywords from datasets. We then evaluate the bias toward the extracted keywords and how this bias can be modified in order to influence abusive language detection performance. The analysis and conclusions of this work reveal evidence that it is possible to mitigate the bias and that such a reduction can positively affect the performance of the models. However, we notice that it is not possible to establish a similar correspondence between bias mitigation and model performance in low-resource settings with the studied bias mitigation techniques. Second, we investigate the use of models based on graph neural networks to detect abusive language.
On the one hand, we propose a text representation framework designed with the aim of obtaining a representation space in which abusive texts can be easily distinguished from other texts. On the other hand, we evaluate the ability of models based on convolutional graph neural networks to classify abusive texts. The next part of our research focuses on analyzing how data augmentation can influence the performance of abusive language detection. To this end, we investigate two well-known techniques based on the principle of vicinal risk minimization and propose a variant for one of them. In addition, we evaluate simple techniques based on the operations of synonym replacement, random insertion, random swap, and random deletion. The contributions of this thesis highlight the potential of models based on graph neural networks and data augmentation techniques to improve abusive language detection, especially in low-resource settings. These contributions have been published in several international conferences and journals. / This research work was partially funded by the Spanish Ministry of Science and Innovation under the research project MISMIS-FAKEnHATE on Misinformation and Miscommunication in social media: FAKE news and HATE speech (PGC2018-096212-B-C31). The authors also thank the EU-FEDER Comunitat Valenciana 2014-2020 grant IDIFEDER/2018/025. This work was done in the framework of the FairTransNLP research project on Fairness and Transparency for equitable NLP applications in social media (PID2021-124361OB-C31), funded by MCIN/AEI/10.13039/501100011033 and by ERDF, EU "A way of making Europe". Part of the work presented in this article was performed during the first author’s research visit to the University of Mannheim, supported through a Contact Fellowship awarded by the DAAD scholarship program “STIBET Doktoranden”. / Peña Sarracén, GLDL. (2024).
On the Keyword Extraction and Bias Analysis, Graph-based Exploration and Data Augmentation for Abusive Language Detection in Low-Resource Settings [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/203266 / Compendium
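Two of the simple augmentation operations named in the abstract, random swap and random deletion, can be sketched as below. This is an illustrative version under stated assumptions, not the thesis implementation: whitespace tokenization, the parameter names `n_swaps` and `p`, and the rule of always keeping at least one word are choices made here.

```python
import random

def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen positions n_swaps times; returns a new list."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_deletion(words, p, rng):
    """Drop each word with probability p, always keeping at least one word."""
    if not words:
        return []
    kept = [w for w in words if rng.random() >= p]
    return kept if kept else [rng.choice(words)]

# Hypothetical example sentence; a seeded generator keeps runs repeatable.
rng = random.Random(0)
sentence = "this comment is a harmless example".split()
print(random_swap(sentence, 1, rng))
print(random_deletion(sentence, 0.2, rng))
```

Both operations keep the original label, so each augmented sentence can be added to the training set as an extra example of the same class.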
93

Sur la génération d'exemples pour réduire le coût d'annotation

Piedboeuf, Frédéric 03 1900
Modern machine learning often relies on the use of massive datasets, but there are many contexts where acquiring and handling large data is not feasible, making the development of techniques for learning with small data essential. In this thesis, we investigate how to reduce the amount of data required through two learning paradigms: data augmentation and membership query synthesis. The thesis is organized into four parts, each demonstrating a new aspect of generating examples to reduce annotation costs.
The first part examines data augmentation for English text, allowing us to make an objective comparison of techniques and develop new algorithms. The second one then explores data augmentation in languages other than English, and the third focuses on the task of keyword generation in French. Finally, the last part delves into membership query synthesis, where generated examples are annotated, in contrast to data augmentation, which produces examples without additional annotation costs. We show that this technique leads to better performance, especially when the dataset is large and data augmentation is often ineffective.
94

Enhancing Fairness in Facial Recognition: Balancing Datasets and Leveraging AI-Generated Imagery for Bias Mitigation : A Study on Mitigating Ethnic and Gender Bias in Public Surveillance Systems

Abbas, Rashad, Tesfagiorgish, William Issac January 2024
Facial recognition technology has become a ubiquitous tool in security and personal identification. However, the rise of this technology has been accompanied by concerns over inherent biases, particularly regarding ethnicity and gender. This thesis examines the extent of these biases by focusing on the influence of dataset imbalances in facial recognition algorithms. We employ a structured methodological approach that integrates AI-generated images to enhance dataset diversity, with the intent to balance representation across ethnicities and genders. Using the ResNet and VGG models, we conducted a series of controlled experiments that compare the performance impacts of balanced versus imbalanced datasets. Our analysis includes the use of confusion matrices and accuracy, precision, recall and F1-score metrics to critically assess the models' performance. The results demonstrate how tailored augmentation of training datasets can mitigate bias, leading to more equitable outcomes in facial recognition technology. We present our findings with the aim of contributing to the ongoing dialogue regarding AI fairness and propose a framework for future research in the field.
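The metrics listed in the abstract all derive from the confusion matrix. The sketch below shows the standard binary definitions; the labels are hypothetical, and the convention of returning 0.0 when a denominator is zero is an assumption made here, not something the thesis states.

```python
def binary_metrics(y_true, y_pred):
    """Confusion-matrix counts and derived metrics for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "confusion": {"tp": tp, "tn": tn, "fp": fp, "fn": fn},
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical labels, not results from the thesis.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
print(binary_metrics(y_true, y_pred))
```

For a bias study, one natural use is to call the function once per demographic group and compare the resulting scores across groups.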
95

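Approche bayésienne de l'évaluation de l'incertitude de mesure : application aux comparaisons interlaboratoires / Bayesian approach for the evaluation of measurement uncertainty applied to interlaboratory comparisons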

Demeyer, Séverine 04 March 2011
Structural equation modelling is a widespread approach in a variety of domains and is applied here for the first time to interlaboratory comparisons in metrology. Structural equation models with latent variables (SEM) are multivariate models used to model causal relationships between observed variables (the data). It is assumed that the data can be grouped into disjoint blocks, each describing a concept modelled by a latent variable. The correlation structure of the observed variables is thus summarized in the correlation structure of the latent variables. A Bayesian approach to SEM is proposed, based on the analysis of the correlation matrix of the latent variables, using parameter expansion to overcome identifiability issues and to improve the convergence of the Gibbs sampler. SEM is used as a powerful and flexible tool to model measurement bias, with the aim of improving the reliability of the consensus value and its associated uncertainty in a fully Bayesian framework. Under additional hypotheses, the approach also makes it possible to compute the contributions of the observed variables to the bias of the laboratories. More generally, a global Bayesian framework is proposed to improve the quality of measurements. The approach is illustrated by the structural equation modelling of measurement bias in environmental interlaboratory comparisons.
96

Segmentace lézí roztroušené sklerózy pomocí hlubokých neuronových sítí / Segmentation of multiple sclerosis lesions using deep neural networks

Sasko, Dominik January 2021
The main aim of this master's thesis was the automatic segmentation of multiple sclerosis lesions in MRI scans. The thesis tested state-of-the-art segmentation methods based on deep neural networks and compared approaches to initializing network weights using transfer learning and self-supervised learning. Automatic segmentation of multiple sclerosis lesions is a very challenging problem, primarily because of the high imbalance of the dataset (brain scans usually contain only a small amount of damaged tissue). Another challenge is the manual annotation of these lesions, since two different doctors may mark different parts of the brain as damaged, and the Dice coefficient between such annotations is approximately 0.86. Simplifying the lesion annotation process through automation could improve the calculation of the amount of lesions, which could in turn improve the diagnosis of individual patients. Our goal was to design two techniques that use transfer learning to pre-train weights, which could later improve the results of current segmentation models. The theoretical part describes the taxonomy of artificial intelligence, machine learning and deep neural networks and their use in image segmentation. It then describes multiple sclerosis, its types, symptoms, diagnosis and treatment. The practical part begins with data preprocessing. First, the brain scans were adjusted to the same resolution with the same voxel size. This was necessary because three different datasets were used, in which the scans were acquired with different devices from different manufacturers. One dataset also included the skull, so it had to be removed with the FSL tool to leave only the patient's brain. We used 3D scans (FLAIR, T1 and T2 modalities), which were split into individual 2D slices and used as input to a neural network with an encoder-decoder architecture.
The training dataset contained 6720 slices with a resolution of 192 x 192 pixels (after removing slices whose mask contained no values). The loss function used was Combo loss (a combination of Dice loss and a modified cross-entropy). The first method focused on using weights pre-trained on the ImageNet dataset for the encoder of the U-Net architecture, with the encoder weights either frozen or unfrozen, and comparing the results with random weight initialization. In this case only the FLAIR modality was used. Transfer learning increased the monitored metric from approximately 0.4 to 0.6. The difference between frozen and unfrozen encoder weights was around 0.02. The second proposed technique used a self-supervised context encoder with Generative Adversarial Networks (GAN) to pre-train the weights. This network used all three modalities mentioned above, including slices with empty masks (23040 images in total). The task of the GAN was to complete a brain scan that had been covered by a black checkerboard-shaped mask. The weights learned in this way were then loaded into the encoder and applied to our segmentation problem. This experiment did not show better results, with DSC values of 0.29 and 0.09 (unfrozen and frozen encoder weights). The sharp drop in the metric may have been caused by using weights pre-trained on a distant problem (segmentation versus a self-supervised context encoder), as well as by the difficulty of the task due to the imbalanced dataset.
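The Dice coefficient (DSC) quoted in this abstract measures the overlap between two binary masks as 2|A ∩ B| / (|A| + |B|). A minimal sketch follows; the small masks are hypothetical, and returning 1.0 when both masks are empty is a convention assumed here rather than one taken from the thesis.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two equally sized binary 2D masks."""
    flat_a = [v for row in mask_a for v in row]
    flat_b = [v for row in mask_b for v in row]
    intersection = sum(a * b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * intersection / total

# Hypothetical lesion masks (1 = lesion pixel), e.g. two annotators or
# a model prediction against a manual annotation.
annotation = [[1, 1, 0],
              [0, 1, 0]]
prediction = [[1, 0, 0],
              [0, 1, 1]]
print(dice_coefficient(annotation, prediction))
```

Because the score depends only on lesion pixels, it stays informative on heavily imbalanced scans where almost every pixel is background.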
97

AUGMENTATION AND CLASSIFICATION OF TIME SERIES FOR FINDING ACL INJURIES

Johansson, Marie-Louise January 2022
This thesis addresses the problem of applying machine learning to a small data set of multivariate time series. Classification is challenging when the data set is small, because the classifier is at risk of overfitting. Augmenting small data sets might avoid overfitting. The multivariate time series used in this project represent motion data of people with reconstructed ACLs and of a control group. The approach was to pair motion data from the training set and use Euclidean Barycentric Averaging to create a new set of synthetic motion data, increasing the size of the training set. The classifiers used were Dynamic Time Warping with One Nearest Neighbour and Time Series Forest. In our example, this way of increasing the training set proved a less productive strategy. We also found that Time Series Forest generally performed with higher accuracy on the chosen data sets, but there may be more effective augmentation strategies to avoid overfitting.
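The pairing step can be sketched as follows, assuming all series have the same length and number of channels and that only series with the same class label are passed in together; both assumptions, along with the function names, are made here for illustration. The Euclidean barycenter of a set of series is simply their point-wise mean.

```python
from itertools import combinations

def euclidean_barycenter(series_list):
    """Point-wise mean of equally long multivariate time series.

    Each series is a list of time steps; each time step is a list of channel values.
    """
    length = len(series_list[0])
    if any(len(s) != length for s in series_list):
        raise ValueError("all series must have the same length")
    n = len(series_list)
    n_channels = len(series_list[0][0])
    return [[sum(s[t][c] for s in series_list) / n for c in range(n_channels)]
            for t in range(length)]

def augment_by_pairing(train_series):
    """Create one synthetic series per unordered pair of training series."""
    return [euclidean_barycenter([a, b])
            for a, b in combinations(train_series, 2)]

# Hypothetical two-channel motion recordings with two time steps each.
a = [[0.0, 2.0], [2.0, 4.0]]
b = [[2.0, 4.0], [4.0, 0.0]]
c = [[4.0, 0.0], [0.0, 2.0]]
print(euclidean_barycenter([a, b]))
print(len(augment_by_pairing([a, b, c])))
```

With n training series per class this yields n(n-1)/2 synthetic series, each lying between two real recordings, which may help explain why such data adds limited new information.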
99

Multi-fidelity Machine Learning for Perovskite Band Gap Predictions

Panayotis Thalis Manganaris (16384500) 16 June 2023
A wide range of optoelectronic applications demand semiconductors optimized for purpose. My research focused on data-driven identification of ABX3 Halide perovskite compositions for optimum photovoltaic absorption in solar cells. I trained machine learning models on previously reported datasets of halide perovskite band gaps based on first principles computations performed at different fidelities. Using these, I identified mixtures of candidate constituents at the A, B or X sites of the perovskite supercell which leveraged how mixed perovskite band gaps deviate from the linear interpolations predicted by Vegard's law of mixing to obtain a selection of stable perovskites with band gaps in the ideal range of 1 to 2 eV for visible light spectrum absorption. These models predict the perovskite band gap using the composition and inherent elemental properties as descriptors. This enables accurate, high fidelity prediction and screening of the much larger chemical space from which the data samples were drawn.
I utilized a recently published density functional theory (DFT) dataset of more than 1300 perovskite band gaps from four different levels of theory, added to an experimental perovskite band gap dataset of ~100 points, to train random forest regression (RFR), Gaussian process regression (GPR), and Sure Independence Screening and Sparsifying Operator (SISSO) regression models, with data fidelity added as one-hot encoded features. I found that RFR yields the best model with a band gap root mean square error of 0.12 eV on the total dataset and 0.15 eV on the experimental points. SISSO provided compound features and functions for direct prediction of band gap, but errors were larger than from RFR and GPR. Additional insights gained from Pearson correlation and Shapley additive explanation (SHAP) analysis of learned descriptors suggest the RFR models performed best because of (a) their focus on identifying and capturing relevant feature interactions and (b) their flexibility to represent nonlinear relationships between such interactions and the band gap. The best model was deployed for predicting experimental band gap of 37785 hypothetical compounds. Based on this, we identified 1251 stable compounds with band gap predicted to be between 1 and 2 eV at experimental accuracy, successfully narrowing the candidates to about 3% of the screened compositions.
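Two ideas from this abstract lend themselves to a short sketch: the linear baseline given by Vegard's law, against which the deviation of a mixed composition's band gap is measured, and the one-hot encoding of data fidelity as model features. The fidelity labels and band gap values below are hypothetical placeholders, since the abstract does not name the four levels of theory or list any values.

```python
def vegard_interpolation(fraction_a, gap_a, gap_b):
    """Linear (Vegard's law) estimate for a mixture with fraction_a of end member A."""
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must lie in [0, 1]")
    return fraction_a * gap_a + (1.0 - fraction_a) * gap_b

def one_hot_fidelity(label, levels):
    """Encode the fidelity label as a 0/1 vector ordered like `levels`."""
    if label not in levels:
        raise ValueError(f"unknown fidelity level: {label}")
    return [1 if label == level else 0 for level in levels]

# Hypothetical fidelity labels and band gaps in eV, for illustration only.
levels = ["theory_1", "theory_2", "theory_3", "theory_4", "experiment"]
linear_gap = vegard_interpolation(0.25, gap_a=2.0, gap_b=1.0)
observed_gap = 1.10
deviation = observed_gap - linear_gap  # negative: gap lies below the linear estimate
features = [0.25] + one_hot_fidelity("experiment", levels)
print(linear_gap, deviation, features)
```

Appending the fidelity vector to the composition features lets a single regressor learn from all data sources at once while still being asked for a prediction at the experimental level.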
100

[en] CONVOLUTIONAL NETWORKS APPLIED TO SEMANTIC SEGMENTATION OF SEISMIC IMAGES / [pt] REDES CONVOLUCIONAIS APLICADAS À SEGMENTAÇÃO SEMÂNTICA DE IMAGENS SÍSMICAS

MATEUS CABRAL TORRES 10 August 2021
[en] Through incremental improvements in a well-known convolutional neural network (U-Net), different techniques are evaluated regarding their performance on the task of semantic segmentation of seismic images. More specifically, the objective is the better identification and outline of subsurface salt structures, which is a task of great relevance for the oil and gas industry in the exploration of pre-salt layers, for example. Besides that application, the challenges imposed by the treatment of seismic images also resemble those found in medical fields like tumor detection and tissue segmentation, which makes the study of this task even more valuable. This work seeks to suggest a suitable methodology for the task and to yield neural networks that are capable of performing semantic segmentation of seismic images with good results regarding specific metrics. For that purpose, different network structures, transfer learning and data augmentation techniques are applied to two datasets with different levels of complexity.
