• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 87
  • 2
  • 1
  • 1
  • Tagged with
  • 106
  • 106
  • 106
  • 71
  • 47
  • 47
  • 34
  • 33
  • 30
  • 30
  • 28
  • 23
  • 20
  • 18
  • 16
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
91

GAN-Based Synthesis of Brain Tumor Segmentation Data : Augmenting a dataset by generating artificial images

Foroozandeh, Mehdi January 2020 (has links)
Machine learning applications within medical imaging often suffer from a lack of data, as a consequence of restrictions that hinder the free distribution of patient information. In this project, GANs (generative adversarial networks) are used to generate data synthetically, in an effort to circumvent this issue. The GAN framework PGAN is trained on the brain tumor segmentation dataset BraTS to generate new, synthetic brain tumor masks with the same visual characteristics as the real samples. The image-to-image translation network SPADE is subsequently trained on the image pairs in the real dataset, to learn a transformation from segmentation masks to brain MR images, and is in turn used to map the artificial segmentation masks generated by PGAN to corresponding artificial MR images. The images generated by these networks form a new, synthetic dataset, which is used to augment the original dataset. Different quantities of real and synthetic data are then evaluated in three different brain tumor segmentation tasks, where the image segmentation network U-Net is trained on this data to segment (real) MR images into the classes in question. The final segmentation performance of each training instance is evaluated over test data from the real dataset with the Weighted Dice Loss metric. The results indicate a slight increase in performance across all segmentation tasks evaluated in this project, when including some quantity of synthetic images. However, the differences were largest when the experiments were restricted to using only 20 % of the real data, and less significant when the full dataset was made available. A majority of the generated segmentation masks appear visually convincing to an extent (although somewhat noisy with regards to the intra-tumoral classes), while a relatively large proportion appear heavily noisy and corrupted. However, the translation of segmentation masks to MR images via SPADE proved more reliable and consistent.
92

Understanding, improving, and generalizing generative models

Jolicoeur-Martineau, Alexia 08 1900 (has links)
Les modèles génératifs servent à générer des échantillons d'une loi de probabilité (ex. : du texte, des images, de la musique, des vidéos, des molécules, et beaucoup plus) à partir d'un jeu de données (ex. : une banque d'images, de texte, ou autre). Entrainer des modèles génératifs est une tâche très difficile, mais ces outils ont un très grand potentiel en termes d'applications. Par exemple, dans le futur lointain, on pourrait envisager qu'un modèle puisse générer les épisodes d'une émission de télévision à partir d'un script et de voix générés par d'autres modèles génératifs. Il existe plusieurs types de modèles génératifs. Pour la génération d'images, l'approche la plus fructueuse est sans aucun doute la méthode de réseaux adverses génératifs (GANs). Les GANs apprennent à générer des images par un jeu compétitif entre deux joueurs, le Discriminateur et le Générateur. Le Discriminateur tente de prédire si une image est vraie ou fausse, tandis que le Générateur tente de générer des images plus réalistes en apprenant à faire croire au discriminateur que ces fausses images générées sont vraies. En complétant ce jeu, les GANs arrivent à générer des images presque photo-réalistes. Il est souvent possible pour des êtres humains de distinguer les fausses images (générés par les GANs) des vraies images (ceux venant du jeu de données), mais la tâche devient plus difficile au fur et à mesure que cette technologie s'améliore. Le plus gros défaut des GANs est que les données générées par les GANs manquent souvent de diversité (ex. : les chats au visage aplati sont rares dans la banque d'images, donc les GANs génèrent juste des races de chats plus fréquentes). Ces méthodes souvent aussi souvent très instables. Il y a donc encore beaucoup de chemin à faire avant l'obtention d'images parfaitement photo-réalistes et diverses. De nouvelles méthodes telles que les modèles de diffusion à la base de score semblent produire de meilleurs résultats que les GANs, donc tout n'est pas gagné pour les GANs. C'est pourquoi cette thèse n'est pas concentrée seulement sur les GANs, mais aussi sur les modèles de diffusion. Notez que cette thèse est exclusivement concentrée sur la génération de données continues (ex. : images, musique, vidéos) plutôt que discrètes (ex. : texte), car cette dernière fait usage de méthodes complètement différentes. Le premier objectif de cette thèse est d'étudier les modèles génératifs de façon théorique pour mieux les comprendre. Le deuxième objectif de cette thèse est d'inventer de nouvelles astuces (nouvelles fonctions objectives, régularisations, architectures, etc.) permettant d'améliorer les modèles génératifs. Le troisième objectif est de généraliser ces approches au-delà de leur formulation initiale, pour permettre la découverte de nouveaux liens entre différentes approches. Ma première contribution est de proposer un discriminateur relativiste qui estime la probabilité qu'une donnée réelle, soit plus réaliste qu'une donnée fausse (inventée par un modèle générateur). Les GANs relativistes forment une nouvelle classe de fonctions de perte qui apportent beaucoup de stabilité durant l'entrainement. Ma seconde contribution est de prouver que les GANs relativistes forment une mesure de dissimilarité. Ma troisième contribution est de concevoir une variante adverse au appariement de score pour produire des données de meilleure qualité avec les modèles de diffusion. Ma quatrième contribution est d'améliorer la vitesse de génération des modèles de diffusion par la création d'une méthode numérique de résolution pour équations différentielles stochastiques (SDEs). / Generative models are powerful tools to generate samples (e.g., images, music, text) from an unknown distribution given a finite set of examples. Generative models are hard to train successfully, but they have the potential to revolutionize arts, science, and business. These models can generate samples from various data types (e.g., text, images, audio, videos, 3d). In the future, we can envision generative models being used to create movies or episodes from a TV show given a script (possibly also generated by a generative model). One of the most successful methods for generating images is Generative Adversarial Networks (GANs). This approach consists of a game between two players, the Discriminator and the Generator. The goal of the Discriminator is to classify an image as real or fake, while the Generator attempts to fool the Discriminator into thinking that the fake images it generates are real. Through this game, GANs are able to generate very high-quality samples, such as photo-realistic images. Humans are still generally able to distinguish real images (from the training dataset) from fake images (generated by GANs), but the gap is lessening as GANs become better over time. The biggest weakness of GANs is that they have trouble generating diverse data representative of the full range of the data distribution. Thus, there is still much progress to be made before GANs reach their full potential. New methods performing better than GANs are also appearing. One prime example is score-based diffusion models. This thesis focuses on generative models that seemed promising at the time for continuous data generation: GANs and score-based diffusion models. I seek to improve generative models so that they reach their full potential (Objective 1: Improving) and to understand these approaches better on a theoretical level (Objective 2: Theoretical understanding). I also want to generalize these approaches beyond their original setting (Objective 3: Generalizing), allowing the discovery of new connections between different concepts/fields. My first contribution is to propose using a relativistic discriminator, which estimates the probability that a given real data is more realistic than a randomly sampled fake data. Relativistic GANs form a new class of GAN loss functions that are much more stable with respect to optimization hyperparameters. My second contribution is to take a more rigorous look at relativistic GANs and prove that they are proper statistical divergences. My third contribution is to devise an adversarial variant to denoising score matching, which leads to higher quality data with score-based diffusion models. My fourth contribution is to significantly improve the speed of score-based diffusion models through a carefully devised Stochastic Differential Equation (SDE) solver.
93

Generating Extreme Value Distributions in Finance using Generative Adversarial Networks / Generering av Extremvärdesfördelningar inom Finans med hjälp av Generativa Motstridande Nätverk

Nord-Nilsson, William January 2023 (has links)
This thesis aims to develop a new model for stress-testing financial portfolios using Extreme Value Theory (EVT) and General Adversarial Networks (GANs). The current practice of risk management relies on mathematical or historical models, such as Value-at-Risk and expected shortfall. The problem with historical models is that the data which is available for very extreme events is limited, and therefore we need a method to interpolate and extrapolate beyond the available range. EVT is a statistical framework that analyzes extreme events in a distribution and allows such interpolation and extrapolation, and GANs are machine-learning techniques that generate synthetic data. The combination of these two areas can generate more realistic stress-testing scenarios to help financial institutions manage potential risks better. The goal of this thesis is to develop a new model that can handle complex dependencies and high-dimensional inputs with different kinds of assets such as stocks, indices, currencies, and commodities and can be used in parallel with traditional risk measurements. The evtGAN algorithm shows promising results and is able to mimic actual distributions, and is also able to extrapolate data outside the available data range. / Detta examensarbete handlar om att utveckla en ny modell för stresstestning av finansiella portföljer med hjälp av extremvärdesteori (EVT) och Generative Adversarial Networks (GAN). Dom modeller för riskhantering som används idag bygger på matematiska eller historiska modeller, som till exempel Value-at-Risk och Expected Shortfall. Problemet med historiska modeller är att det finns begränsat med data för mycket extrema händelser. EVT är däremot en del inom statistisk som analyserar extrema händelser i en fördelning, och GAN är maskininlärningsteknik som genererar syntetisk data. Genom att kombinera dessa två områden kan mer realistiska stresstestscenarier skapas för att hjälpa finansiella institutioner att bättre hantera potentiella risker. Målet med detta examensarbete är att utveckla en ny modell som kan hantera komplexa beroenden i högdimensionell data med olika typer av tillgångar, såsom aktier, index, valutor och råvaror, och som kan användas parallellt med traditionella riskmått. Algoritmen evtGAN visar lovande resultat och kan imitera verkliga fördelningar samt extrapolera data utanför tillgänglig datamängd.
94

Frontiers of Large Language Models: Empowering Decision Optimization, Scene Understanding, and Summarization Through Advanced Computational Approaches

de Curtò i Díaz, Joaquim 23 January 2024 (has links)
Tesis por compendio / [ES] El advenimiento de los Large Language Models (LLMs) marca una fase transformadora en el campo de la Inteligencia Artificial (IA), significando el cambio hacia sistemas inteligentes y autónomos capaces de una comprensión y toma de decisiones complejas. Esta tesis profundiza en las capacidades multifacéticas de los LLMs, explorando sus posibles aplicaciones en la optimización de decisiones, la comprensión de escenas y tareas avanzadas de resumen de video en diversos contextos. En el primer segmento de la tesis, el foco está en la comprensión semántica de escenas de Vehículos Aéreos No Tripulados (UAVs). La capacidad de proporcionar instantáneamente datos de alto nivel y señales visuales sitúa a los UAVs como plataformas ideales para realizar tareas complejas. El trabajo combina el potencial de los LLMs, los Visual Language Models (VLMs), y los sistemas de detección objetos de última generación para ofrecer descripciones de escenas matizadas y contextualmente precisas. Se presenta una implementación práctica eficiente y bien controlada usando microdrones en entornos complejos, complementando el estudio con métricas de legibilidad estandarizadas propuestas para medir la calidad de las descripciones mejoradas por los LLMs. Estos avances podrían impactar significativamente en sectores como el cine, la publicidad y los parques temáticos, mejorando las experiencias de los usuarios de manera exponencial. El segundo segmento arroja luz sobre el problema cada vez más crucial de la toma de decisiones bajo incertidumbre. Utilizando el problema de Multi-Armed Bandits (MAB) como base, el estudio explora el uso de los LLMs para informar y guiar estrategias en entornos dinámicos. Se postula que el poder predictivo de los LLMs puede ayudar a elegir el equilibrio correcto entre exploración y explotación basado en el estado actual del sistema. A través de pruebas rigurosas, la estrategia informada por los LLMs propuesta demuestra su adaptabilidad y su rendimiento competitivo frente a las estrategias convencionales. A continuación, la investigación se centra en el estudio de las evaluaciones de bondad de ajuste de las Generative Adversarial Networks (GANs) utilizando la Signature Transform. Al proporcionar una medida eficiente de similitud entre las distribuciones de imágenes, el estudio arroja luz sobre la estructura intrínseca de las muestras generadas por los GANs. Un análisis exhaustivo utilizando medidas estadísticas como las pruebas de Kruskal-Wallis proporciona una comprensión más amplia de la convergencia de los GANs y la bondad de ajuste. En la sección final, la tesis introduce un nuevo benchmark para la síntesis automática de vídeos, enfatizando la integración armoniosa de los LLMs y la Signature Transform. Se propone un enfoque innovador basado en los componentes armónicos capturados por la Signature Transform. Las medidas son evaluadas extensivamente, demostrando ofrecer una precisión convincente que se correlaciona bien con el concepto humano de un buen resumen. Este trabajo de investigación establece a los LLMs como herramientas poderosas para abordar tareas complejas en diversos dominios, redefiniendo la optimización de decisiones, la comprensión de escenas y las tareas de resumen de video. No solo establece nuevos postulados en las aplicaciones de los LLMs, sino que también establece la dirección para futuros trabajos en este emocionante y rápidamente evolucionante campo. / [CA] L'adveniment dels Large Language Models (LLMs) marca una fase transformadora en el camp de la Intel·ligència Artificial (IA), significat el canvi cap a sistemes intel·ligents i autònoms capaços d'una comprensió i presa de decisions complexes. Aquesta tesi profunditza en les capacitats multifacètiques dels LLMs, explorant les seues possibles aplicacions en l'optimització de decisions, la comprensió d'escenes i tasques avançades de resum de vídeo en diversos contexts. En el primer segment de la tesi, el focus està en la comprensió semàntica d'escenes de Vehicles Aeris No Tripulats (UAVs). La capacitat de proporcionar instantàniament dades d'alt nivell i senyals visuals situa els UAVs com a plataformes ideals per a realitzar tasques complexes. El treball combina el potencial dels LLMs, els Visual Language Models (VLMs), i els sistemes de detecció d'objectes d'última generació per a oferir descripcions d'escenes matisades i contextualment precises. Es presenta una implementació pràctica eficient i ben controlada usant microdrons en entorns complexos, complementant l'estudi amb mètriques de llegibilitat estandarditzades proposades per a mesurar la qualitat de les descripcions millorades pels LLMs. Aquests avenços podrien impactar significativament en sectors com el cinema, la publicitat i els parcs temàtics, millorant les experiències dels usuaris de manera exponencial. El segon segment arroja llum sobre el problema cada vegada més crucial de la presa de decisions sota incertesa. Utilitzant el problema dels Multi-Armed Bandits (MAB) com a base, l'estudi explora l'ús dels LLMs per a informar i guiar estratègies en entorns dinàmics. Es postula que el poder predictiu dels LLMs pot ajudar a triar l'equilibri correcte entre exploració i explotació basat en l'estat actual del sistema. A través de proves rigoroses, l'estratègia informada pels LLMs proposada demostra la seua adaptabilitat i el seu rendiment competitiu front a les estratègies convencionals. A continuació, la recerca es centra en l'estudi de les avaluacions de bondat d'ajust de les Generative Adversarial Networks (GANs) utilitzant la Signature Transform. En proporcionar una mesura eficient de similitud entre les distribucions d'imatges, l'estudi arroja llum sobre l'estructura intrínseca de les mostres generades pels GANs. Una anàlisi exhaustiva utilitzant mesures estadístiques com les proves de Kruskal-Wallis proporciona una comprensió més àmplia de la convergència dels GANs i la bondat d'ajust. En la secció final, la tesi introdueix un nou benchmark per a la síntesi automàtica de vídeos, enfatitzant la integració harmònica dels LLMs i la Signature Transform. Es proposa un enfocament innovador basat en els components harmònics capturats per la Signature Transform. Les mesures són avaluades extensivament, demostrant oferir una precisió convincent que es correlaciona bé amb el concepte humà d'un bon resum. Aquest treball de recerca estableix els LLMs com a eines poderoses per a abordar tasques complexes en diversos dominis, redefinint l'optimització de decisions, la comprensió d'escenes i les tasques de resum de vídeo. No solament estableix nous postulats en les aplicacions dels LLMs, sinó que també estableix la direcció per a futurs treballs en aquest emocionant i ràpidament evolucionant camp. / [EN] The advent of Large Language Models (LLMs) marks a transformative phase in the field of Artificial Intelligence (AI), signifying the shift towards intelligent and autonomous systems capable of complex understanding and decision-making. This thesis delves deep into the multifaceted capabilities of LLMs, exploring their potential applications in decision optimization, scene understanding, and advanced summarization tasks in diverse contexts. In the first segment of the thesis, the focus is on Unmanned Aerial Vehicles' (UAVs) semantic scene understanding. The capability of instantaneously providing high-level data and visual cues positions UAVs as ideal platforms for performing complex tasks. The work combines the potential of LLMs, Visual Language Models (VLMs), and state-of-the-art detection pipelines to offer nuanced and contextually accurate scene descriptions. A well-controlled, efficient practical implementation of microdrones in challenging settings is presented, supplementing the study with proposed standardized readability metrics to gauge the quality of LLM-enhanced descriptions. This could significantly impact sectors such as film, advertising, and theme parks, enhancing user experiences manifold. The second segment brings to light the increasingly crucial problem of decision-making under uncertainty. Using the Multi-Armed Bandit (MAB) problem as a foundation, the study explores the use of LLMs to inform and guide strategies in dynamic environments. It is postulated that the predictive power of LLMs can aid in choosing the correct balance between exploration and exploitation based on the current state of the system. Through rigorous testing, the proposed LLM-informed strategy showcases its adaptability and its competitive performance against conventional strategies. Next, the research transitions into studying the goodness-of-fit assessments of Generative Adversarial Networks (GANs) utilizing the Signature Transform. By providing an efficient measure of similarity between image distributions, the study sheds light on the intrinsic structure of the samples generated by GANs. A comprehensive analysis using statistical measures, such as the test Kruskal-Wallis, provides a more extensive understanding of the GAN convergence and goodness of fit. In the final section, the thesis introduces a novel benchmark for automatic video summarization, emphasizing the harmonious integration of LLMs and Signature Transform. An innovative approach grounded in the harmonic components captured by the Signature Transform is put forth. The measures are extensively evaluated, proving to offer compelling accuracy that correlates well with the concept of a good summary. This research work establishes LLMs as powerful tools in addressing complex tasks across diverse domains, redefining decision optimization, scene understanding, and summarization tasks. It not only breaks new ground in the applications of LLMs but also sets the direction for future work in this exciting and rapidly evolving field. / De Curtò I Díaz, J. (2023). Frontiers of Large Language Models: Empowering Decision Optimization, Scene Understanding, and Summarization Through Advanced Computational Approaches [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/202200 / Compendio
95

<b>Advanced Algorithms for X-ray CT Image Reconstruction and Processing</b>

Madhuri Mahendra Nagare (17897678) 05 February 2024 (has links)
<p dir="ltr">X-ray computed tomography (CT) is one of the most widely used imaging modalities for medical diagnosis. Improving the quality of clinical CT images while keeping the X-ray dosage of patients low has been an active area of research. Recently, there have been two major technological advances in the commercial CT systems. The first is the use of Deep Neural Networks (DNN) to denoise and sharpen CT images, and the second is use of photon counting detectors (PCD) which provide higher spectral and spatial resolution compared to the conventional energy-integrating detectors. While both techniques have potential to improve the quality of CT images significantly, there are still challenges to improve the quality further.</p><p dir="ltr"><br></p><p dir="ltr">A denoising or sharpening algorithm for CT images must retain a favorable texture which is critically important for radiologists. However, commonly used methodologies in DNN training produce over-smooth images lacking texture. The lack of texture is a systematic error leading to a biased estimator.</p><p><br></p><p dir="ltr">In the first portion of this thesis, we propose three algorithms to reduce the bias, thereby to retain the favorable texture. The first method proposes a novel approach to designing a loss function that penalizes bias in the image more while training a DNN, producing more texture and detail in results. Our experiments verify that the proposed loss function outperforms the commonly used mean squared error loss function. The second algorithm proposes a novel approach to designing training pairs for a DNN-based sharpener. While conventional sharpeners employ noise-free ground truth producing over-smooth images, the proposed Noise Preserving Sharpening Filter (NPSF) adds appropriately scaled noise to both the input and the ground truth to keep the noise texture in the sharpened result similar to that of the input. Our evaluations show that the NPSF can sharpen noisy images while producing desired noise level and texture. The above two algorithms merely control the amount of texture retained and are not designed to produce texture that matches to a target texture. A Generative Adversarial Network (GAN) can produce the target texture. However, naive application of GANs can introduce inaccurate or even unreal image detail. Therefore, we propose a Texture Matching GAN (TMGAN) that uses parallel generators to separate anatomical features from the generated texture, which allows the GAN to be trained to match the target texture without directly affecting the underlying CT image. We demonstrate that TMGAN generates enhanced image quality while also producing texture that is desirable for clinical application.</p><p><br></p><p dir="ltr">In the second portion of this research, we propose a novel algorithm for the optimal statistical processing of photon-counting detector data for CT reconstruction. Current reconstruction and material decomposition algorithms for photon counting CT are not able to utilize simultaneously both the measured spectral information and advanced prior models. We propose a modular framework based on Multi-Agent Consensus Equilibrium (MACE) to obtain material decomposition and reconstructions using the PCD data. Our method employs a detector agent that uses PCD measurements to update an estimate along with a prior agent that enforces both physical and empirical knowledge about the material-decomposed sinograms. Importantly, the modular framework allows the two agents to be designed and optimized independently. Our evaluations on simulated data show promising results.</p>
96

Generative Adversarial Networks for Image-to-Image Translation on Street View and MR Images

Karlsson, Simon, Welander, Per January 2018 (has links)
Generative Adversarial Networks (GANs) is a deep learning method that has been developed for synthesizing data. One application for which it can be used for is image-to-image translations. This could prove to be valuable when training deep neural networks for image classification tasks. Two areas where deep learning methods are used are automotive vision systems and medical imaging. Automotive vision systems are expected to handle a broad range of scenarios which demand training data with a high diversity. The scenarios in the medical field are fewer but the problem is instead that it is difficult, time consuming and expensive to collect training data. This thesis evaluates different GAN models by comparing synthetic MR images produced by the models against ground truth images. A perceptual study is also performed by an expert in the field. It is shown by the study that the implemented GAN models can synthesize visually realistic MR images. It is also shown that models producing more visually realistic synthetic images not necessarily have better results in quantitative error measurements, when compared to ground truth data. Along with the investigations on medical images, the thesis explores the possibilities of generating synthetic street view images of different resolution, light and weather conditions. Different GAN models have been compared, implemented with our own adjustments, and evaluated. The results show that it is possible to create visually realistic images for different translations and image resolutions.
97

[pt] SINTETIZAÇÃO DE IMAGENS ÓTICAS MULTIESPECTRAIS A PARTIR DE DADOS SAR/ÓTICOS USANDO REDES GENERATIVAS ADVERSARIAS CONDICIONAIS / [en] SYNTHESIS OF MULTISPECTRAL OPTICAL IMAGES FROM SAR/OPTICAL MULTITEMPORAL DATA USING CONDITIONAL GENERATIVE ADVERSARIAL NETWORKS

JOSE DAVID BERMUDEZ CASTRO 08 April 2021 (has links)
[pt] Imagens óticas são frequentemente afetadas pela presença de nuvens. Com o objetivo de reduzir esses efeitos, diferentes técnicas de reconstrução foram propostas nos últimos anos. Uma alternativa comum é explorar dados de sensores ativos, como Radar de Abertura Sintética (SAR), dado que são pouco dependentes das condições atmosféricas e da iluminação solar. Por outro lado, as imagens SAR são mais difíceis de interpretar do que as imagens óticas, exigindo um tratamento específico. Recentemente, as Redes Adversárias Generativas Condicionais (cGANs - Conditional Generative Adversarial Networks) têm sido amplamente utilizadas para aprender funções de mapeamento que relaciona dados de diferentes domínios. Este trabalho, propõe um método baseado em cGANSs para sintetizar dados óticos a partir de dados de outras fontes, incluindo dados de múltiplos sensores, dados multitemporais e dados em múltiplas resoluções. A hipótese desse trabalho é que a qualidade das imagens geradas se beneficia do número de dados utilizados como variáveis condicionantes para a cGAN. A solução proposta foi avaliada em duas bases de dados. Foram utilizadas como variáveis condicionantes dados corregistrados SAR, de uma ou duas datas produzidos pelo sensor Sentinel 1, e dados óticos de sensores da série Sentinel 2 e LANDSAT, respectivamente. Os resultados coletados dos experimentos demonstraram que a solução proposta é capaz de sintetizar dados óticos realistas. A qualidade das imagens sintetizadas foi medida de duas formas: primeiramente, com base na acurácia da classificação das imagens geradas e, em segundo lugar, medindo-se a similaridade espectral das imagens sintetizadas com imagens de referência. Os experimentos confirmaram a hipótese de que o método proposto tende a produzir melhores resultados à medida que se exploram mais variáveis condicionantes para a cGAN. / [en] Optical images from Earth Observation are often affected by the presence of clouds. In order to reduce these effects, different reconstruction techniques have been proposed in recent years. A common alternative is to explore data from active sensors, such as Synthetic Aperture Radar (SAR), as they are nearly independent on atmospheric conditions and solar lighting. On the other hand, SAR images are more difficult to interpret than optical images, requiring specific treatment. Recently, conditional Generative Adversarial Networks (cGANs) have been widely used to learn mapping functions that relate data of different domains. This work proposes a method based on cGANs to synthesize optical data from data of other sources: data of multiple sensors, multitemporal data and data at multiple resolutions. The working hypothesis is that the quality of the generated images benefits from the number of data used as conditioning variables for cGAN. The proposed solution was evaluated in two databases. As conditioning data we used co-registered data from SAR at one or two dates produced by the Sentinel 1 sensor, and optical images produced by the Sentinel 2 and LANDSAT satellite series, respectively. The experimental results demonstrated that the proposed solution is able to synthesize realistic optical data. The quality of the synthesized images was measured in two ways: firstly, based on the classification accuracy of the generated images and, secondly, on the spectral similarity of the synthesized images with reference images. The experiments confirmed the hypothesis that the proposed method tends to produce better results as we explore more conditioning data for the cGANs.
98

Prédiction et génération de données structurées à l'aide de réseaux de neurones et de décisions discrètes

Dutil, Francis 08 1900 (has links)
No description available.
99

Multi-player games in the era of machine learning

Gidel, Gauthier 07 1900 (has links)
Parmi tous les jeux de société joués par les humains au cours de l’histoire, le jeu de go était considéré comme l’un des plus difficiles à maîtriser par un programme informatique [Van Den Herik et al., 2002]; Jusqu’à ce que ce ne soit plus le cas [Silveret al., 2016]. Cette percée révolutionnaire [Müller, 2002, Van Den Herik et al., 2002] fût le fruit d’une combinaison sophistiquée de Recherche arborescente Monte-Carlo et de techniques d’apprentissage automatique pour évaluer les positions du jeu, mettant en lumière le grand potentiel de l’apprentissage automatique pour résoudre des jeux. L’apprentissage antagoniste, un cas particulier de l’optimisation multiobjective, est un outil de plus en plus utile dans l’apprentissage automatique. Par exemple, les jeux à deux joueurs et à somme nulle sont importants dans le domain des réseaux génératifs antagonistes [Goodfellow et al., 2014] ainsi que pour maîtriser des jeux comme le Go ou le Poker en s’entraînant contre lui-même [Silver et al., 2017, Brown andSandholm, 2017]. Un résultat classique de la théorie des jeux indique que les jeux convexes-concaves ont toujours un équilibre [Neumann, 1928]. Étonnamment, les praticiens en apprentissage automatique entrainent avec succès une seule paire de réseaux de neurones dont l’objectif est un problème de minimax non-convexe et non-concave alors que pour une telle fonction de gain, l’existence d’un équilibre de Nash n’est pas garantie en général. Ce travail est une tentative d'établir une solide base théorique pour l’apprentissage dans les jeux. La première contribution explore le théorème minimax pour une classe particulière de jeux non-convexes et non-concaves qui englobe les réseaux génératifs antagonistes. Cette classe correspond à un ensemble de jeux à deux joueurs et a somme nulle joués avec des réseaux de neurones. Les deuxième et troisième contributions étudient l’optimisation des problèmes minimax, et plus généralement, les inégalités variationnelles dans le cadre de l’apprentissage automatique. Bien que la méthode standard de descente de gradient ne parvienne pas à converger vers l’équilibre de Nash de jeux convexes-concaves simples, il existe des moyens d’utiliser des gradients pour obtenir des méthodes qui convergent. Nous étudierons plusieurs techniques telles que l’extrapolation, la moyenne et la quantité de mouvement à paramètre négatif. La quatrième contribution fournit une étude empirique du comportement pratique des réseaux génératifs antagonistes. Dans les deuxième et troisième contributions, nous diagnostiquons que la méthode du gradient échoue lorsque le champ de vecteur du jeu est fortement rotatif. Cependant, une telle situation peut décrire un pire des cas qui ne se produit pas dans la pratique. Nous fournissons de nouveaux outils de visualisation afin d’évaluer si nous pouvons détecter des rotations dans comportement pratique des réseaux génératifs antagonistes. / Among all the historical board games played by humans, the game of go was considered one of the most difficult to master by a computer program [Van Den Heriket al., 2002]; Until it was not [Silver et al., 2016]. This odds-breaking break-through [Müller, 2002, Van Den Herik et al., 2002] came from a sophisticated combination of Monte Carlo tree search and machine learning techniques to evaluate positions, shedding light upon the high potential of machine learning to solve games. Adversarial training, a special case of multiobjective optimization, is an increasingly useful tool in machine learning. For example, two-player zero-sum games are important for generative modeling (GANs) [Goodfellow et al., 2014] and mastering games like Go or Poker via self-play [Silver et al., 2017, Brown and Sandholm,2017]. A classic result in Game Theory states that convex-concave games always have an equilibrium [Neumann, 1928]. Surprisingly, machine learning practitioners successfully train a single pair of neural networks whose objective is a nonconvex-nonconcave minimax problem while for such a payoff function, the existence of a Nash equilibrium is not guaranteed in general. This work is an attempt to put learning in games on a firm theoretical foundation. The first contribution explores minimax theorems for a particular class of nonconvex-nonconcave games that encompasses generative adversarial networks. The proposed result is an approximate minimax theorem for two-player zero-sum games played with neural networks, including WGAN, StarCrat II, and Blotto game. Our findings rely on the fact that despite being nonconcave-nonconvex with respect to the neural networks parameters, the payoff of these games are concave-convex with respect to the actual functions (or distributions) parametrized by these neural networks. The second and third contributions study the optimization of minimax problems, and more generally, variational inequalities in the context of machine learning. While the standard gradient descent-ascent method fails to converge to the Nash equilibrium of simple convex-concave games, there exist ways to use gradients to obtain methods that converge. We investigate several techniques such as extrapolation, averaging and negative momentum. We explore these techniques experimentally by proposing a state-of-the-art (at the time of publication) optimizer for GANs called ExtraAdam. We also prove new convergence results for Extrapolation from the past, originally proposed by Popov [1980], as well as for gradient method with negative momentum. The fourth contribution provides an empirical study of the practical landscape of GANs. In the second and third contributions, we diagnose that the gradient method breaks when the game’s vector field is highly rotational. However, such a situation may describe a worst-case that does not occur in practice. We provide new visualization tools in order to exhibit rotations in practical GAN landscapes. In this contribution, we show empirically that the training of GANs exhibits significant rotations around Local Stable Stationary Points (LSSP), and we provide empirical evidence that GAN training converges to a stable stationary point, which is a saddle point for the generator loss, not a minimum, while still achieving excellent performance.
100

Segmentace lézí roztroušené sklerózy pomocí hlubokých neuronových sítí / Segmentation of multiple sclerosis lesions using deep neural networks

Sasko, Dominik January 2021 (has links)
Hlavným zámerom tejto diplomovej práce bola automatická segmentácia lézií sklerózy multiplex na snímkoch MRI. V rámci práce boli otestované najnovšie metódy segmentácie s využitím hlbokých neurónových sietí a porovnané prístupy inicializácie váh sietí pomocou preneseného učenia (transfer learning) a samoriadeného učenia (self-supervised learning). Samotný problém automatickej segmentácie lézií sklerózy multiplex je veľmi náročný, a to primárne kvôli vysokej nevyváženosti datasetu (skeny mozgov zvyčajne obsahujú len malé množstvo poškodeného tkaniva). Ďalšou výzvou je manuálna anotácia týchto lézií, nakoľko dvaja rozdielni doktori môžu označiť iné časti mozgu ako poškodené a hodnota Dice Coefficient týchto anotácií je približne 0,86. Možnosť zjednodušenia procesu anotovania lézií automatizáciou by mohlo zlepšiť výpočet množstva lézií, čo by mohlo viesť k zlepšeniu diagnostiky individuálnych pacientov. Našim cieľom bolo navrhnutie dvoch techník využívajúcich transfer learning na predtrénovanie váh, ktoré by neskôr mohli zlepšiť výsledky terajších segmentačných modelov. Teoretická časť opisuje rozdelenie umelej inteligencie, strojového učenia a hlbokých neurónových sietí a ich využitie pri segmentácii obrazu. Následne je popísaná skleróza multiplex, jej typy, symptómy, diagnostika a liečba. Praktická časť začína predspracovaním dát. Najprv boli skeny mozgu upravené na rovnaké rozlíšenie s rovnakou veľkosťou voxelu. Dôvodom tejto úpravy bolo využitie troch odlišných datasetov, v ktorých boli skeny vytvárané rozličnými prístrojmi od rôznych výrobcov. Jeden dataset taktiež obsahoval lebku, a tak bolo nutné jej odstránenie pomocou nástroju FSL pre ponechanie samotného mozgu pacienta. Využívali sme 3D skeny (FLAIR, T1 a T2 modality), ktoré boli postupne rozdelené na individuálne 2D rezy a použité na vstup neurónovej siete s enkodér-dekodér architektúrou. Dataset na trénovanie obsahoval 6720 rezov s rozlíšením 192 x 192 pixelov (po odstránení rezov, ktorých maska neobsahovala žiadnu hodnotu). Využitá loss funkcia bola Combo loss (kombinácia Dice Loss s upravenou Cross-Entropy). Prvá metóda sa zameriavala na využitie predtrénovaných váh z ImageNet datasetu na enkodér U-Net architektúry so zamknutými váhami enkodéra, resp. bez zamknutia a následného porovnania s náhodnou inicializáciou váh. V tomto prípade sme použili len FLAIR modalitu. Transfer learning dokázalo zvýšiť sledovanú metriku z hodnoty približne 0,4 na 0,6. Rozdiel medzi zamknutými a nezamknutými váhami enkodéru sa pohyboval okolo 0,02. Druhá navrhnutá technika používala self-supervised kontext enkodér s Generative Adversarial Networks (GAN) na predtrénovanie váh. Táto sieť využívala všetky tri spomenuté modality aj s prázdnymi rezmi masiek (spolu 23040 obrázkov). Úlohou GAN siete bolo dotvoriť sken mozgu, ktorý bol prekrytý čiernou maskou v tvare šachovnice. Takto naučené váhy boli následne načítané do enkodéru na aplikáciu na náš segmentačný problém. Tento experiment nevykazoval lepšie výsledky, s hodnotou DSC 0,29 a 0,09 (nezamknuté a zamknuté váhy enkodéru). Prudké zníženie metriky mohlo byť spôsobené použitím predtrénovaných váh na vzdialených problémoch (segmentácia a self-supervised kontext enkodér), ako aj zložitosť úlohy kvôli nevyváženému datasetu.

Page generated in 0.1402 seconds