  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
81

Advanced Algorithms for X-ray CT Image Reconstruction and Processing

Madhuri Mahendra Nagare (17897678) 05 February 2024
X-ray computed tomography (CT) is one of the most widely used imaging modalities for medical diagnosis. Improving the quality of clinical CT images while keeping the X-ray dose to patients low has been an active area of research. Recently, there have been two major technological advances in commercial CT systems. The first is the use of Deep Neural Networks (DNN) to denoise and sharpen CT images, and the second is the use of photon counting detectors (PCD), which provide higher spectral and spatial resolution than conventional energy-integrating detectors. While both techniques have the potential to improve the quality of CT images significantly, challenges remain in improving the quality further.

A denoising or sharpening algorithm for CT images must retain a favorable texture, which is critically important for radiologists. However, commonly used methodologies in DNN training produce over-smooth images lacking texture. The lack of texture is a systematic error leading to a biased estimator.

In the first portion of this thesis, we propose three algorithms to reduce the bias and thereby retain the favorable texture. The first method proposes a novel approach to designing a loss function that penalizes bias in the image more heavily while training a DNN, producing more texture and detail in the results. Our experiments verify that the proposed loss function outperforms the commonly used mean squared error loss function. The second algorithm proposes a novel approach to designing training pairs for a DNN-based sharpener. While conventional sharpeners employ noise-free ground truth and therefore produce over-smooth images, the proposed Noise Preserving Sharpening Filter (NPSF) adds appropriately scaled noise to both the input and the ground truth to keep the noise texture in the sharpened result similar to that of the input. Our evaluations show that the NPSF can sharpen noisy images while producing the desired noise level and texture. These two algorithms merely control the amount of texture retained and are not designed to produce texture that matches a target texture. A Generative Adversarial Network (GAN) can produce the target texture. However, naive application of GANs can introduce inaccurate or even unreal image detail. Therefore, we propose a Texture Matching GAN (TMGAN) that uses parallel generators to separate anatomical features from the generated texture, which allows the GAN to be trained to match the target texture without directly affecting the underlying CT image. We demonstrate that TMGAN generates enhanced image quality while also producing texture that is desirable for clinical application.

In the second portion of this research, we propose a novel algorithm for the optimal statistical processing of photon-counting detector data for CT reconstruction. Current reconstruction and material decomposition algorithms for photon counting CT cannot simultaneously utilize both the measured spectral information and advanced prior models. We propose a modular framework based on Multi-Agent Consensus Equilibrium (MACE) to obtain material decomposition and reconstructions using the PCD data. Our method employs a detector agent that uses PCD measurements to update an estimate, along with a prior agent that enforces both physical and empirical knowledge about the material-decomposed sinograms. Importantly, the modular framework allows the two agents to be designed and optimized independently. Our evaluations on simulated data show promising results.
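The training-pair idea behind the NPSF can be illustrated with a small sketch. This is not the thesis's implementation: the 1-D signal, the moving-average blur standing in for the system blur, the Gaussian noise model, and the names `box_blur`, `make_npsf_pair`, `noise_sigma`, and `gt_noise_scale` are all assumptions chosen for illustration. The only point it shows is that the same noise realization, suitably scaled, appears in both the network input and the training target.

```python
import random


def box_blur(signal, radius=1):
    """Blur a 1-D signal with a moving average (assumed stand-in for system blur)."""
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def make_npsf_pair(clean, noise_sigma=0.05, gt_noise_scale=1.0, seed=0):
    """Build one (input, target) pair with noise in BOTH members.

    Hypothetical recipe: a conventional sharpener would use `clean` as the
    target; here scaled noise is added to the target as well, so a network
    trained on such pairs is not rewarded for removing the noise texture.
    """
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, noise_sigma) for _ in clean]
    blurred = box_blur(clean)
    net_input = [b + e for b, e in zip(blurred, noise)]
    target = [c + gt_noise_scale * e for c, e in zip(clean, noise)]
    return net_input, target
```

Setting `gt_noise_scale=0.0` recovers the conventional noise-free target, which makes the contrast between the two training setups easy to inspect.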
82

Generative Adversarial Networks for Image-to-Image Translation on Street View and MR Images

Karlsson, Simon, Welander, Per January 2018
Generative Adversarial Networks (GANs) are a deep learning method developed for synthesizing data. One application is image-to-image translation, which could prove valuable when training deep neural networks for image classification tasks. Two areas where deep learning methods are used are automotive vision systems and medical imaging. Automotive vision systems are expected to handle a broad range of scenarios, which demands highly diverse training data. The scenarios in the medical field are fewer, but collecting training data is difficult, time-consuming and expensive. This thesis evaluates different GAN models by comparing synthetic MR images produced by the models against ground truth images. A perceptual study is also performed by an expert in the field. The study shows that the implemented GAN models can synthesize visually realistic MR images. It also shows that models producing more visually realistic synthetic images do not necessarily achieve better results in quantitative error measurements against ground truth data. Along with the investigations on medical images, the thesis explores the possibilities of generating synthetic street view images at different resolutions and under different light and weather conditions. Different GAN models have been compared, implemented with our own adjustments, and evaluated. The results show that it is possible to create visually realistic images for different translations and image resolutions.
83

[pt] SINTETIZAÇÃO DE IMAGENS ÓTICAS MULTIESPECTRAIS A PARTIR DE DADOS SAR/ÓTICOS USANDO REDES GENERATIVAS ADVERSARIAS CONDICIONAIS / [en] SYNTHESIS OF MULTISPECTRAL OPTICAL IMAGES FROM SAR/OPTICAL MULTITEMPORAL DATA USING CONDITIONAL GENERATIVE ADVERSARIAL NETWORKS

JOSE DAVID BERMUDEZ CASTRO 08 April 2021
Optical images from Earth Observation are often affected by the presence of clouds. To reduce these effects, different reconstruction techniques have been proposed in recent years. A common alternative is to exploit data from active sensors, such as Synthetic Aperture Radar (SAR), as they are nearly independent of atmospheric conditions and solar lighting. On the other hand, SAR images are more difficult to interpret than optical images and require specific treatment. Recently, conditional Generative Adversarial Networks (cGANs) have been widely used to learn mapping functions that relate data from different domains. This work proposes a method based on cGANs to synthesize optical data from data of other sources: data from multiple sensors, multitemporal data and data at multiple resolutions. The working hypothesis is that the quality of the generated images benefits from the amount of data used as conditioning variables for the cGAN. The proposed solution was evaluated on two databases. As conditioning data we used co-registered SAR data from one or two dates produced by the Sentinel 1 sensor, and optical images produced by the Sentinel 2 and LANDSAT satellite series, respectively. The experimental results demonstrated that the proposed solution is able to synthesize realistic optical data. The quality of the synthesized images was measured in two ways: first, by the classification accuracy of the generated images and, second, by the spectral similarity of the synthesized images to reference images. The experiments confirmed the hypothesis that the proposed method tends to produce better results as more conditioning data are supplied to the cGAN.
84

Explaining Generative Adversarial Network Time Series Anomaly Detection using Shapley Additive Explanations

Cher Simon (18324174) 10 July 2024
Anomaly detection is an active research field that is widely applied in commercial settings to detect unusual patterns or outliers. Time series anomaly detection provides valuable insights into mission- and safety-critical applications using ever-growing temporal data, including continuous streaming time series data from the Internet of Things (IoT), sensor networks, healthcare, stock prices, computer metrics, and application monitoring. While Generative Adversarial Networks (GANs) demonstrate promising results in time series anomaly detection, the opaque nature of generative deep learning models limits explainability and hinders broader adoption. Understanding the rationale behind model predictions and providing human-interpretable explanations are vital for increasing confidence and trust in machine learning (ML) frameworks such as GANs. This study conducted a structured and comprehensive assessment of post-hoc local explainability in GAN-based time series anomaly detection using SHapley Additive exPlanations (SHAP). Using publicly available benchmarking datasets approved by Purdue’s Institutional Review Board (IRB), this study evaluated state-of-the-art GAN frameworks, identifying their advantages and limitations for time series anomaly detection. This study demonstrated a systematic approach to quantifying the extent of GAN-based time series anomaly explainability, providing insights for businesses considering the adoption of generative deep learning models. The presented results show that GANs capture complex temporal distributions in time series and are applicable to anomaly detection. The analysis shows that SHAP can identify the significance of contributing features within time series data and derive post-hoc explanations to quantify GAN-detected time series anomalies.
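The Shapley attribution that SHAP builds on can be computed exactly for a handful of features. The sketch below is an illustration only: `anomaly_score` is a hypothetical scoring function, not a GAN from the study, and `shapley_values` enumerates every feature ordering, which is feasible only for very small inputs (practical SHAP explainers approximate this computation).

```python
from itertools import permutations


def shapley_values(score, x, baseline):
    """Exact Shapley values of score(x) relative to score(baseline).

    Each feature's value is its marginal contribution, averaged over all
    orderings in which features are switched from baseline to observed.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = score(current)
        for i in order:
            current[i] = x[i]
            new = score(current)
            phi[i] += new - prev
            prev = new
    return [p / len(orderings) for p in phi]


def anomaly_score(window):
    """Hypothetical anomaly score over a 3-point window (assumption)."""
    a, b, c = window
    return (a - 1.0) ** 2 + (b - 1.0) ** 2 + 0.5 * a * c
```

By construction the attributions sum to `score(x) - score(baseline)`, so a spike in one time step shows up as the share of the anomaly score assigned to that step.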
85

Prédiction et génération de données structurées à l'aide de réseaux de neurones et de décisions discrètes

Dutil, Francis 08 1900
No description available.
86

Multi-player games in the era of machine learning

Gidel, Gauthier 07 1900
Among all the historical board games played by humans, the game of Go was considered one of the most difficult to master by a computer program [Van Den Herik et al., 2002], until it was not [Silver et al., 2016]. This odds-breaking breakthrough [Müller, 2002, Van Den Herik et al., 2002] came from a sophisticated combination of Monte Carlo tree search and machine learning techniques to evaluate positions, shedding light upon the high potential of machine learning to solve games. Adversarial training, a special case of multiobjective optimization, is an increasingly useful tool in machine learning. For example, two-player zero-sum games are important for generative modeling (GANs) [Goodfellow et al., 2014] and for mastering games like Go or Poker via self-play [Silver et al., 2017, Brown and Sandholm, 2017]. A classic result in game theory states that convex-concave games always have an equilibrium [Neumann, 1928]. Surprisingly, machine learning practitioners successfully train a single pair of neural networks whose objective is a nonconvex-nonconcave minimax problem, even though the existence of a Nash equilibrium is not guaranteed in general for such a payoff function. This work is an attempt to put learning in games on a firm theoretical foundation. The first contribution explores minimax theorems for a particular class of nonconvex-nonconcave games that encompasses generative adversarial networks. The proposed result is an approximate minimax theorem for two-player zero-sum games played with neural networks, including WGAN, StarCraft II, and the Blotto game. Our findings rely on the fact that, despite being nonconcave-nonconvex with respect to the neural network parameters, the payoff of these games is concave-convex with respect to the actual functions (or distributions) parametrized by these neural networks. The second and third contributions study the optimization of minimax problems and, more generally, variational inequalities in the context of machine learning. While the standard gradient descent-ascent method fails to converge to the Nash equilibrium of simple convex-concave games, there exist ways to use gradients to obtain methods that converge. We investigate several techniques such as extrapolation, averaging and negative momentum. We explore these techniques experimentally by proposing a state-of-the-art (at the time of publication) optimizer for GANs called ExtraAdam. We also prove new convergence results for extrapolation from the past, originally proposed by Popov [1980], as well as for the gradient method with negative momentum. The fourth contribution provides an empirical study of the practical landscape of GANs. In the second and third contributions, we diagnose that the gradient method breaks when the game’s vector field is highly rotational. However, such a situation may describe a worst case that does not occur in practice. We provide new visualization tools in order to exhibit rotations in practical GAN landscapes. In this contribution, we show empirically that the training of GANs exhibits significant rotations around Local Stable Stationary Points (LSSP), and we provide empirical evidence that GAN training converges to a stable stationary point that is a saddle point for the generator loss, not a minimum, while still achieving excellent performance.
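The failure of gradient descent-ascent on a simple convex-concave game, and the effect of extrapolation, can be reproduced in a few lines. The sketch below assumes the textbook bilinear game min over x, max over y of x·y, whose unique equilibrium is (0, 0); the function names, step size, and starting point are illustrative choices, not taken from the thesis.

```python
import math


def gda_step(x, y, lr):
    """Simultaneous gradient descent (on x) and ascent (on y) for f(x, y) = x * y."""
    return x - lr * y, y + lr * x


def extragradient_step(x, y, lr):
    """Extrapolate to a look-ahead point, then update using its gradient."""
    x_half, y_half = x - lr * y, y + lr * x
    return x - lr * y_half, y + lr * x_half


def run(step, steps=200, lr=0.1, start=(1.0, 1.0)):
    """Return the distance to the equilibrium (0, 0) after `steps` updates."""
    x, y = start
    for _ in range(steps):
        x, y = step(x, y, lr)
    return math.hypot(x, y)
```

For this game, each plain step multiplies the squared distance to the equilibrium by 1 + lr², so the iterates spiral outward, while each extragradient step multiplies it by 1 − lr² + lr⁴, which is below 1 for any step size between 0 and 1. The purely rotational vector field is exactly the regime the abstract identifies as problematic.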
87

Segmentace lézí roztroušené sklerózy pomocí hlubokých neuronových sítí / Segmentation of multiple sclerosis lesions using deep neural networks

Sasko, Dominik January 2021
The main aim of this master's thesis was the automatic segmentation of multiple sclerosis lesions in MRI images. The thesis tested state-of-the-art segmentation methods based on deep neural networks and compared approaches to initializing network weights using transfer learning and self-supervised learning. Automatic segmentation of multiple sclerosis lesions is a very demanding problem, primarily because of the strong imbalance of the dataset (brain scans usually contain only a small amount of damaged tissue). Another challenge is the manual annotation of these lesions, since two different doctors may mark different parts of the brain as damaged, and the Dice coefficient between such annotations is approximately 0.86. Simplifying the annotation process through automation could improve the quantification of lesions, which could in turn improve diagnosis for individual patients. Our goal was to propose two techniques that use transfer learning to pre-train weights, which could later improve the results of current segmentation models. The theoretical part describes the taxonomy of artificial intelligence, machine learning and deep neural networks and their use in image segmentation. It then describes multiple sclerosis, its types, symptoms, diagnosis and treatment. The practical part begins with data preprocessing. First, the brain scans were resampled to the same resolution with the same voxel size. This was necessary because three different datasets were used, in which the scans were acquired on different machines from different manufacturers. One dataset also included the skull, which had to be removed with the FSL tool so that only the patient's brain remained. We used 3D scans (FLAIR, T1 and T2 modalities), which were split into individual 2D slices and used as input to a neural network with an encoder-decoder architecture. The training dataset contained 6720 slices with a resolution of 192 x 192 pixels (after removing slices whose mask contained no values). The loss function used was Combo loss (a combination of Dice loss and a modified cross-entropy). The first method focused on using weights pre-trained on the ImageNet dataset in the encoder of a U-Net architecture, with the encoder weights either frozen or unfrozen, and comparing the results with random weight initialization. In this case we used only the FLAIR modality. Transfer learning raised the monitored metric from approximately 0.4 to 0.6. The difference between frozen and unfrozen encoder weights was around 0.02. The second proposed technique used a self-supervised context encoder with Generative Adversarial Networks (GAN) to pre-train the weights. This network used all three modalities mentioned above, including slices with empty masks (23040 images in total). The task of the GAN was to fill in a brain scan that had been covered by a black checkerboard-shaped mask. The weights learned in this way were then loaded into the encoder and applied to our segmentation problem. This experiment did not yield better results, with DSC values of 0.29 and 0.09 (unfrozen and frozen encoder weights, respectively). The sharp drop in the metric may have been caused by using weights pre-trained on a distant problem (segmentation versus a self-supervised context encoder), as well as by the difficulty of the task due to the imbalanced dataset.
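The Dice coefficient (DSC) reported in this abstract, and the reason it is preferred over plain accuracy on imbalanced masks, can be shown with a short sketch. The function names and the flat-list mask representation are assumptions for illustration; the thesis's own evaluation code is not shown in the abstract.

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient between two flat binary masks (lists of 0/1)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # eps keeps the ratio defined when both masks are empty
    return (2.0 * intersection + eps) / (total + eps)


def pixel_accuracy(pred, target):
    """Fraction of positions where the two masks agree."""
    matches = sum(1 for p, t in zip(pred, target) if p == t)
    return matches / len(target)
```

On a 100-pixel slice with a single lesion pixel, an all-background prediction agrees with the annotation on 99 of 100 pixels yet has a Dice score near zero, which is why lesion segmentation is scored and trained with Dice-based measures.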
88

Improving Brain Tumor Segmentation using synthetic images from GANs

Nijhawan, Aashana January 2021
Artificial intelligence (AI) has attracted a great deal of hype in recent years, and more so now in the field of diagnostic medical imaging. AI-based diagnoses have shown improvements in detecting the smallest abnormalities present in tumors and lesions, which can tremendously help public healthcare. Hospitals hold a large amount of biomedical imaging data, but only a small amount is available for research because of data and privacy protection. Manually segmenting tumors in magnetic resonance imaging (MRI) can be expensive and time-consuming. Segmentation and classification require high precision and are usually performed by medical experts who follow clinical standards. Because so little data is available, machine learning models trained on it tend to overfit. With advancing deep learning techniques, it is possible to generate images using Generative Adversarial Networks (GANs). GANs have garnered a great deal of attention for their ability to produce realistic-looking images, video, and audio. This thesis aims to use synthetic images generated by progressive growing GANs (PGGAN) along with real images to perform segmentation on brain tumor MRI. The idea is to investigate whether adding this synthetic data improves the segmentation significantly. To analyze the quality of the images produced by the PGGAN, the Multi-Scale Structural Similarity Index Measure (MS-SSIM) and Sliced Wasserstein Distance (SWD) are recorded. To examine the segmentation performance, the Dice Similarity Coefficient (DSC) and accuracy scores are observed. To test whether the improvement from synthetic images is significant, a parametric paired t-test and a non-parametric permutation test are used. Adding synthetic images to real images makes a significant difference in most cases compared with using only real images. However, this addition of synthetic images makes the model uncertain. The models’ robustness is tested using training-free uncertainty estimation of neural networks.
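A non-parametric paired permutation test of the kind mentioned above can be sketched as follows. This is a generic sign-flip version under stated assumptions, not the thesis's procedure: `paired_permutation_test`, its parameters, and the use of the mean paired difference as the test statistic are illustrative choices.

```python
import random


def paired_permutation_test(a, b, n_resamples=10000, seed=0):
    """Two-sided sign-flip permutation test on paired scores.

    Under the null hypothesis the two conditions are exchangeable within
    each pair, so the sign of every paired difference is arbitrary.
    Returns a Monte Carlo p-value for the absolute mean difference.
    """
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(n_resamples):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= observed:
            hits += 1
    # add-one correction keeps the estimate strictly positive
    return (hits + 1) / (n_resamples + 1)
```

Here `a` and `b` would hold per-case scores (for example DSC per test volume) for the two training setups, paired by case. The test makes no normality assumption, which is why it complements the paired t-test.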
89

Adversarial games in machine learning : challenges and applications

Berard, Hugo 08 1900
Many machine learning (ML) problems can be formulated as minimization problems, and a large optimization literature provides algorithms and guarantees for solving this type of problem. Recently, however, some ML problems have been proposed that cannot be formulated as minimization problems and instead require defining a game between several players, each with a different objective. A successful application of such games in ML is generative adversarial networks (GANs), where generative modeling is formulated as a game between a generator and a discriminator: the goal of the generator is to fool the discriminator, while the discriminator tries to distinguish between fake and real samples. However, due to the adversarial nature of the game, GANs are notoriously hard to train, requiring careful fine-tuning of the hyperparameters and often leading to unstable training. While regularization techniques have been proposed to stabilize training, we propose in this thesis to look at these instabilities from an optimization perspective. We start by bridging the gap between the machine learning and optimization literature by casting GANs as an instance of the Variational Inequality Problem (VIP), and we leverage the large literature on VIPs to derive more efficient and stable algorithms for training GANs. To better understand the challenges of training GANs, we then propose tools to study the optimization landscape of GANs. Using these tools, we show that GANs do suffer from rotation around their equilibrium and that they rarely converge to Nash equilibria, converging instead to local stable stationary points (LSSP). Finally, inspired by the success of GANs at generating images, we propose a new type of game called Adversarial Example Games, in which a generator and a critic are trained simultaneously: the generator seeks to perturb examples to mislead the critic, while the critic seeks to be robust to the perturbations. We show that, at the equilibrium of this game, the generator can produce adversarial examples that transfer across different models and architectures.
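The variational-inequality view of a two-player game can be stated compactly. The notation below is an assumption for illustration, since the abstract does not fix symbols: $\theta$ and $\varphi$ are the generator and discriminator parameters, $\mathcal{L}_G$ and $\mathcal{L}_D$ their respective losses, and $\omega = (\theta, \varphi)$ the joint parameter vector in a feasible set $\Omega$.

```latex
F(\omega) =
\begin{pmatrix}
  \nabla_\theta \mathcal{L}_G(\theta, \varphi) \\
  \nabla_\varphi \mathcal{L}_D(\theta, \varphi)
\end{pmatrix},
\qquad
\text{find } \omega^* \in \Omega \text{ such that }
F(\omega^*)^\top (\omega - \omega^*) \ge 0
\quad \forall\, \omega \in \Omega .
```

When $\Omega$ is the whole parameter space, the condition reduces to $F(\omega^*) = 0$: both players are simultaneously stationary, which is a different requirement from minimizing any single loss.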
90

Scene Reconstruction From 4D Radar Data with GAN and Diffusion : A Hybrid Method Combining GAN and Diffusion for Generating Video Frames from 4D Radar Data / Scenrekonstruktion från 4D-radardata med GAN och Diffusion : En Hybridmetod för Generation av Bilder och Video från 4D-radardata med GAN och Diffusionsmodeller

Djadkin, Alexandr January 2023
4D Imaging Radar is increasingly becoming a critical component in various industries due to beamforming technology and hardware advancements. However, it does not replace visual data in the form of 2D images captured by an RGB camera. Instead, 4D radar point clouds are a complementary data source that captures spatial information and velocity in a Doppler dimension that cannot be easily captured by a camera's view alone. Some discriminative features of the scene captured by the two sensors are hypothesized to have a shared representation. Therefore, a more interpretable visualization of the radar output can be obtained by learning a mapping from the empirical distribution of the radar to the distribution of images captured by the camera. To this end, the application of deep generative models to generate images conditioned on 4D radar data is explored. Two approaches that have become state-of-the-art in recent years are tested: generative adversarial networks and diffusion models. They are compared qualitatively through visual inspection and by two quantitative metrics: mean squared error and object detection count. It is found that it is easier to control the generative adversarial network's generative process through conditioning than in a diffusion process. In contrast, the diffusion model produces samples of higher quality and is more stable to train. Furthermore, their combination results in a hybrid sampling method, achieving the best results while simultaneously speeding up the diffusion process. […] the identified limitations of the individual models, and curating the data to compare how these models scale with larger datasets and more variation.
