1

Unpaired Skeleton-to-Photo Translation for Sketch-to-Photo Synthesis

Gu, Yuanzhe 28 October 2022 (has links) (PDF)
Sketch-to-photo synthesis usually suffers from a lack of labeled data, so we propose methods based on CycleGAN to train a model that translates sketches to photos using unpaired data. Our main contribution is the proposed Sketch-to-Skeleton-to-Image (SSI) method, which performs skeletonization on sketches to reduce variance in the sketch data. We also experimented with different representations of the skeleton and different models for our task. Experimental results show that the quality of the generated images correlates negatively with the sparsity of the input data.
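The cycle-consistency idea underlying such CycleGAN-based training can be sketched in a few lines of numpy. The toy "generators" below are simple invertible functions standing in for the real neural networks, purely to illustrate the loss, not the thesis's models:

```python
import numpy as np

def cycle_consistency_loss(x, g_ab, g_ba):
    """L1 cycle loss: translate A -> B -> A and compare to the input.
    g_ab and g_ba are placeholders for trained generator networks."""
    reconstruction = g_ba(g_ab(x))
    return np.mean(np.abs(reconstruction - x))

# Toy generators (exact inverses of each other) just to exercise the function.
g_ab = lambda v: v * 2.0
g_ba = lambda v: v / 2.0

x = np.random.rand(1, 64, 64)      # a dummy "sketch" image
loss = cycle_consistency_loss(x, g_ab, g_ba)
# Perfectly inverse generators give (near-)zero cycle loss.
```

In actual unpaired training this term is added to the adversarial losses of both generators, which is what lets the model learn without paired sketch/photo examples.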
2

GANtruth – a regularization method for unsupervised image-to-image translation

Bujwid, Sebastian January 2018 (has links)
In this work, we propose a novel and effective method for constraining the output space of the ill-posed problem of unsupervised image-to-image translation. We make the assumption that the environment of the source domain is known, and we propose to explicitly enforce preservation of the ground-truth labels on the images translated from the source to the target domain. We run empirical experiments on preserving information such as semantic segmentation and disparity and show evidence that our method achieves improved performance over the baseline model UNIT on translating images from SYNTHIA to Cityscapes. The generated images are perceived as more realistic in human surveys and have reduced errors when using them as adapted images in the domain adaptation scenario. Moreover, the underlying ground-truth preservation assumption is complementary to alternative approaches and by combining it with the UNIT framework, we improve the results even further.
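The ground-truth preservation constraint can be illustrated as a pixel-wise cross-entropy term: a (hypothetical) fixed segmentation network scores the translated image, and its predictions are penalized for disagreeing with the source image's known labels. A numpy sketch under that assumption, with toy logits in place of real network outputs:

```python
import numpy as np

def label_preservation_loss(seg_logits, gt_labels):
    """Pixel-wise cross-entropy between a segmenter's class scores on the
    translated image and the ground-truth labels of the source image.
    seg_logits: (H, W, C) class scores; gt_labels: (H, W) integer labels."""
    # Numerically stable log-softmax over the class axis.
    z = seg_logits - seg_logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    h, w, _ = seg_logits.shape
    picked = log_probs[np.arange(h)[:, None], np.arange(w)[None, :], gt_labels]
    return -picked.mean()

# Toy example: 2x2 image, 3 classes, confident correct predictions.
logits = np.zeros((2, 2, 3))
labels = np.array([[0, 1], [2, 0]])
for i in range(2):
    for j in range(2):
        logits[i, j, labels[i, j]] = 10.0   # strongly favor the true class
loss = label_preservation_loss(logits, labels)   # near zero when correct
```

In the thesis this kind of term is added to the translation objective so that semantic content survives the domain transfer; the exact loss used there may differ from this sketch.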
3

Controllable Visual Synthesis

AlBahar, Badour A. Sh A. 08 June 2023 (has links)
Computer graphics has become an integral part of various industries such as entertainment (i.e., films and content creation), fashion (i.e., virtual try-on), and video games. It has evolved tremendously over the past years, showing remarkable improvement in image generation, from low-quality, pixelated images with limited detail to highly realistic images with fine detail that can often be mistaken for real photographs. However, the traditional pipeline for rendering an image in computer graphics is complex and time-consuming. The whole process of creating the geometry, materials, and textures requires not only time but also significant expertise. In this work, we aim to replace this complex traditional computer graphics pipeline with a simple machine learning model that can synthesize realistic images without requiring expertise or significant time and effort. Specifically, we address the problem of controllable image synthesis. We propose several approaches that allow the user to synthesize realistic content and manipulate images to achieve their desired goals with ease and flexibility. / Doctor of Philosophy
4

A Deep Learning Approach to Predict Full-Field Stress Distribution in Composite Materials

Sepasdar, Reza 17 May 2021 (has links)
This thesis proposes a deep learning approach to predict stress at various stages of mechanical loading in 2-D representations of fiber-reinforced composites. More specifically, the full-field stress distribution in the elastic regime and at an early stage of damage initiation is predicted based on the microstructural geometry. The data set required for training and validation is generated via high-fidelity simulations of several randomly generated microstructural representations with complex geometries. Two deep learning approaches are employed and their performances compared: a fully convolutional generator and Pix2Pix translation. It is shown that both approaches can predict the stress distributions at the designated loading stages with high accuracy. / M.S. / Fiber-reinforced composites are materials with excellent mechanical performance. They are the major material in the construction of space shuttles, aircraft, high-performance cars, etc.: structures designed to be lightweight and at the same time extremely stiff and strong. Because of this broad application, especially in sensitive industries, fiber-reinforced composites have always been a subject of meticulous research. Studies aiming to better understand the mechanical behavior of these composites have to be conducted at the micro-scale. Since experimental studies at the micro-scale are expensive and extremely limited, numerical simulations are normally adopted. Numerical simulations, however, are complex, time-consuming, and highly computationally expensive even when run on powerful supercomputers. Hence, this research aims to leverage artificial intelligence to reduce the complexity and computational cost associated with existing high-fidelity simulation techniques.
We propose a robust deep learning framework that can be used as a replacement for conventional numerical simulations to predict important mechanical attributes of fiber-reinforced composite materials at the micro-scale. The proposed framework is shown to have high accuracy in predicting complex phenomena, including stress distributions at various stages of mechanical loading.
5

Unsupervised Image-to-image translation : Taking inspiration from human perception

Sveding, Jens Jakob January 2021 (has links)
Generative artificial intelligence is a field of artificial intelligence in which systems learn underlying patterns in previously seen content and generate new content. This thesis explores a generative technique for image-to-image translation called the Cycle-consistent Adversarial Network (CycleGAN), which can translate images from one domain into another and is a state-of-the-art technique for unsupervised image-to-image translation. It uses the concept of cycle-consistency to learn a mapping between image distributions, where the mean absolute error function is used to compare images and thereby learn an underlying mapping between the two image distributions. In this work, we propose to use the Structural Similarity Index Measure (SSIM) as an alternative to the mean absolute error function. SSIM is a metric inspired by human perception, which measures the difference between two images by comparing differences in contrast, luminance, and structure. We examine whether using SSIM as the cycle-consistency loss in the CycleGAN improves the quality of generated images as measured by the Inception Score and the Fréchet Inception Distance, both of which have been proposed as methods for evaluating the quality of images generated by generative adversarial networks (GANs). We conduct a controlled experiment to collect these quantitative metrics. Our results suggest that using SSIM as the cycle-consistency loss in the CycleGAN will, in most cases, improve the image quality of generated images as measured by the Inception Score and the Fréchet Inception Distance.
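The SSIM comparison described above can be sketched with whole-image statistics. Note this is a simplification: the standard SSIM averages the same statistic over local sliding windows, whereas this numpy version computes it once over the entire image:

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Simplified, whole-image SSIM. x, y: arrays with values in [0, data_range].
    Combines luminance (means), contrast (variances) and structure (covariance)."""
    c1 = (0.01 * data_range) ** 2     # stabilizing constants from the SSIM paper
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

img = np.random.rand(32, 32)
identical = global_ssim(img, img)     # identical images score 1.0
noisy = global_ssim(img, np.clip(img + 0.3 * np.random.rand(32, 32), 0, 1))
```

In the thesis's setting, `1 - SSIM` would replace the mean-absolute-error term as the cycle-consistency loss, so that reconstructions are judged by perceptual structure rather than per-pixel differences.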
6

Domain Adaptation for Multi-Contrast Image Segmentation in Cardiac Magnetic Resonance Imaging

Proudhon, Thomas January 2023 (has links)
Accurate segmentation of the ventricles and myocardium on Cardiac Magnetic Resonance (CMR) images is crucial to assess the functioning of the heart or to diagnose patients suffering from myocardial infarction. However, the domain shift between the multiple sequences of CMR data prevents a deep learning model trained on a specific contrast from being used on a different sequence. Domain adaptation can address this issue by alleviating the domain shift between different CMR contrasts, such as Balanced Steady-State Free Precession (bSSFP) and Late Gadolinium Enhancement (LGE) sequences. The aim of this degree project is to apply domain adaptation to perform unsupervised segmentation of cardiac structures on LGE sequences. A style-transfer model based on generative adversarial networks is trained to achieve modality-to-modality translation between LGE and bSSFP contrasts. Then, a supervised segmentation model is developed to segment the myocardium and the left and right ventricles on bSSFP data. Final segmentation is performed on synthetic bSSFP images obtained by translating LGE images. Our method shows a significant increase in Dice score compared to direct segmentation of LGE data. In conclusion, the results demonstrate that domain adaptation based on information from complementary CMR sequences is a successful approach to unsupervised segmentation of Late Gadolinium Enhancement images.
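The Dice score used to evaluate the segmentations above measures the overlap between a predicted and a reference mask. A minimal numpy version for binary masks:

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient for binary masks: 2|A ∩ B| / (|A| + |B|).
    eps avoids division by zero when both masks are empty."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

a = np.array([[1, 1, 0, 0]])
b = np.array([[1, 0, 0, 0]])
score = dice_score(a, b)   # 2*1 / (2+1) = 2/3
```

For multi-class cardiac segmentation (myocardium, left ventricle, right ventricle), the score is typically computed per class on each one-vs-rest binary mask and then averaged.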
7

Using Generative Adversarial Networks for H&E-to-HER2 Stain Translation in Digital Pathology Images

Tirmén, William January 2023 (has links)
In digital pathology, hematoxylin & eosin (H&E) is a routine stain performed on most clinical cases, and it often provides clinicians with sufficient information for diagnosis. However, when making decisions on how to guide breast cancer treatment, immunohistochemical staining for human epidermal growth factor receptor 2 (HER2 staining) is also needed. Over-expression of the HER2 protein plays a significant role in the progression of breast cancer and is therefore important to consider during treatment planning. The downside of HER2 staining is that it is both time-consuming and rather expensive. This thesis explores the possibility of H&E-to-HER2 stain translation using generative adversarial networks (GANs). If effective, this has the potential to reduce the costs and time spent on tissue processing while still providing clinicians with the images necessary to make a complete diagnosis. To explore this area, two supervised (Pix2Pix, Pyramid Pix2Pix) and one unsupervised (CycleGAN) GAN structures were implemented and trained on digital pathology images from the MIST dataset. These models were each trained twice, with 256x256 and 512x512 patches, to see what effect patch size has on stain translation performance. In addition, a methodology for evaluating the quality of the generated HER2 patches was presented and applied. It consists of structural similarity index (SSIM) and peak signal-to-noise ratio (PSNR) comparisons to the ground truth, together with a HER2 status classification protocol. In the latter, a classification tool provided by Sectra was used to assign each patch a HER2 status of No tumor, 1+, 2+ or 3+, and the statuses of the generated patches were then compared to those of the ground truths. The results show that the supervised Pyramid Pix2Pix model trained on 512x512 patches performs best according to the SSIM and PSNR metrics.
However, the unsupervised CycleGAN model shows more promising results in both visual assessment and the HER2 status classification protocol, especially when trained on 256x256 patches for 200 epochs, which gave an accuracy of 0.655, an F1-score of 0.674 and an MCC of 0.490. In conclusion, the HER2 status classification protocol is deemed a suitable way to evaluate H&E-to-HER2 stain translation, and by that measure the unsupervised method is considered better than the supervised ones. Moreover, a smaller patch size results in worse translation of cellular structure for the supervised methods. Further studies should focus on incorporating HER2 status classification in the CycleGAN loss function and on more extensive training runs to further improve the quality of H&E-to-HER2 stain translation.
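The PSNR metric used alongside SSIM above is derived from the mean squared error between the generated patch and its ground truth. A short numpy sketch:

```python
import numpy as np

def psnr(x, y, data_range=255.0):
    """Peak signal-to-noise ratio in dB between two images of the same shape."""
    mse = np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2)
    if mse == 0:
        return np.inf      # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

clean = np.full((16, 16), 128.0)
noisy = clean + 16.0            # constant error of 16 -> MSE = 256
value = psnr(clean, noisy)      # 10*log10(255^2/256) ≈ 24.05 dB
```

Higher is better; unlike SSIM, PSNR is purely pixel-wise, which is why the thesis complements both metrics with the HER2 status classification protocol.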
8

Learning to Generate Things and Stuff: Guided Generative Adversarial Networks for Generating Human Faces, Hands, Bodies, and Natural Scenes

Tang, Hao 27 May 2021 (has links)
In this thesis, we mainly focus on image generation. However, one can still observe unsatisfying results produced by existing state-of-the-art methods. To address this limitation and further improve the quality of generated images, we propose a few novel models. The image generation task can be roughly divided into three subtasks, i.e., person image generation, scene image generation, and cross-modal translation. Person image generation can be further divided into three subtasks, namely, hand gesture generation, facial expression generation, and person pose generation. Meanwhile, scene image generation can be further divided into two subtasks, i.e., cross-view image translation and semantic image synthesis. For each task, we have proposed the corresponding solution. Specifically, for hand gesture generation, we have proposed the GestureGAN framework. For facial expression generation, we have proposed the Cycle-in-Cycle GAN (C2GAN) framework. For person pose generation, we have proposed the XingGAN and BiGraphGAN frameworks. For cross-view image translation, we have proposed the SelectionGAN framework. For semantic image synthesis, we have proposed the Local and Global GAN (LGGAN), EdgeGAN, and Dual Attention GAN (DAGAN) frameworks. Although each method was originally proposed for a certain task, we later discovered that each method is universal and can be used to solve different tasks. For instance, GestureGAN can be used to solve both hand gesture generation and cross-view image translation tasks. C2GAN can be used to solve facial expression generation, person pose generation, hand gesture generation, and cross-view image translation. SelectionGAN can be used to solve cross-view image translation, facial expression generation, person pose generation, hand gesture generation, and semantic image synthesis. Moreover, we explore cross-modal translation and propose a novel DanceGAN for audio-to-video translation.
9

Exploring Multi-Domain and Multi-Modal Representations for Unsupervised Image-to-Image Translation

Liu, Yahui 20 May 2022 (has links)
Unsupervised image-to-image translation (UNIT) is a challenging task in the image manipulation field, where input images in one visual domain are mapped into another domain with desired visual patterns (also called styles). An ideal direction in this field is to build a model that can map an input image to multiple target domains and generate diverse outputs in each target domain, termed multi-domain and multi-modal unsupervised image-to-image translation (MMUIT). Recent studies have shown remarkable results in UNIT but suffer from four main limitations: (1) State-of-the-art UNIT methods are either built from several two-domain mappings that must be learned independently, or they generate low-diversity results, a phenomenon also known as mode collapse. (2) Most manipulation relies on visual maps or discrete labels without exploring natural language, which could be more scalable and flexible in practice. (3) In an MMUIT system, the style latent space is usually disentangled between every pair of image domains. While interpolations within a domain are smooth, interpolating between two randomly sampled style representations from two different domains often produces unrealistic images with artifacts. Improving the smoothness of the style latent space can yield gradual interpolations between any two style representations, even across domains. (4) It is expensive to train MMUIT models from scratch at high resolution. Interpreting the latent space of pre-trained unconditional GANs can achieve very good image translations, especially high-quality synthesized images (e.g., 1024x1024 resolution), yet few works explore building an MMUIT system on top of such pre-trained GANs. In this thesis, we focus on these vital issues and propose several techniques for building better MMUIT systems.
First, we build on the content-style disentangled framework and propose to fit the style latent space with Gaussian Mixture Models (GMMs). This allows a well-trained network with a shared disentangled style latent space to model multi-domain translations; we can randomly sample different style representations from a Gaussian component or use a reference image for style transfer. Second, we show how the GMM-modeled style latent space can be combined with a language model (e.g., a simple LSTM network) to manipulate multiple styles using textual commands. Then, we not only propose easy-to-use constraints to improve the smoothness of the style latent space in MMUIT models, but also design a novel metric to quantitatively evaluate that smoothness. Finally, we build a new model that uses pre-trained unconditional GANs to perform MMUIT tasks.
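The GMM-based style sampling described above can be sketched with numpy: each domain corresponds to one Gaussian component of a shared style space, and a style code is drawn by selecting a component and sampling from it. The domain names, dimensionality, and component parameters below are purely illustrative, not values from the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative GMM over a shared 2-D style space: one component per domain.
means = {"summer": np.array([2.0, 0.0]), "winter": np.array([-2.0, 0.0])}
stds  = {"summer": 0.5, "winter": 0.5}

def sample_style(domain, n=1):
    """Draw n style codes from the Gaussian component of the given domain."""
    mu, sigma = means[domain], stds[domain]
    return mu + sigma * rng.standard_normal((n, mu.shape[0]))

codes = sample_style("summer", n=1000)
# The empirical mean of the samples should be close to the component mean.
```

A reference-based style transfer would instead encode a real image into this space; either way, the decoder consumes the same kind of style code, which is what lets one shared space serve all domains.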
10

Image-to-Image Translation for Improvement of Synthetic Thermal Infrared Training Data Using Generative Adversarial Networks

Hamrell, Hanna January 2021 (has links)
Training data is an essential ingredient in supervised learning, yet it is time-consuming, expensive, and for some applications impossible to retrieve. It is therefore of interest to use synthetic training data. However, the domain shift of synthetic data makes it challenging to obtain good results when it is used as training data for deep learning models, so it is of interest to refine synthetic data, e.g. using image-to-image translation, to improve results. The aim of this work is to compare different GAN-based methods for image-to-image translation of synthetic thermal IR training images. Translation is done both using synthetic thermal IR-images alone and including pixelwise depth and/or semantic information. For evaluation, a new measure based on the Fréchet Inception Distance, adapted to thermal IR-images, is proposed. The results show that the model trained using IR-images alone translates the generated images closest to the domain of authentic thermal IR-images. The training where IR-images are complemented by corresponding pixelwise depth data performs second best; however, given more training time, the inclusion of depth data has the potential to outperform training with IR data alone. This gives valuable insight into how best to translate images from the domain of synthetic IR-images to that of authentic IR-images, which is vital for quick and low-cost generation of training data for deep learning models.
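The Fréchet Inception Distance that the proposed measure adapts compares two Gaussians fitted to feature activations of real and generated images. Assuming diagonal covariances (a simplification; the full metric requires a matrix square root of the covariance product), the closed form reduces to a few lines of numpy:

```python
import numpy as np

def frechet_distance_diag(mu1, var1, mu2, var2):
    """Fréchet distance between two Gaussians with diagonal covariances:
    ||mu1 - mu2||^2 + sum(var1 + var2 - 2*sqrt(var1*var2))."""
    mu1, var1, mu2, var2 = map(np.asarray, (mu1, var1, mu2, var2))
    return np.sum((mu1 - mu2) ** 2) + np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2))

# Identical distributions have distance 0; shifting the mean adds ||shift||^2.
d_same = frechet_distance_diag([0, 0], [1, 1], [0, 0], [1, 1])
d_shift = frechet_distance_diag([0, 0], [1, 1], [3, 4], [1, 1])   # 9 + 16 = 25
```

Adapting the metric to thermal IR-images presumably amounts to replacing the RGB-trained Inception feature extractor; the statistics-comparison step stays the same.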
