Global ETD Search

1	Estimation of Global Illumination using Cycle-Consistent Adversarial Networks
Oh, Junho 20 December 2023
The field of computer graphics has made significant progress over the years, transforming from simple, pixelated images to highly realistic visuals used across various industries including entertainment, fashion, and video gaming. However, the traditional process of rendering images remains complex and time-consuming, requiring a deep understanding of geometry, materials, and textures. This thesis introduces a simpler approach through a machine learning model, specifically using Cycle-Consistent Adversarial Networks (CycleGAN), to generate realistic images and estimate global illumination in real time, significantly reducing the need for extensive expertise and time investment. Our experiments on the Blender and Portal datasets demonstrate the model's ability to efficiently generate high-quality, globally illuminated scenes, while a comparative study with the Pix2Pix model highlights our approach's strengths in preserving fine visual details. Despite these advancements, we acknowledge the limitations posed by hardware constraints and dataset diversity, pointing towards areas for future improvement and exploration. This work aims to simplify the complex world of computer graphics, making it more accessible and user-friendly, while maintaining high standards of visual realism. / Master of Science / Creating realistic images on a computer is a crucial part of making video games and movies more immersive and lifelike. Traditionally, this has been a complex and time-consuming task, requiring a deep understanding of how light interacts with objects to create shadows and highlights. This study introduces a simpler and quicker method using a type of smart computer program that learns from examples. This program, known as Cycle-Consistent Adversarial Networks (CycleGAN), is designed to understand the complex play of light in virtual scenes and recreate it in a way that makes the image look real. In testing this new method on different types of images, from simpler scenes to more complex ones, the results were impressive. The program was not only able to significantly cut down the time needed to render an image, but it also maintained the fine details that bring an image to life. While there were challenges, such as working with limited computer power and needing a wider variety of images for the program to learn from, the study shows great promise. It represents a big step forward in making the creation of high-quality, realistic computer graphics more accessible and achievable for a wider range of applications.
Global Illumination GANs CycleGAN Ray-Tracing
2	POCS Augmented CycleGAN for MR Image Reconstruction
Yang, Hanlu January 2020
Traditional Magnetic Resonance Imaging (MRI) reconstruction methods, which may be highly time-consuming and sensitive to noise, depend heavily on solving nonlinear optimization problems. By contrast, deep learning (DL)-based reconstruction methods do not need any explicit analytical data model and are robust to noise because they are trained on large datasets; together, these properties make DL a versatile tool for fast and high-fidelity MR image reconstruction. While DL can be performed completely independently of traditional methods, it can, in fact, benefit from incorporating these established methods to achieve better results. To test this hypothesis, we proposed a hybrid DL-based MR image reconstruction method, which combines two state-of-the-art deep learning networks, U-Net and Generative Adversarial Network with Cycle loss (CycleGAN), with a traditional data reconstruction method: Projection Onto Convex Sets (POCS). Experiments were then performed to evaluate the method by comparing it to several existing state-of-the-art methods. Our results demonstrate that the proposed method outperformed the current state of the art, achieving a higher peak signal-to-noise ratio (PSNR) and a higher Structural Similarity Index (SSIM). / Electrical and Computer Engineering
Electrical Engineering Compressed Sensing CycleGAN Deep Learning MR Image Reconstruction
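Entry 2 reports its results as peak signal-to-noise ratio (PSNR). As a rough illustration of what that metric measures, the sketch below computes PSNR from the mean squared error between a reference image and a reconstruction. It is not taken from the thesis: the function name `psnr`, the flat-list image representation, and the default peak value of 255 are assumptions made for the example.

```python
import math


def psnr(reference, reconstruction, max_value=255.0):
    """Return the peak signal-to-noise ratio in dB.

    Both images are given as flat lists of pixel intensities (an
    assumption for this sketch); max_value is the largest possible
    intensity, assumed here to be 255 for 8-bit images.
    """
    if len(reference) != len(reconstruction) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    # Mean squared error between the two images.
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstruction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the reconstruction deviates less from the reference, which is why the abstract treats "higher PSNR" as an improvement.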
3	[pt] ADAPTAÇÃO DE DOMINIO BASEADO EM APRENDIZADO PROFUNDO PARA DETECÇÃO DE MUDANÇAS EM FLORESTAS TROPICAIS / [en] DEEP LEARNING-BASED DOMAIN ADAPTATION FOR CHANGE DETECTION IN TROPICAL FORESTS
PEDRO JUAN SOTO VEGA 20 July 2021
[pt] Earth observation data are often affected by the domain shift phenomenon. Changes in environmental conditions, geographic variability and differing sensor properties generally make it almost impossible to apply previously trained classifiers to new data without a significant drop in classification accuracy. Domain adaptation techniques based on deep learning models have proven useful in alleviating the domain shift problem. Recent work in this area relies on adversarial training to align the features extracted from images of different domains in a common latent space. Another way to address the problem is to employ image translation techniques and adapt images from one domain to another so that the transformed images contain characteristics similar to those of images from the other domain. This work proposes domain adaptation approaches for change detection tasks, based first on an image translation technique, the Cycle-Consistent Generative Adversarial Network (CycleGAN), and second on a feature alignment model, the Domain Adversarial Neural Network (DANN). In particular, these techniques were extended by introducing additional constraints in the training phase of the CycleGAN model components, as well as an unsupervised pseudo-labeling procedure to mitigate the negative impact of class imbalance on the DANN. The proposed approaches were evaluated on a deforestation detection application, considering different regions in the Amazon forest and in the Brazilian Cerrado (savanna). In the experiments, each region corresponds to a domain, and the accuracy of a classifier trained with images and references from one domain (source) is measured on the classification of another domain (target). The results demonstrate that the proposed approaches succeeded in mitigating the domain shift problem in the context of the target application. / [en] Earth observation data are frequently affected by the domain shift phenomenon. Changes in environmental conditions, geographical variability and different sensor properties typically make it almost impossible to employ previously trained classifiers for new data without a significant drop in classification accuracy. Domain adaptation (DA) techniques based on Deep Learning models have proven useful for alleviating domain shift. Recent improvements in DA technology rely on adversarial training to align features extracted from images of the different domains in a common latent space. Another way to approach the problem is to employ image translation techniques and adapt images from one domain in such a way that the transformed images contain characteristics that are similar to the images from the other domain. In this work, two different DA approaches for change detection tasks are proposed, which are based on a particular image translation technique, the Cycle-Consistent Generative Adversarial Network (CycleGAN), and on a representation matching strategy, the Domain Adversarial Neural Network (DANN). In particular, additional constraints in the training phase of the original CycleGAN model components are proposed, as well as an unsupervised pseudo-labeling procedure, to mitigate the negative impact of class imbalance in the DANN-based approach. The proposed approaches were evaluated on a deforestation detection application, considering different sites in the Amazon rainforest and in the Brazilian Cerrado (savanna) biomes. In the experiments, each site corresponds to a domain, and the accuracy of a classifier trained with images and references from one (source) domain is measured in the classification of another (target) domain. The experimental results show that the proposed approaches are successful in alleviating the domain shift problem.
[pt] APRENDIZADO PROFUNDO [pt] CYCLEGAN [pt] ADAPTACAO DE DOMINIO [pt] DETECCAO DE MUDANCAS [pt] DETECCAO DE DESMATAMENTO [en] DEEP LEARNING [en] CYCLEGAN [en] DOMAIN ADAPTATION [en] CHANGE DETECTION [en] DEFORESTATION DETECTION
4	Cycle-GAN for removing structured foreground objects in images / Cycle-GAN för att ta bort strukturerade förgrundsobjekt i bilder
Arriaza Barriga, Romina Carolina January 2020
The TRACAB Image Tracking System is used by ChyronHego for tracking the ball and players on football fields. It requires calibration of the cameras around the arena, which is disrupted by fences and other mesh structures positioned between the camera and the field as a safety measure for the public. The purpose of this work was to implement a cycle-consistent Generative Adversarial Network (cycle-GAN) for removing the fence from the image using unpaired data. Cycle-GANs are part of the state of the art in image-to-image translation and can solve this kind of problem without the need for paired images. This makes it an exciting and powerful method and, according to the literature reviewed in this work, it has never been used for this kind of application before. The model was able to strongly attenuate, and in some cases completely remove, the net structure from images. To quantify the impact of the net removal, a homography matching was performed. It was then compared with the homography associated with the baseline of blurring the image with a Gaussian filter and with the original image without any filter. The results showed that the identification of key-points was harder on synthetic images than on the original image with or without small Gaussian filters, but performance was better than on images blurred with filters with a standard deviation of 3 pixels or more. Although the performance was not better than the baseline in all cases, the model always added new key-points and was sometimes able to find correct homographies where the baseline could not. Therefore, the cycle-GAN model proved to complement the baseline. / The TRACAB Image Tracking System is used by ChyronHego to track the ball and the players on football fields. This requires calibrating the cameras around the arena, which is disturbed by fences and other net structures placed between the camera and the field as a safety measure for the public. This thesis focuses on the implementation of a cycle-GAN for removing the net from the image using unpaired data. Cycle-GAN is a state-of-the-art image-to-image translation technique and can solve this type of problem without paired images. This makes it an exciting and powerful method, and according to the latest research it has never been used for this type of application before. The model was able to strongly attenuate, and in some cases completely remove, the net structure from images. To quantify the effects of removing the net, a homography matching was performed. It was then compared with the homography associated with the baseline, in which the image is blurred with a Gaussian filter, and with the original image without any filter. The results showed that identifying key-points was harder on synthetic images than on original images with or without small Gaussian filters, but performance was better than on images blurred with filters with a standard deviation of 3 pixels or more. Although performance was not better than the baseline in all cases, the version without the net always added new key-points, and sometimes it could find correct homographies where the baseline failed. The cycle-GAN model therefore complements the baseline.
Computer Vision Image de-fencing CycleGAN GAN Homography Unpaired data Fence removal. Datorvision avstängning av bilder CycleGAN GAN homografi oparade data borttagning av staket. Computer and Information Sciences Data- och informationsvetenskap
5	Enhancing Simulated Sonar Images With CycleGAN for Deep Learning in Autonomous Underwater Vehicles / Djupinlärning, maskininlärning, sonar, simulering, GAN, cycleGAN, YOLO-v4, gles data, osäkerhetsanalys
Norén, Aron January 2021
This thesis addresses the issue of data sparsity in the sonar domain. A data pipeline is set up to generate and enhance sonar data. The possibilities and limitations of using cycleGAN as a tool to enhance simulated sonar images for the purpose of training neural networks for detection and classification are studied. A neural network is trained on the enhanced simulated sonar images and tested on real sonar images to evaluate the quality of these images. The novelty of this work lies in extending previous methods to a more general framework and showing that GAN-enhanced simulations work for complex tasks on field data. Using real sonar images to enhance the simulated images resulted in improved classification compared to a classifier trained solely on simulated images. / This report aims to investigate the problem of sparse data for deep learning in the sonar domain. A data pipeline for generating simulated sonar data and raising its quality is set up in order to create a large dataset for training a neural network. The possibilities and limitations of using cycleGAN to raise the quality of simulated sonar data are studied and discussed. A neural network for detecting and classifying objects in sonar images is trained in order to evaluate the enhanced simulated sonar data. This report builds on previous methods by generalizing them and showing that the method has potential even for complex tasks based on non-trivial data. Training a network for classification and detection on simulated sonar images whose quality had been raised with cycleGAN markedly increased classification results compared with training on simulated images alone.
Deep Learning Machine Learning Sonar Simulation GAN cycleGAN YOLO-v4 Data Sparsity Uncertainty Estimations Djupinlärning maskininlärning sonar simulering GAN cycleGAN YOLO-v4 gles data osäkerhetsanalys Mathematics Matematik
6	Improving Image Quality in Cardiac Computed Tomography using Deep Learning / Att förbättra bildkvalitet från datortomografier av hjärtat med djupinlärning
Wajngot, David January 2019
Cardiovascular diseases are the leading cause of death globally, and early diagnosis is essential for a proper medical response. Cardiac computed tomography can be used to acquire images for their diagnosis, but without dose reduction the radiation delivered to the patient becomes a significant risk factor. Reducing the dose often compromises image quality, which makes it difficult to determine a diagnosis. This project proposes image quality enhancement with deep learning. A cycle-consistent generative adversarial neural network was fed low- and high-quality images in order to learn to translate between them. By using a cycle-consistency cost it was possible to train the network without paired data. With this method, a low-quality image acquired from a computed tomography scan with dose reduction could be enhanced in post-processing. The results were mixed but showed an increase in ventricular contrast and a mitigation of artifacts. The technique comes with several problems that are yet to be solved, such as structure alterations, but it shows promise for continued development.
deep learning neural network GAN cycleGAN CT heart imaging medical imaging Medical Engineering Medicinteknik
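Entry 6 notes that a cycle-consistency cost made it possible to train without paired data. The sketch below illustrates the idea under stated assumptions: two translators, `g` (low quality to high quality) and `f` (high quality to low quality), are penalized when translating an image to the other domain and back fails to reproduce it. The toy translators, the flat-list image representation and the use of a mean absolute (L1) difference are illustrative choices, not the thesis's implementation.

```python
def l1_distance(a, b):
    """Mean absolute difference between two equally sized flat images."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(batch_x, batch_y, g, f):
    """Sum of the forward and backward reconstruction errors.

    Forward cycle:  x -> g(x) -> f(g(x)) should return to x.
    Backward cycle: y -> f(y) -> g(f(y)) should return to y.
    The two batches need not be paired with each other.
    """
    forward = sum(l1_distance(x, f(g(x))) for x in batch_x) / len(batch_x)
    backward = sum(l1_distance(y, g(f(y))) for y in batch_y) / len(batch_y)
    return forward + backward


# Hypothetical toy translators: g brightens by 10, f_good undoes it exactly,
# f_bad ignores the domain change and returns its input unchanged.
def g(image):
    return [p + 10 for p in image]


def f_good(image):
    return [p - 10 for p in image]


def f_bad(image):
    return list(image)


low_quality = [[1, 2], [3, 4]]      # unpaired samples from domain X
high_quality = [[50, 60]]           # unpaired samples from domain Y

loss_good = cycle_consistency_loss(low_quality, high_quality, g, f_good)
loss_bad = cycle_consistency_loss(low_quality, high_quality, g, f_bad)
```

Because each image is only compared with its own round-trip reconstruction, no low-quality image ever needs a matching high-quality counterpart, which is the property the abstract relies on.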
7	Deep-Learning Conveyor Belt Anomaly Detection Using Synthetic Data and Domain Adaptation
Fridesjö, Jakob January 2024
Conveyor belts are essential components used in the mining and mineral processing industry to transport granular material and objects. However, foreign objects/anomalies transported along the conveyor belts can result in catastrophic and costly consequences. A solution to the problem is to use machine vision systems based on AI algorithms to detect anomalies before any incidents occur. However, the challenge is to obtain sufficient training data when images containing anomalous objects are, by definition, scarce. This thesis investigates how synthetic data generated by a granular simulator can be used to train a YOLOv8-based model to detect foreign objects in a real-world setting. Furthermore, the domain gap between the synthetic data domain and real-world data domain is bridged by utilizing style transfer through CycleGAN. Results show that using YOLOv8s-seg for instance segmentation of conveyors is possible even when trained on synthetic data. It is also shown that using domain adaptation by style transfer using CycleGAN can improve the performance of the synthetic model, even when the real-world data lacks anomalies.
Computer Vision Deep Learning Anomaly Detection Domain Adaptation YOLOv8 CycleGAN Computer Sciences Datavetenskap (datalogi)
8	Generative Adversarial Networks for Cross-Lingual Voice Conversion
Ankaräng, Fredrik January 2021
Speech synthesis is a technology that increasingly influences our daily lives, in the form of smart assistants, advanced translation systems and similar applications. This thesis explores voice conversion: making one's voice sound like the voice of someone else without altering the linguistic content of the speech. More specifically, a Cycle-Consistent Adversarial Network that has proven to work well in a monolingual setting is evaluated in a multilingual environment. The model is trained to convert voices between native speakers from the Nordic countries. No parallel, transcribed or aligned speech data is used in the experiments, forcing the model to focus on the raw audio signal. The goal of the thesis is to evaluate whether performance is degraded in a multilingual environment, in comparison to monolingual voice conversion, and to measure the impact of the potential performance drop. In the study, performance is measured in terms of naturalness and speaker similarity between the generated speech and the target voice. For evaluation, listening tests are conducted, as well as objective comparisons of the synthesized speech. The results show that voice conversion between a Swedish and a Norwegian speaker is possible and that it can be performed without performance degradation in comparison to Swedish-to-Swedish conversion. Conversion between Finnish and Swedish speakers, as well as between Danish and Swedish speakers, shows a performance drop for the generated speech. However, despite the performance decrease, the model produces fluent and clearly articulated converted speech in all experiments. These results are noteworthy, especially since the network is trained on less than 15 minutes of nonparallel speaker data for each speaker. This thesis opens up further areas of research, for instance investigating more languages, evaluating more recent Generative Adversarial Network architectures, and devoting more resources to tuning the hyperparameters to further optimize the model for multilingual voice conversion. / Speech synthesis is a field that increasingly influences our everyday lives, for example through smart assistants, advanced translation systems and similar applications. This thesis explores the phenomenon of voice conversion, which means making one speaker sound like someone else without changing what was said. More specifically, a Cycle-Consistent Adversarial Network that has worked well for voice conversion within a single language is examined for voice conversion between different languages. The neural network is trained to convert between the voices of different native speakers from the Nordic countries. No parallel or transcribed data is used in the experiments, which forces the model to rely only on the audio signal. The goal of the thesis is to evaluate whether the model's performance degrades in a multilingual context compared with a monolingual one, and to measure how large the degradation is in that case. In the study, performance is measured in terms of quality and speaker similarity between the generated speech and the voice being imitated. To evaluate this, listening tests are carried out, along with objective analyses of the generated speech. The results show that voice conversion between a Swedish and a Norwegian speaker is possible without degrading the model's performance, compared with conversion between Swedish speakers. For conversion between Finnish and Swedish speakers, and between Danish and Swedish speakers, the quality of the generated speech did degrade. Despite this degradation, the model produced clear and coherent speech in all experiments. This is notable because the model was trained on less than 15 minutes of non-parallel data for each speaker. This thesis opens the way for new future studies; for example, more languages could be included or newer variants of Generative Adversarial Networks evaluated. More resources could also be devoted to tuning the hyperparameters to further optimize the studied model for multilingual voice conversion.
Generative Adversarial Network CycleGAN Cross-Lingual Voice Conversion Speech Synthesis Machine Learning Computer and Information Sciences Data- och informationsvetenskap
9	Generative Adversarial Networks to enhance decision support in digital pathology
De Biase, Alessia January 2019
Histopathological evaluation and Gleason grading on Hematoxylin and Eosin (H&E) stained specimens is the clinical standard in grading prostate cancer. Recently, deep learning models have been trained to assist pathologists in detecting prostate cancer. However, these predictions could be improved further with regard to variations in morphology, staining and differences across scanners. One approach to tackling such problems is to employ conditional GANs for style transfer. A total of 52 prostatectomies from 48 patients were scanned with two different scanners. Data was split into 40 images for training and 12 images for testing, and all images were divided into overlapping 256x256 patches. A segmentation model was trained using images from scanner A, and the model was tested on images from both scanner A and B. Next, GANs were trained to perform style transfer from scanner A to scanner B. The training was performed using unpaired training images and different types of Unsupervised Image-to-Image Translation GANs (CycleGAN and UNIT). Besides the common CycleGAN architecture, a modified version was also tested, adding Kullback-Leibler (KL) divergence to the loss function. Then, the segmentation model was tested on the augmented images from scanner B. The models were evaluated on 2,000 randomly selected patches of 256x256 pixels from 10 prostatectomies. The resulting predictions were evaluated both qualitatively and quantitatively. All proposed methods improved AUC; in the best case the improvement was 16%. However, only CycleGAN trained on a large dataset proved capable of improving the performance of the segmentation tool, preserving tissue morphology and obtaining higher results in all the evaluation measurements. All the models were analyzed and, finally, the significance of the difference between the segmentation model performance on style-transferred images and on untransferred images was assessed using statistical tests.
Generative Adversarial Networks Digital Pathology CycleGAN Style Transfer Probability Theory and Statistics Sannolikhetsteori och statistik Medical Image Processing Medicinsk bildbehandling Övrig annan teknik
10	LiDAR Point Cloud De-noising for Adverse Weather
Bergius, Johan, Holmblad, Jesper January 2022
Light Detection And Ranging (LiDAR) is a hot topic today, primarily because of its great importance in autonomous vehicles. LiDAR sensors are capable of capturing and identifying objects in the 3D environment. However, a drawback of LiDAR sensors is that they perform poorly under adverse weather conditions. Noise present in LiDAR scans can be divided into random and pseudo-random noise. Random noise can be modeled and mitigated by statistical means. The same approach works on pseudo-random noise, but it is less effective; for this, Deep Neural Nets (DNN) are better suited. The main goal of this thesis is to investigate how snow can be detected in LiDAR point clouds and filtered out. The dataset used is the Winter Adverse Driving dataSet (WADS). The supervised part compares statistical filtering with segmentation-based neural networks, evaluated on recall, precision, and F1. The supervised approach is expanded by investigating an ensemble approach. The supervised results indicate that neural networks have an advantage over statistical filters, and the best result was obtained from the 3D convolution network with an F1 score of 94.58%. Our ensemble approaches improved the F1 score but did not lead to more snow being removed. We determine that an ensemble approach is a sub-optimal way of increasing the prediction performance and has the drawback of being more complex. We also investigate an unsupervised approach. The unsupervised networks are evaluated on their ability to find noisy data and correct it. Correcting the LiDAR data means predicting new values for detected noise instead of just removing it. The correctness of such predictions is evaluated manually, but with the assistance of metrics like PSNR and SSIM. None of the unsupervised networks produced an acceptable result. The reason behind this negative result is investigated and presented in our conclusion, along with a model that suffers from none of the flaws pointed out.
Semantic Segmentation Lidar point cloud CNN GAN CycleGAN Unsupervised LiOR DSOR DROR WADS Computer and Information Sciences Data- och informationsvetenskap Other Computer and Information Science Annan data- och informationsvetenskap
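Entry 10 evaluates its supervised snow filters on recall, precision and F1. The sketch below shows how these three numbers relate for a point-wise binary labeling, assuming label 1 marks a snow point and 0 any other point. The function name and the label encoding are hypothetical and are not taken from the thesis.

```python
def precision_recall_f1(true_labels, predicted_labels, positive=1):
    """Return (precision, recall, f1) for one class of a point-wise labeling.

    positive is the label treated as the class of interest; here it is
    assumed to mean "snow point".
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Precision: share of points flagged as snow that really are snow.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: share of real snow points that were flagged.
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Because F1 is a harmonic mean, it can rise through better precision alone, which is consistent with the abstract's observation that a higher F1 score did not mean more snow was removed.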
