1 |
Real-time 3D cloud animations using DCGAN
Junede, Fredrik; Asp, Samuel (January 2020)
Background. Previous studies of video generation with generative adversarial networks have shown limitations in the perceived naturalism of the generated images. A previously proposed method for rendering and simulating clouds serves as the base for this thesis.
Objectives. This thesis proposes a new machine learning-based method for generating 3D cloud animations in computer graphics. The aim is broken down into several objectives, the primary ones being the following. Using a machine learning model involves pre-processing cloud images into cloud maps, training the model, and generating 2D cloud animations with it. 3D cloud animations are achieved by integrating the model into a pre-existing real-time cloud rendering framework. The performance of the implementation is measured and evaluated. Finally, a questionnaire is deployed and its results are analysed to evaluate the effectiveness of the proposed method.
Methods. The quality of the generated images is assessed with an image quality assessment method that compares them to the data set used for training. Performance measurements are taken and compared between a base method reliant on Voronoi noise and the proposed machine learning-based method. Finally, a questionnaire is deployed and statistically analysed to evaluate the perceived naturalism of the base method and the proposed method.
Results. When run in real time, the proposed method has a rendering time almost twice as long as that of the base method. However, the questionnaire results show that the proposed method achieves a higher level of perceived naturalism in the animation.
Conclusions. The proposed method generates more natural animations than the base method, at the cost of a longer rendering time.
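The abstract does not describe how the base method computes its Voronoi noise. As a rough illustration only, the sketch below generates a small 2D Voronoi (Worley-style) noise grid in plain Python: each cell stores its distance to the nearest of a few random feature points, normalised to [0, 1]. The function name, grid size, and number of feature points are assumptions made for this example, not details from the thesis.

```python
import math
import random

def voronoi_noise(width, height, n_points=8, seed=0):
    """Return a height x width grid in [0, 1]: distance to the nearest feature point."""
    rng = random.Random(seed)
    # Hypothetical choice: scatter feature points uniformly over the grid.
    points = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n_points)]
    grid = [[min(math.hypot(x - px, y - py) for px, py in points)
             for x in range(width)]
            for y in range(height)]
    # Normalise by the largest distance so every value falls in [0, 1].
    max_d = max(max(row) for row in grid) or 1.0
    return [[d / max_d for d in row] for row in grid]

noise = voronoi_noise(32, 32)
```

A cloud renderer would typically sample such a grid as a density texture; how the thesis framework does this is not stated in the abstract.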
|
2 |
IMPROVING THE PERFORMANCE OF DCGAN ON SYNTHESIZING IMAGES WITH A DEEP NEURO-FUZZY NETWORK
Persson, Ludvig; Andersson Arvsell, William (January 2022)
Since the mid-to-late 2010s, image synthesis using neural networks has become a trending research topic, and the framework most often used for these tasks is the generative adversarial network (GAN). A GAN uses two networks, a generator and a discriminator, that train alongside and compete against each other. Today's research on image synthesis is mostly about generating or altering images, which could be used in many fields, for example to create virtual environments. The topic is, however, still at quite an early stage of its development, and there are fields where image synthesis using generative adversarial networks fails. In this work, we answer one thesis question regarding these limitations and discuss, for example, the limitation that causes GANs to get stuck during training. In addition to the limitations of existing GAN models, the research also lacks more experimental GAN variants. Many variants exist today in which GAN has been further developed and modified, but the number of existing works drops drastically when it comes to GAN models whose discriminator has been replaced by a different network. In this work, we experiment with and compare an existing deep convolutional generative adversarial network (DCGAN), which is a GAN variant, with one that we have modified using a deep neuro-fuzzy system. We have created the first DCGAN model that uses a deep neuro-fuzzy system as its discriminator. Comparing the models, we concluded that the performance differences are not large, but we strongly believe that with some further improvements our model can outperform the DCGAN model. This work therefore contributes the result and knowledge of a possible improvement to DCGAN models, which might in the future lead to similar research on other GAN models.
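To make the generator/discriminator competition concrete, here is a minimal sketch of the standard binary cross-entropy GAN losses in plain Python. It is not the thesis's model: the discriminator outputs are hypothetical numbers, and the generator loss uses the common non-saturating form, which is an assumption for this example.

```python
import math

def bce(prediction, target, eps=1e-12):
    """Binary cross-entropy for one probability and a 0/1 target."""
    p = min(max(prediction, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants D(real) close to 1 and D(fake) close to 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """The generator wants its fakes scored as real (non-saturating form)."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs: probability that each image is real.
d_real = [0.9, 0.8]
d_fake = [0.2, 0.1]
print(discriminator_loss(d_real, d_fake), generator_loss(d_fake))
```

The two losses pull in opposite directions on the fake scores: lowering them reduces the discriminator loss and raises the generator loss, which is the competition the abstract describes.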
|
3 |
GAN-Based Counterfactual Explanation on Images
Wang, Ning (January 2023)
Machine learning models are widely used in various industries. However, their black-box nature limits users' understanding of, and trust in, their inner workings, so model interpretability becomes critical. For example, when a person's loan application is rejected, the applicant may want to understand the reason for the rejection and improve their personal information to increase the chances of approval. Counterfactual explanation is a method for explaining the different outcomes of a specific event or situation. It modifies the original data to generate counterfactual instances that make the model reach a different decision. This thesis proposes a counterfactual explanation method based on generative adversarial networks (GAN) and applies it to image recognition. The counterfactual explanation aims to change the model's prediction by modifying the feature information of the input image. Traditional machine learning methods have apparent shortcomings in the computational resources they require for training, and specific bottlenecks in practical applications. This thesis builds a counterfactual explanation model based on the deep convolutional generative adversarial network (DCGAN). The random-noise input of the original DCGAN is replaced with an image; the generator produces a perturbation, which is combined with the original image to generate counterfactual samples. The experimental results show that the GAN-based counterfactual samples are better than those of the traditional machine learning model in generation efficiency and accuracy, supporting the effectiveness of the proposed method.
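The perturb-and-combine step can be sketched on a toy example. The code below assumes a hypothetical 4-pixel image and a toy linear classifier; the hand-written perturbation stands in for the generator's output, which in the thesis comes from a trained DCGAN generator.

```python
def classify(image, weights, bias=0.0):
    """Toy linear classifier: 1 if the weighted pixel sum exceeds zero, else 0."""
    score = sum(w * x for w, x in zip(weights, image)) + bias
    return 1 if score > 0 else 0

def make_counterfactual(image, perturbation):
    """Add the perturbation to the image and clip each pixel to [0, 1]."""
    return [min(1.0, max(0.0, x + d)) for x, d in zip(image, perturbation)]

# Hypothetical data: image, classifier weights, and a perturbation
# standing in for what a generator would produce.
image = [0.2, 0.1, 0.9, 0.8]
weights = [1.0, 1.0, -1.0, -1.0]
perturbation = [0.6, 0.6, -0.5, -0.5]

counterfactual = make_counterfactual(image, perturbation)
# A counterfactual is valid when it changes the classifier's prediction.
is_valid = classify(image, weights) != classify(counterfactual, weights)
```

In practice the perturbation would also be kept small, so that the counterfactual stays close to the original image; that constraint is omitted here for brevity.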
|
4 |
Exploring State-of-the-Art Machine Learning Methods for Quantifying Exercise-induced Muscle Fatigue
Afram, Abboud; Sarab Fard Sabet, Danial (January 2023)
Muscle fatigue is a serious problem for elite athletes because the resting times it requires are long and can vary. Various mechanisms can cause muscle fatigue, which signifies that the specific muscle has reached its maximum force and cannot continue the task. This thesis surveys state-of-the-art methods and systematically tests, both theoretically and practically, the applicability and performance of more recent machine learning methods on an existing EMG-to-muscle-fatigue pipeline. Several challenges exist within the EMG domain, such as inadequate data and finding the most suitable model, and they must be addressed to achieve reliable prediction. The thesis addresses them by combining and comparing various state-of-the-art methodologies: data augmentation techniques for upsampling, spectrogram methods for signal processing, and transfer learning with various pre-trained CNN models. The study comprised seven experiments, each a classification task that predicts muscle fatigue in stages. The stages are divided into seven classes, 0 to 6, with higher classes representing a more fatigued muscle. In the tabular part of the experiments, Decision Tree, Random Forest, and Support Vector Machine (SVM) classifiers were trained and their accuracy was determined. A similar approach was taken for the spectrogram part, where the signals were converted to spectrogram images and the limited dataset was enlarged with a combination of traditional and intelligent data augmentation techniques, such as noise and DCGAN. The performance of the pre-trained CNN models AlexNet, VGG16, DenseNet, and InceptionV3 was compared in predicting differences in jump heights. The results were evaluated by implementing baseline classifiers on tabular data and pre-trained CNN classifiers on CWT and STFT spectrograms, with and without data augmentation.
The evaluation of these methodologies on the classification problem showed that DenseNet and VGG16 reached an accuracy of 89.8% on CWT images with intelligent data augmentation. Intelligent data augmentation applied to CWT images allows the pre-trained CNN models to learn features that generalize to unseen data. This shows that a combination of state-of-the-art methods can address the challenges within the EMG domain.
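As an illustration of the signal-to-spectrogram step and the "noise" form of traditional augmentation, the sketch below computes an STFT magnitude spectrogram with a plain-Python DFT and adds Gaussian noise to a signal. The synthetic sine wave stands in for an EMG recording, and the window size, hop length, and noise level are assumptions for this example; the thesis's actual settings (and its CWT variant) are not given in the abstract.

```python
import cmath
import math
import random

def stft_magnitude(signal, window=16, hop=8):
    """Magnitude spectrogram: one list of window // 2 + 1 frequency bins per frame."""
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / window) for n in range(window)]
    frames = []
    for start in range(0, len(signal) - window + 1, hop):
        chunk = [signal[start + n] * hann[n] for n in range(window)]
        # Direct DFT of the windowed chunk, keeping non-negative frequencies only.
        bins = [abs(sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / window)
                        for n in range(window)))
                for k in range(window // 2 + 1)]
        frames.append(bins)
    return frames

def add_noise(signal, std=0.05, seed=0):
    """Traditional augmentation: add zero-mean Gaussian noise to each sample."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, std) for x in signal]

# Synthetic stand-in for an EMG recording: 4 cycles per 16-sample window.
signal = [math.sin(2 * math.pi * 4 * n / 16) for n in range(64)]
spec = stft_magnitude(signal)
augmented = stft_magnitude(add_noise(signal))
```

Each augmented spectrogram would be saved as an image and added to the training set; a production pipeline would normally use an FFT library rather than this direct DFT.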
|
5 |
Machine Learning for Glaucoma Assessment using Fundus Images
Díaz Pinto, Andrés Yesid (29 July 2019)
Fundus images are widely used by ophthalmologists to assess the retina and detect glaucoma, which is, according to studies from the World Health Organization (WHO), the second leading cause of blindness worldwide.
In this thesis, machine learning algorithms for automatic glaucoma assessment using fundus images are studied. First, two methods for automatic segmentation are proposed. The first method uses the Stochastic Watershed transformation to segment the optic cup and measures clinical features such as the Cup/Disc ratio and ISNT rule. The second method is a U-Net architecture focused on the optic disc and optic cup segmentation task.
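Once the optic cup and optic disc are segmented, a Cup/Disc ratio can be read off the masks. The sketch below assumes binary masks stored as nested lists and computes the vertical ratio (cup height over disc height); the abstract does not say which variant of the ratio the thesis measures, so the vertical form and the tiny masks here are assumptions.

```python
def vertical_extent(mask):
    """Number of rows spanned by the nonzero pixels of a binary mask (0 if empty)."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    return rows[-1] - rows[0] + 1 if rows else 0

def cup_disc_ratio(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio from two binary segmentation masks."""
    disc_height = vertical_extent(disc_mask)
    if disc_height == 0:
        raise ValueError("disc mask is empty")
    return vertical_extent(cup_mask) / disc_height

# Hypothetical 6x6 masks: the disc spans rows 1-4, the cup spans rows 2-3.
disc_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
cup_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
ratio = cup_disc_ratio(cup_mask, disc_mask)  # 2 rows / 4 rows = 0.5
```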
Secondly, automated glaucoma assessment systems using convolutional neural networks (CNNs) are presented. In this approach, different ImageNet-trained models are fine-tuned and used as automatic glaucoma classifiers. This technique allows glaucoma to be detected without prior segmentation or feature extraction. Moreover, it considerably improves on the performance of other state-of-the-art works.
Thirdly, given the difficulty of obtaining large amounts of glaucoma-labelled images, this thesis addresses the problem of retinal image synthesis. Two different architectures for image synthesis, the Variational Autoencoder (VAE) and the Generative Adversarial Network (GAN), were analysed. The synthetic images generated with these models were analysed qualitatively and quantitatively, reporting performance comparable to other state-of-the-art works.
Finally, a type of GAN (DCGAN) is used to build an alternative to the automatic glaucoma assessment systems presented above. A semi-supervised learning algorithm was implemented to reach this goal.
The research derived from this doctoral thesis has been supported by the Generalitat Valenciana under the Santiago Grisolía scholarship [GRISOLIA/2015/027].
Díaz Pinto, AY. (2019). Machine Learning for Glaucoma Assessment using Fundus Images [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/124351
|