One of the state-of-the-art highlights of deep learning in the past ten years is the introduction of generative adversarial networks (GANs), which had achieved great success in their ability to generate images comparable to real photos with minimum human intervention. These networks can generalise to a multitude of desired outputs, especially in image-to-image problems and image syntheses. This thesis proposes a computer graphics pipeline for 3D rendering by utilising generative adversarial networks (GANs).
This thesis is motivated by regression models and convolutional neural networks (ConvNets) such as U-Net architectures, which can be directed to generate realistic global illumination effects, by using a semi-supervised GANs model (Pix2pix) that is comprised of PatchGAN and conditional GAN which is then accompanied by a U-Net structure. Pix2pix had been chosen for this thesis for its ability for training as well as the quality of the output images. It is also different from other forms of GANs by utilising colour labels, which enables further control and consistency of the geometries that comprises the output image.
The series of experiments were carried out with laboratory created image sets, to pursue the possibility of which deep learning and generative adversarial networks can lend a hand to enhance the pipeline and speed up the 3D rendering process. First, ConvNet is applied in combination with Support Vector Machine (SVM) in order to pair 3D objects with their corresponding shadows, which can be applied in Augmenter Reality (AR) scenarios. Second, a GANs approach is presented to generate shadows for non-shadowed 3D models, which can also be beneficial in AR scenarios. Third, the possibility of generating high quality renders of image sequences from low polygon density 3D models using GANs. Finally, the possibility to enhance visual coherence of the output image sequences of GAN by utilising multi-colour labels.
The results of the adopted GANs model were able to generate realistic outputs comparable to the lab generated 3D rendered ground-truth and control group output images with plausible scores on PSNR and SSIM similarity index metrices.
Identifer | oai:union.ndltd.org:BRADFORD/oai:bradscholars.brad.ac.uk:10454/19220 |
Date | January 2020 |
Creators | Taif, Khasrouf M.M. |
Contributors | Ugail, Hassan, Mehmoud, Irfan, Yarmouk University |
Publisher | University of Bradford, Faculty of Engineering and Informatics. School of Media, Design and Technology |
Source Sets | Bradford Scholars |
Language | English |
Detected Language | English |
Type | Thesis, doctoral, PhD |
Rights | <a rel="license" href="http://creativecommons.org/licenses/by-nc-nd/3.0/"><img alt="Creative Commons License" style="border-width:0" src="http://i.creativecommons.org/l/by-nc-nd/3.0/88x31.png" /></a><br />The University of Bradford theses are licenced under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-nd/3.0/">Creative Commons Licence</a>. |
Page generated in 0.0023 seconds