Handling Invalid Pixels in Convolutional Neural Networks

Most neural networks use a standard convolutional layer that assumes all input pixels are valid. However, pixels added to the input through padding introduce information that was not initially present. This extra information can be considered invalid. Invalid pixels can also lie inside the image, where they are referred to as holes in completion tasks like image inpainting. In this work, we look for a method that can handle both types of invalid pixels. We compare, on the same test bench, two methods previously used to handle invalid pixels outside the image (Partial and Edge convolutions) and one method that was designed for invalid pixels inside the image (Gated convolution). We show that Partial convolution performs best in image classification, while Gated convolution has the advantage in semantic segmentation. As for hotel recognition with masked regions, none of the methods seems appropriate for generating embeddings that leverage the masked regions.

Master of Science

A module at the heart of deep neural networks built for Artificial Intelligence is the convolutional layer. When multiple convolutional layers are used together with other modules, a Convolutional Neural Network (CNN) is obtained. These CNNs can be used for tasks such as image classification, where they tell whether the object in an image is a chair or a car, for example. Most CNNs use a standard convolutional layer that assumes all parts of the image fed to the network are valid. However, most models zero pad the image at the beginning to maintain a certain output shape. Zero padding is equivalent to adding a black frame around the image. These added pixels introduce information that was not initially present, so this extra information can be considered invalid. Invalid pixels can also lie inside the image, where they are referred to as holes in completion tasks like image inpainting, in which the network is asked to fill these holes and produce a realistic image.

In this work, we look for a method that can handle both types of invalid pixels. We compare, on the same test bench, two methods previously used to handle invalid pixels outside the image (Partial and Edge convolutions) and one method that was designed for invalid pixels inside the image (Gated convolution). We show that Partial convolution performs best in image classification, while Gated convolution has the advantage in semantic segmentation. As for hotel recognition with masked regions, none of the methods seems appropriate for generating embeddings that leverage the masked regions.
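
The effect of the "black frame" described above can be illustrated with a small NumPy sketch. It is not taken from the thesis: the function names, the 3x3 averaging filter, and the 4x4 all-ones image are assumptions chosen for illustration. The second function rescales each output by the number of valid pixels in its window, which is the general idea behind mask-based renormalization as used in partial convolution, reduced here to a fixed averaging kernel.

```python
import numpy as np


def zero_pad(image, pad):
    """Surround a 2-D array with a frame of zeros `pad` pixels wide."""
    return np.pad(image, pad, mode="constant", constant_values=0.0)


def box_filter_zero_padded(image, k=3):
    """k x k mean filter that treats the zero frame as real data."""
    pad = k // 2
    padded = zero_pad(image, pad)
    h, w = image.shape
    out = np.empty((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].mean()
    return out


def box_filter_mask_renormalized(image, k=3):
    """k x k mean filter that averages over valid pixels only.

    A mask (1 = valid, 0 = padded) is padded alongside the image, and
    each window sum is divided by the count of valid pixels in it.
    """
    pad = k // 2
    padded = zero_pad(image, pad)
    mask = zero_pad(np.ones_like(image, dtype=float), pad)
    h, w = image.shape
    out = np.empty((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            valid = mask[i:i + k, j:j + k].sum()
            out[i, j] = padded[i:i + k, j:j + k].sum() / valid
    return out


# Hypothetical test image: every pixel has the same value.
image = np.ones((4, 4))
plain = box_filter_zero_padded(image)
renorm = box_filter_mask_renormalized(image)
```

On this constant image, the zero-padded filter darkens the border (the corner window contains only 4 valid pixels out of 9), while the mask-renormalized filter leaves every pixel unchanged. A learned convolution differs from this fixed filter, so the sketch shows only the border bias and not the behavior of any method compared in the thesis.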

Identifier	oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/98619
Date	29 May 2020
Creators	Messou, Ehounoud Joseph Christopher
Contributors	Electrical and Computer Engineering, Huang, Jia-Bin, Dhillon, Harpreet Singh, Abbott, A. Lynn
Publisher	Virginia Tech
Source Sets	Virginia Tech Theses and Dissertation
Detected Language	English
Type	Thesis
Format	ETD, application/pdf
Rights	In Copyright, http://rightsstatements.org/vocab/InC/1.0/
