Analysis and Applications of Deep Learning Features on Visual Tasks

Driven by advances in hardware, deep learning (DL) has become a popular research area in recent decades. The convolutional neural network (CNN) is a central deep learning tool that has been applied to many computer vision problems. Moreover, data-driven training has unlocked the CNN's ability to learn effectively with minimal human supervision. As a result, many computer vision problems have returned to the spotlight. In this thesis, we investigate the application of deep-learning-based methods, particularly the role of deep learning features, in two representative visual tasks: image retrieval and image inpainting.

Image retrieval aims to find the images in a dataset that are similar to a query image. In the proposed image retrieval method, we use canonical correlation analysis (CCA) to explore the relationship between matching and non-matching features from a pre-trained CNN and to generate compact transformed features. The level of similarity between two images is determined by a hypothesis test on the joint distribution of transformed image feature pairs. The proposed approach is benchmarked against three popular statistical analysis methods: Linear Discriminant Analysis (LDA), Principal Component Analysis with whitening (PCAw), and Supervised Principal Component Analysis (SPCA). Our approach is shown to achieve competitive retrieval performance on the Oxford5k, Paris6k, rOxford, and rParis datasets.
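As a rough illustration of the CCA step described above, the following NumPy sketch fits CCA on paired features and scores a feature pair with a log-likelihood ratio. It is not the thesis's implementation: the function names (`fit_cca`, `pair_score`), the regularization constant, the synthetic data, and the bivariate Gaussian model behind the score are all assumptions made for this example.

```python
import numpy as np

def fit_cca(X, Y, dim, reg=1e-3):
    """Fit CCA on paired rows of X and Y (hypothetical helper, not from the thesis)."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    # Regularized covariance blocks (reg is an assumed stabilizer).
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive definite matrix.
        w, V = np.linalg.eigh(S)
        return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    A = Kx @ U[:, :dim]      # projection for the first view
    B = Ky @ Vt.T[:, :dim]   # projection for the second view
    return mx, my, A, B, s[:dim]

def pair_score(x, y, model):
    """Log-likelihood ratio of 'matching' vs 'independent' under an assumed
    bivariate Gaussian model for each pair of canonical variates."""
    mx, my, A, B, s = model
    u, v = (x - mx) @ A, (y - my) @ B
    rho = np.clip(s, 0.0, 0.999)  # keep 1 - rho^2 away from zero
    quad = (rho**2 * (u**2 + v**2) - 2 * rho * u * v) / (2 * (1 - rho**2))
    return np.sum(-0.5 * np.log(1 - rho**2) - quad, axis=-1)

# Synthetic stand-in for CNN features: Y is a noisy copy of X.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
Y = X + 0.3 * rng.normal(size=(500, 8))
model = fit_cca(X, Y, dim=4)

match_mean = pair_score(X, Y, model).mean()
nonmatch_mean = pair_score(X, rng.permutation(Y), model).mean()
```

On this synthetic data, matching pairs should score higher on average than shuffled (non-matching) pairs, which is the property a retrieval ranking relies on. The compact features here are the `dim`-dimensional projections `(x - mx) @ A` and `(y - my) @ B`.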

For image inpainting, we propose a framework that progressively reconstructs the corrupted region of an image. Specifically, we design a feature extraction network inspired by the Gaussian and Laplacian pyramids, which are commonly used to decompose an image into different frequency components. We then use a two-branch iterative inpainting network to progressively recover the corrupted region on high-frequency and low-frequency features respectively, and fuse the high-frequency and low-frequency features from each iteration. In addition, an enhancement model is introduced that uses features from neighbouring iterations to further improve the features of intermediate iterations. The proposed network is evaluated on popular image inpainting datasets such as Paris StreetView, CelebA, and Places2.
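For readers unfamiliar with the pyramids mentioned above, here is a minimal NumPy sketch of the classical Gaussian and Laplacian pyramid decomposition on a grayscale image. It illustrates only the frequency decomposition that inspired the feature extraction network, not the network itself; the helper names, the 5-tap binomial kernel, and the nearest-neighbour upsampling are assumptions chosen for brevity.

```python
import numpy as np

# 5-tap binomial kernel, a common approximation of a Gaussian (assumed choice).
KERNEL = np.array([1, 4, 6, 4, 1], dtype=np.float64) / 16.0

def blur(img):
    """Separable blur of a 2-D image with reflect padding."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="reflect")
    rows = sum(k * padded[:, i:i + w] for i, k in enumerate(KERNEL))
    return sum(k * rows[i:i + h, :] for i, k in enumerate(KERNEL))

def upsample(img, shape):
    """Nearest-neighbour 2x upsampling, cropped to `shape`, then smoothed."""
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return blur(up[:shape[0], :shape[1]])

def build_pyramids(img, levels):
    """Return (gaussian, laplacian) lists; laplacian[k] holds the detail
    lost when going from gaussian[k] to gaussian[k + 1]."""
    gaussian = [img.astype(np.float64)]
    for _ in range(levels):
        gaussian.append(blur(gaussian[-1])[::2, ::2])  # low-pass, then subsample
    laplacian = [g - upsample(nxt, g.shape)
                 for g, nxt in zip(gaussian[:-1], gaussian[1:])]
    return gaussian, laplacian

def reconstruct(laplacian, coarsest):
    """Invert the decomposition: add the detail bands back, coarse to fine."""
    img = coarsest
    for lap in reversed(laplacian):
        img = upsample(img, lap.shape) + lap
    return img

rng = np.random.default_rng(0)
image = rng.random((32, 48))
gaussian, laplacian = build_pyramids(image, levels=3)
restored = reconstruct(laplacian, gaussian[-1])
```

The Laplacian levels carry the high-frequency detail and the coarsest Gaussian level carries the low-frequency content, so the decomposition is invertible up to floating-point error. A two-branch design can then treat the two kinds of content separately.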
Extensive experiments validate the methods proposed in this thesis and demonstrate performance competitive with the state of the art. / Thesis / Doctor of Philosophy (PhD)

Identifier: oai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/28152
Date: January 2022
Creators: Shi, Kangdi
Contributors: Chen, Jun, Electrical and Computer Engineering
Source Sets: McMaster University
Language: English
Detected Language: English
Type: Thesis