Urban image analysis is one of the most important problems at the intersection of computer graphics and computer vision research. Convolutional Sparse Coding (CSC), meanwhile, is a well-established image representation model that is especially suited to image restoration tasks. This dissertation addresses urban image analysis using an asset extraction framework, studies CSC for the reconstruction of both urban and general images using supervised data, and proposes a more efficient computational approach to CSC.
Our asset extraction framework uses object proposals, which are currently used to increase the computational efficiency of object detection. In this dissertation, we propose a novel adaptive pipeline that interleaves object proposals with object classification, and we use it as a formulation for asset detection. We first preprocess the images using a novel and efficient rectification technique. We then employ a particle filter approach to keep track of three priors, which guide the proposed samples and are updated using the classifier output. Tests performed on over 1000 urban images demonstrate that our rectification method is faster than existing methods without loss in quality, and that our interleaved proposal method outperforms the current state of the art. We further demonstrate that other methods can be improved by incorporating our interleaved proposals.
We also extend the applicability of the CSC model by proposing a supervised approach to the problem, which aims at learning discriminative dictionaries instead of purely reconstructive ones. We incorporate a supervised regularization term into the traditional unsupervised CSC objective to encourage the final dictionary elements to be discriminative. Experimental results show that supervised convolutional learning yields two key advantages: first, we learn more semantically relevant filters in the dictionary; second, we achieve improved image reconstruction on unseen data.
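For orientation, the traditional unsupervised CSC objective referred to above is commonly written as a least-squares reconstruction term plus an L1 sparsity penalty on the codes. The following is a minimal sketch of evaluating that standard objective for 1-D signals with NumPy. It is an illustration under stated assumptions, not the dissertation's implementation: the function name `csc_objective`, the weight `lam`, and the toy filters and codes are all hypothetical, and the supervised regularization term is omitted because the abstract does not specify its form.

```python
import numpy as np


def csc_objective(x, filters, codes, lam):
    """Evaluate 0.5 * ||x - sum_k d_k * z_k||_2^2 + lam * sum_k ||z_k||_1.

    x       : 1-D signal of length N
    filters : list of 1-D filters d_k (each shorter than x)
    codes   : list of 1-D sparse maps z_k, each of length N
    lam     : sparsity weight (hypothetical name, chosen for this sketch)
    """
    recon = np.zeros_like(x, dtype=float)
    for d_k, z_k in zip(filters, codes):
        # 'same' keeps the convolution output aligned with the signal length.
        recon += np.convolve(z_k, d_k, mode="same")
    residual = x - recon
    data_term = 0.5 * np.sum(residual ** 2)
    sparsity_term = lam * sum(np.sum(np.abs(z_k)) for z_k in codes)
    return data_term + sparsity_term


# Toy example: build a signal exactly from two filters and two sparse codes.
filters = [np.array([1.0, 2.0, 1.0]), np.array([1.0, 0.0, -1.0])]
codes = [np.zeros(8), np.zeros(8)]
codes[0][3] = 2.0
codes[1][5] = -1.0
x = sum(np.convolve(z_k, d_k, mode="same") for d_k, z_k in zip(filters, codes))

exact_value = csc_objective(x, filters, codes, lam=0.1)
zero_code_value = csc_objective(x, filters, [np.zeros(8), np.zeros(8)], lam=0.1)
```

Because the toy signal is synthesized from the codes themselves, the reconstruction term vanishes for the true codes and only the sparsity penalty remains, whereas all-zero codes pay the full reconstruction cost. A supervised variant would add a further term to this sum.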
Finally, we present two computational contributions to the state of the art in CSC. First, we significantly speed up the computation by proposing a new optimization framework that tackles the problem in the dual domain. Second, we extend the original formulation to higher dimensions in order to process a wider range of inputs, such as RGB images and videos. Our results show a speedup of up to 20 times over current state-of-the-art CSC solvers.
Identifier | oai:union.ndltd.org:kaust.edu.sa/oai:repository.kaust.edu.sa:10754/628738 |
Date | 18 September 2018 |
Creators | Affara, Lama Ahmed |
Contributors | Wonka, Peter, Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Heidrich, Wolfgang, Ghanem, Bernard, Wright, John |
Source Sets | King Abdullah University of Science and Technology |
Language | English |
Detected Language | English |
Type | Dissertation |