Recent years have witnessed tremendous success in using deep learning approaches to handle low-level vision problems. Most of the deep learning based methods address the low-level vision problem by training a neural network to approximate the mapping from the inputs to the desired ground truths. However, directly learning this mapping is usually difficult and cannot achieve ideal performance. Besides, under the setting of unsupervised learning, the general deep learning approach cannot be used. In this thesis, we investigate and address several problems in low-level vision using the proposed approaches.
To learn a better mapping using the existing data, an indirect domain shift mechanism is proposed to add explicit constraints inside the neural network for single image dehazing. This allows the neural network to be optimized across several identified neighbours, resulting in a better performance.
Despite the success of the proposed approaches in learning an improved mapping from the inputs to the targets, three problems of unsupervised learning is also investigated. For unsupervised monocular depth estimation, a teacher-student network is introduced to strategically integrate both supervised and unsupervised learning benefits. The teacher network is formed by learning under the binocular depth estimation setting, and the student network is constructed as the primary network for monocular depth estimation. In observing that the performance of the teacher network is far better than that of the student network, a knowledge distillation approach is proposed to help improve the mapping learned by the student.
For single image dehazing, the current network cannot handle different types of haze patterns as it is trained on a particular dataset. The problem is formulated as a multi-domain dehazing problem. To address this issue, a test-time training approach is proposed to leverage a helper network in assisting the dehazing network adapting to a particular domain using self-supervision.
In lossy compression system, the target distribution can be different from that of the source and ground truths are not available for reference.
Thus, the objective is to transform the source to target under a rate constraint, which generalizes the optimal transport. To address this problem, theoretical analyses on the trade-off between compression rate and minimal achievable distortion are studied under the cases with and without common randomness. A deep learning approach is also developed using our theoretical results for addressing super-resolution and denoising tasks.
Extensive experiments and analyses have been conducted to prove the effectiveness of the proposed deep learning based methods in handling the problems in low-level vision. / Thesis / Doctor of Philosophy (PhD)
Identifer | oai:union.ndltd.org:mcmaster.ca/oai:macsphere.mcmaster.ca:11375/27574 |
Date | January 2022 |
Creators | Liu, Huan |
Contributors | Chen, Jun, Electrical and Computer Engineering |
Source Sets | McMaster University |
Language | English |
Detected Language | English |
Type | Thesis |
Page generated in 0.0022 seconds