Global ETD Search

1	Wide Activated Separate 3D Convolution for Video Super-Resolution
Yu, Xiafei, 18 December 2019
Video super-resolution (VSR) aims to recover a realistic high-resolution (HR) frame from its corresponding center low-resolution (LR) frame and several neighbouring supporting frames. The neighbouring supporting LR frames can provide extra information to help recover the HR frame. However, these frames are not aligned with the center frame because objects move between frames. With the rapid development of neural networks, many deep-learning-based video super-resolution methods have recently been proposed. Most of these methods use motion estimation and compensation models as preprocessing to handle the spatio-temporal alignment problem. The accuracy of these motion estimation models is therefore critical for predicting the high-resolution frames. Inaccurate motion compensation leads to artifacts and blur, which in turn degrade the recovery of high-resolution frames. We propose an effective wide activated separate 3-dimensional (3D) Convolution Neural Network (CNN) for video super-resolution to overcome the drawbacks of relying on motion compensation models. Separate 3D convolution factorizes the 3D convolution into convolutions in the spatial and temporal domains, which benefits the optimization of the spatial and temporal convolution components. Our method can therefore capture the temporal and spatial information of the input frames simultaneously without an additional motion estimation and compensation model. The experimental results demonstrate the effectiveness of the proposed wide activated separate 3D CNN.
Keywords: Convolution Neural Network; Residual Network; Separate 3D Convolution Neural Network; Video Super-Resolution
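The factorization described in this abstract can be illustrated with a simple weight count. The sketch below is not taken from the thesis: it assumes the common form of the factorization, in which a k_t x k_s x k_s kernel is replaced by a 1 x k_s x k_s spatial convolution followed by a k_t x 1 x 1 temporal convolution. The function names, the channel widths (64), and the kernel sizes (3) are hypothetical values chosen for illustration, and bias terms are ignored.

```python
def conv3d_params(c_in, c_out, k_t, k_h, k_w):
    """Weight count of a 3D convolution (bias terms ignored)."""
    return c_in * c_out * k_t * k_h * k_w


def separate_conv3d_params(c_in, c_mid, c_out, k_t, k_s):
    """Weight count of a spatial (1 x k_s x k_s) convolution followed by
    a temporal (k_t x 1 x 1) convolution. c_mid is the assumed number of
    channels between the two stages."""
    spatial = conv3d_params(c_in, c_mid, 1, k_s, k_s)
    temporal = conv3d_params(c_mid, c_out, k_t, 1, 1)
    return spatial + temporal


# Hypothetical layer: 64 input channels, 64 output channels, 3x3x3 kernel.
full = conv3d_params(64, 64, 3, 3, 3)
separate = separate_conv3d_params(64, 64, 64, 3, 3)

print(f"full 3D convolution:     {full} weights")
print(f"separate 3D convolution: {separate} weights")
```

Under these assumptions the factorized layer uses fewer than half the weights of the full 3D kernel, and the spatial and temporal components become separate layers that are optimized individually. The actual layer widths in the thesis may differ.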
2	Multi-Kernel Deformable 3D Convolution for Video Super-Resolution
Dou, Tianyu, 17 September 2021
Video super-resolution (VSR) methods align and fuse consecutive low-resolution frames to generate high-resolution frames. One of the main difficulties in VSR is that video contains a variety of motions, and the accuracy of motion estimation dramatically affects the quality of video restoration. However, standard CNNs use the same receptive field throughout each layer, which makes it challenging to estimate diverse motions effectively. Neuroscience research has shown that the receptive fields of biological visual areas adjust according to the input information. Diverse receptive fields in the temporal and spatial dimensions have the potential to adapt to various motions, a property that has received little attention in most known VSR methods. In this thesis, we propose to provide adaptive receptive fields for the VSR model. First, we design a multi-kernel 3D convolution network and integrate it with a multi-kernel deformable convolution network for motion estimation and multi-frame alignment. Second, we propose a 2D multi-kernel convolution framework to improve texture restoration quality. Our experimental results show that the proposed framework outperforms state-of-the-art VSR methods.
Keywords: Attention mechanism; CNN; Deformable convolution; Separate 3D convolution; Video super-resolution
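To make the receptive-field argument in this abstract concrete, the sketch below computes the receptive field of a stack of convolution layers along one dimension using the standard recurrence (each layer adds (k - 1) times the accumulated stride). It is only an illustration: the branch kernel sizes (3, 5, 7) and the depth of three layers are hypothetical and are not the architecture used in the thesis.

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field (along one dimension) of stacked convolutions.

    kernel_sizes: kernel size of each layer, first layer first.
    strides: stride of each layer; defaults to 1 for every layer.
    """
    if strides is None:
        strides = [1] * len(kernel_sizes)
    rf, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump  # growth contributed by this layer
        jump *= s             # input-pixel spacing seen by the next layer
    return rf


# Hypothetical parallel branches, each three layers deep, differing only
# in kernel size.
branches = {3: [3, 3, 3], 5: [5, 5, 5], 7: [7, 7, 7]}
fields = {k: receptive_field(layers) for k, layers in branches.items()}

for k, rf in fields.items():
    print(f"kernel size {k}: receptive field {rf}")
```

A single-kernel network corresponds to one of these branches, so every position sees the same fixed extent. Running several kernel sizes in parallel gives the model several extents to choose from, which is the intuition behind matching receptive fields to small and large motions.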
