Global ETD Search

21	Multi-scale Methods for Omnidirectional Stereo with Application to Real-time Virtual Walkthroughs Brunton, Alan P 28 November 2012 (has links) This thesis addresses a number of problems in computer vision, image processing, and geometry processing, and presents novel solutions to these problems. The overarching theme of the techniques presented here is a multi-scale approach, leveraging mathematical tools to represent images and surfaces at different scales, and methods that can be adapted from one type of domain (eg., the plane) to another (eg., the sphere). The main problem addressed in this thesis is known as stereo reconstruction: reconstructing the geometry of a scene or object from two or more images of that scene. We develop novel algorithms to do this, which work for both planar and spherical images. By developing a novel way to formulate the notion of disparity for spherical images, we are able effectively adapt our algorithms from planar to spherical images. Our stereo reconstruction algorithm is based on a novel application of distance transforms to multi-scale matching. We use matching information aggregated over multiple scales, and enforce consistency between these scales using distance transforms. We then show how multiple spherical disparity maps can be efficiently and robustly fused using visibility and other geometric constraints. We then show how the reconstructed point clouds can be used to synthesize a realistic sequence of novel views, images from points of view not captured in the input images, in real-time. Along the way to this result, we address some related problems. For example, multi-scale features can be detected in spherical images by convolving those images with a filterbank, generating an overcomplete spherical wavelet representation of the image from which the multiscale features can be extracted. Convolution of spherical images is much more efficient in the spherical harmonic domain than in the spatial domain. Thus, we develop a GPU implementation for fast spherical harmonic transforms and frequency domain convolutions of spherical images. This tool can also be used to detect multi-scale features on geometric surfaces. When we have a point cloud of a surface of a particular class of object, whether generated by stereo reconstruction or by some other modality, we can use statistics and machine learning to more robustly estimate the surface. If we have at our disposal a database of surfaces of a particular type of object, such as the human face, we can compute statistics over this database to constrain the possible shape a new surface of this type can take. We show how a statistical spherical wavelet shape prior can be used to efficiently and robustly reconstruct a face shape from noisy point cloud data, including stereo data. multi-scale wavelets stereo reconstruction omnidirectional vision real-time novel view synthesis real-time virtual walkthroughs spherical parameterizations spherical harmonics GPU programming
22	Free View Rendering for 3D Video : Edge-Aided Rendering and Depth-Based Image Inpainting Muddala, Suryanarayana Murthy January 2015 (has links) Three Dimensional Video (3DV) has become increasingly popular with the success of 3D cinema. Moreover, emerging display technology offers an immersive experience to the viewer without the necessity of any visual aids such as 3D glasses. 3DV applications, Three Dimensional Television (3DTV) and Free Viewpoint Television (FTV) are auspicious technologies for living room environments by providing immersive experience and look around facilities. In order to provide such an experience, these technologies require a number of camera views captured from different viewpoints. However, the capture and transmission of the required number of views is not a feasible solution, and thus view rendering is employed as an efficient solution to produce the necessary number of views. Depth-image-based rendering (DIBR) is a commonly used rendering method. Although DIBR is a simple approach that can produce the desired number of views, inherent artifacts are major issues in the view rendering. Despite much effort to tackle the rendering artifacts over the years, rendered views still contain visible artifacts. This dissertation addresses three problems in order to improve 3DV quality: 1) How to improve the rendered view quality using a direct approach without dealing each artifact specifically. 2) How to handle disocclusions (a.k.a. holes) in the rendered views in a visually plausible manner using inpainting. 3) How to reduce spatial inconsistencies in the rendered view. The first problem is tackled by an edge-aided rendering method that uses a direct approach with one-dimensional interpolation, which is applicable when the virtual camera distance is small. The second problem is addressed by using a depth-based inpainting method in the virtual view, which reconstructs the missing texture with background data at the disocclusions. The third problem is undertaken by a rendering method that firstly inpaint occlusions as a layered depth image (LDI) in the original view, and then renders a spatially consistent virtual view. Objective assessments of proposed methods show improvements over the state-of-the-art rendering methods. Visual inspection shows slight improvements for intermediate views rendered from multiview videos-plus-depth, and the proposed methods outperforms other view rendering methods in the case of rendering from single view video-plus-depth. Results confirm that the proposed methods are capable of reducing rendering artifacts and producing spatially consistent virtual views. In conclusion, the view rendering methods proposed in this dissertation can support the production of high quality virtual views based on a limited number of input views. When used to create a multi-scopic presentation, the outcome of this dissertation can benefit 3DV technologies to improve the immersive experience. 3DV 3DTV FTV view rendering depth-image-based rendering hole-filling disocclusion filling inpainting texture synthesis view synthesis layered depth image
23	Multi-scale Methods for Omnidirectional Stereo with Application to Real-time Virtual Walkthroughs Brunton, Alan P 28 November 2012 (has links) This thesis addresses a number of problems in computer vision, image processing, and geometry processing, and presents novel solutions to these problems. The overarching theme of the techniques presented here is a multi-scale approach, leveraging mathematical tools to represent images and surfaces at different scales, and methods that can be adapted from one type of domain (eg., the plane) to another (eg., the sphere). The main problem addressed in this thesis is known as stereo reconstruction: reconstructing the geometry of a scene or object from two or more images of that scene. We develop novel algorithms to do this, which work for both planar and spherical images. By developing a novel way to formulate the notion of disparity for spherical images, we are able effectively adapt our algorithms from planar to spherical images. Our stereo reconstruction algorithm is based on a novel application of distance transforms to multi-scale matching. We use matching information aggregated over multiple scales, and enforce consistency between these scales using distance transforms. We then show how multiple spherical disparity maps can be efficiently and robustly fused using visibility and other geometric constraints. We then show how the reconstructed point clouds can be used to synthesize a realistic sequence of novel views, images from points of view not captured in the input images, in real-time. Along the way to this result, we address some related problems. For example, multi-scale features can be detected in spherical images by convolving those images with a filterbank, generating an overcomplete spherical wavelet representation of the image from which the multiscale features can be extracted. Convolution of spherical images is much more efficient in the spherical harmonic domain than in the spatial domain. Thus, we develop a GPU implementation for fast spherical harmonic transforms and frequency domain convolutions of spherical images. This tool can also be used to detect multi-scale features on geometric surfaces. When we have a point cloud of a surface of a particular class of object, whether generated by stereo reconstruction or by some other modality, we can use statistics and machine learning to more robustly estimate the surface. If we have at our disposal a database of surfaces of a particular type of object, such as the human face, we can compute statistics over this database to constrain the possible shape a new surface of this type can take. We show how a statistical spherical wavelet shape prior can be used to efficiently and robustly reconstruct a face shape from noisy point cloud data, including stereo data. multi-scale wavelets stereo reconstruction omnidirectional vision real-time novel view synthesis real-time virtual walkthroughs spherical parameterizations spherical harmonics GPU programming
24	Multi-scale Methods for Omnidirectional Stereo with Application to Real-time Virtual Walkthroughs Brunton, Alan P January 2012 (has links) This thesis addresses a number of problems in computer vision, image processing, and geometry processing, and presents novel solutions to these problems. The overarching theme of the techniques presented here is a multi-scale approach, leveraging mathematical tools to represent images and surfaces at different scales, and methods that can be adapted from one type of domain (eg., the plane) to another (eg., the sphere). The main problem addressed in this thesis is known as stereo reconstruction: reconstructing the geometry of a scene or object from two or more images of that scene. We develop novel algorithms to do this, which work for both planar and spherical images. By developing a novel way to formulate the notion of disparity for spherical images, we are able effectively adapt our algorithms from planar to spherical images. Our stereo reconstruction algorithm is based on a novel application of distance transforms to multi-scale matching. We use matching information aggregated over multiple scales, and enforce consistency between these scales using distance transforms. We then show how multiple spherical disparity maps can be efficiently and robustly fused using visibility and other geometric constraints. We then show how the reconstructed point clouds can be used to synthesize a realistic sequence of novel views, images from points of view not captured in the input images, in real-time. Along the way to this result, we address some related problems. For example, multi-scale features can be detected in spherical images by convolving those images with a filterbank, generating an overcomplete spherical wavelet representation of the image from which the multiscale features can be extracted. Convolution of spherical images is much more efficient in the spherical harmonic domain than in the spatial domain. Thus, we develop a GPU implementation for fast spherical harmonic transforms and frequency domain convolutions of spherical images. This tool can also be used to detect multi-scale features on geometric surfaces. When we have a point cloud of a surface of a particular class of object, whether generated by stereo reconstruction or by some other modality, we can use statistics and machine learning to more robustly estimate the surface. If we have at our disposal a database of surfaces of a particular type of object, such as the human face, we can compute statistics over this database to constrain the possible shape a new surface of this type can take. We show how a statistical spherical wavelet shape prior can be used to efficiently and robustly reconstruct a face shape from noisy point cloud data, including stereo data. multi-scale wavelets stereo reconstruction omnidirectional vision real-time novel view synthesis real-time virtual walkthroughs spherical parameterizations spherical harmonics GPU programming
25	Codage multi-vues multi-profondeur pour de nouveaux services multimédia / Multiview video plus depth coding for new multimedia services Mora, Elie-Gabriel 04 February 2014 (has links) Les travaux effectués durant cette thèse de doctorat ont pour but d’augmenter l’efficacité de codage dans 3D-HEVC. Nous proposons des approches conventionnelles orientées vers la normalisation vidéo, ainsi que des approches en rupture basées sur le flot optique. En approches conventionnelles, nous proposons une méthode qui prédit les modes Intra de profondeur avec ceux de texture. L’héritage est conditionné par un critère qui mesure le degré de similitude entre les deux modes. Ensuite, nous proposons deux méthodes pour améliorer la prédiction inter-vue du mouvement dans 3D-HEVC. La première ajoute un vecteur de disparité comme candidat inter-vue dans la liste des candidats du Merge, et la seconde modifie le processus de dérivation de ce vecteur. Finalement, un outil de codage intercomposantes est proposé, où le lien entre les arbres quaternaires de texture et de profondeur est exploité pour réduire le temps d’encodage et le débit, à travers un codage conjoint des deux arbres. Dans la catégorie des approches en rupture, nous proposons deux méthodes basées sur l’estimation de champs denses de vecteurs de mouvement en utilisant le flot optique. La première calcule un champ au niveau d’une vue de base reconstruite, puis l’extrapole au niveau d’une vue dépendante, où il est hérité par les unités de prédiction en tant que candidat dense du Merge. La deuxième méthode améliore la synthèse de vues : quatre champs sont calculés au niveau de deux vues de référence en utilisant deux références temporelles. Ils sont ensuite extrapolés au niveau d’une vue synthétisée et corrigés en utilisant une contrainte épipolaire. Les quatre prédictions correspondantes sont ensuite combinées. / This PhD. thesis deals with improving the coding efficiency in 3D-HEVC. We propose both constrained approaches aimed towards standardization, and also more innovative approaches based on optical flow. In the constrained approaches category, we first propose a method that predicts the depth Intra modes using the ones of the texture. The inheritance is driven by a criterion measuring how much the two are expected to match. Second, we propose two simple ways to improve inter-view motion prediction in 3D-HEVC. The first adds an inter-view disparity vector candidate in the Merge list and the second modifies the derivation process of this disparity vector. Third, an inter-component tool is proposed where the link between the texture and depth quadtree structures is exploited to save both runtime and bits through a joint coding of the quadtrees. In the more innovative approaches category, we propose two methods that are based on a dense motion vector field estimation using optical flow. The first computes such a field on a reconstructed base view. It is then warped at the level of a dependent view where it is inserted as a dense candidate in the Merge list of prediction units in that view. The second method improves the view synthesis process: four fields are computed at the level of the left and right reference views using a past and a future temporal reference. These are then warped at the level of the synthesized view and corrected using an epipolar constraint. The four corresponding predictions are then blended together. Both methods bring significant coding gains which confirm the potential of such innovative solutions. 3D-HEVC Synthèse de vues Flot optique Liste des candidats du Merge Vecteur de disparité Vecteur de mouvement Mode Intra 3D-HEVC View synthesis Optical flow Merge candidate list Quadtree initialization and limitation Disparity vector Motion vector Intra mode

Page generated in 0.0613 seconds