  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

User Directed View Synthesis On Omap Processors

Yildiz, Mursel 01 July 2009 (has links) (PDF)
In this thesis, real-time image rendering for hand-held devices is studied according to the user's viewpoint choice, using image frames with corresponding depth maps obtained from two different cameras whose positions in the coordinate system are known. The user's viewpoint choice is restricted to the area between the right and left cameras. Occlusion handling methods for image rendering systems are explored and discussed together with frame enhancement techniques. Median filtering is studied for multicolor image frames, and post-processing methods are discussed for image enhancement at the end of the rendering algorithm. The OMAP3530 microprocessor is used as the main processor, which runs the suggested rendering algorithm with occlusion handling and frame enhancement. The proposed algorithms are implemented on the DSP and ARM cores of the OMAP3530 separately, and their performance is evaluated through experiments. Embedded Linux (Kernel-2.6.22) is run as the operating system for applications. Driver usage together with devices for the embedded Linux operating system is explored and studied. Three boards are used to realize the proposed system. The OMAP35x EVM board from Mistral Solutions is used for processor utilization, high-resolution LCD utilization, system monitoring, user interface, and communication purposes. Two daughter cards are designed for user viewpoint determination. The first daughter card handles communication with the EVM board and calculates the viewpoint according to input from the second daughter card, which carries a single-axis gyro sensor (ADIS16060). The Spartan®-3A DSP FPGA family is utilized in this system for viewpoint determination. The DSP slices embedded in the gate arrays of this FPGA family are utilized and their performance is studied. An asynchronous memory interface, an I2C bus interface, and an SPI interface are studied and implemented on the FPGA.
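The abstract mentions median filtering of multicolor frames without giving details. As a rough illustration only, the sketch below applies a small median window to each colour channel independently. The per-channel approach, the function name `median_filter_rgb`, the `radius` parameter and the border handling are all assumptions made for this example, not the thesis's implementation.

```python
from statistics import median_low


def median_filter_rgb(image, radius=1):
    """Median-filter each colour channel of an RGB image independently.

    image: list of rows, each row a list of (r, g, b) tuples.
    The window is (2 * radius + 1) pixels wide and is clipped at the
    image borders (an assumption; other border policies are possible).
    """
    height = len(image)
    width = len(image[0])
    filtered = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect every pixel inside the clipped window around (x, y).
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # median_low always returns a value that occurs in the window,
            # so no new intensities are invented for even-sized windows.
            row.append(tuple(median_low([p[c] for p in window]) for c in range(3)))
        filtered.append(row)
    return filtered
```

On a flat frame with a single outlier pixel, the outlier is replaced by the surrounding colour, which is the behaviour that makes a median filter useful for cleaning up isolated rendering artifacts.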
12

Master Texture Space: An Efficient Encoding for Projectively Mapped Objects

Guinnip, David 01 January 2005 (has links)
Projectively textured models are used in an increasingly large number of applications that dynamically combine images with a simple geometric surface in a viewpoint-dependent way. These models can provide visual fidelity while retaining the effects afforded by geometric approximation, such as shadow casting and accurate perspective distortion. However, the number of stored views can be quite large, and novel views must be synthesized during the rendering process because no single view may correctly texture the entire object surface. This work introduces the Master Texture encoding and demonstrates that the encoding increases the utility of projectively textured objects by reducing render-time operations. Encoding involves three steps: 1) all image regions that correspond to the same geometric mesh element are extracted and warped to a facet of uniform size and shape, 2) an efficient packing of these facets into a new Master Texture image is computed, and 3) the visibility of each pixel in the new Master Texture data is guaranteed using a simple algorithm to discard occluded pixels in each view. Because the encoding implicitly represents the multi-view geometry of the multiple images, a single texture mesh is sufficient to render the view-dependent model. More importantly, every Master Texture image can correctly texture the entire surface of the object, removing expensive computations such as visibility analysis from the rendering algorithm. A benefit of this encoding is the support for pixel-wise view synthesis. The utility of pixel-wise view synthesis is demonstrated with a real-time Master Texture encoded VDTM application. Pixel-wise synthesis is also demonstrated with an algorithm that distills a set of Master Texture images to a single view-independent Master Texture image.
13

Síntese de vistas em depht-image-based rendering (DIBR) / View synthesis with depth-image-based rendering (DIBR)

Oliveira, Adriano Quilião de January 2016 (has links)
This dissertation investigates solutions to the general problem of generating synthetic views from a set of images using the Depth-Image-Based Rendering approach. This approach uses a compact format for 3D image representation, composed basically of two images: a color image for the reference view and a grayscale image with the disparity information for each pixel. Solutions to this problem benefit applications such as Free Viewpoint Television. The biggest challenge is filling in regions that have no projection information from the new viewpoint, usually called holes, along with other artifacts such as cracks and ghosts that occur due to occlusions and errors in the disparity map. In this dissertation we present techniques for the removal and treatment of each of these classes of potential artifacts. The set of proposed methods shows improved results when compared with the current state of the art in synthetic view generation using the DIBR model on the Middlebury dataset, considering the SSIM and PSNR metrics.
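To make the notion of holes concrete, here is a minimal sketch of forward warping a single scanline with a per-pixel disparity map. It is not the dissertation's method. The function name `warp_row`, the integer disparities, the leftward shift direction and the rule that the larger disparity (the nearer surface) wins a conflict are all assumptions chosen for illustration.

```python
def warp_row(colors, disparities, hole=None):
    """Forward-warp one scanline of a reference view into a virtual view.

    colors: pixel values of the reference scanline.
    disparities: non-negative integer disparity per pixel (assumed convention:
    a pixel at column x lands at column x - disparity in the virtual view).
    Target pixels that receive no source pixel keep the `hole` marker.
    """
    width = len(colors)
    target = [hole] * width
    best = [-1] * width  # largest disparity written so far per target column
    for x in range(width):
        d = disparities[x]
        tx = x - d
        # Larger disparity means closer to the camera, so it occludes
        # whatever was written to this column before.
        if 0 <= tx < width and d > best[tx]:
            target[tx] = colors[x]
            best[tx] = d
    return target


def find_holes(row, hole=None):
    """Return the column indices that are still unfilled after warping."""
    return [x for x, value in enumerate(row) if value is hole]
```

For example, with `colors = ['a', 'b', 'c', 'd', 'e']` and `disparities = [0, 0, 2, 2, 0]`, the two foreground pixels move left and cover the background, and the columns they vacated remain as holes that a later filling step would have to treat.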
15

A Statistical Approach To View Synthesis

Berkowitz, Phillip 01 January 2009 (has links)
View synthesis is the challenging problem of predicting a new view or pose of an object given an exemplar view or set of views. This thesis presents a novel approach to the problem of view synthesis. The proposed method uses global features rather than local geometry to achieve an effect similar to that of the well-known view morphing method. While previous approaches to the view synthesis problem have shown impressive results, they are highly dependent on being able to solve for epipolar geometry and therefore require a very precise correspondence between reference images. In cases where this is not possible, such as noisy data, low-contrast data, or long-wave infrared data, an alternative approach is desirable. Here two problems will be considered. The proposed view synthesis method will be used to synthesize new views given a set of reference views. Additionally, the algorithm will be extended to synthesize new lighting conditions and thermal signatures. Finally, the algorithm will be applied toward enhancing the ATR problem by creating additional training data to increase the likelihood of detection and classification.
16

Novel Image Interpolation Schemes with Applications to Frame Rate Conversion and View Synthesis

Rezaee Kaviani, Hoda January 2018 (has links)
Image interpolation is the process of generating a new image utilizing a set of available images. The available images may be taken with a camera at different times, or with multiple cameras and from different viewpoints. Usually, the interpolation problem in the first scenario is called Frame Rate-Up Conversion (FRUC), and in the second, view synthesis. This thesis focuses on image interpolation and addresses both FRUC and view synthesis problems. We propose a novel FRUC method using optical flow motion estimation and a patch-based reconstruction scheme. FRUC interpolates new frames between the original frames of a video to increase the number of frames and improve motion continuity. In our approach, forward and backward motion vectors are first obtained using an optical flow algorithm, and reconstructed versions of the current and previous frames are generated by our patch-based reconstruction scheme. Using the original and reconstructed versions of the current and previous frames, two mismatch masks are obtained. Then two versions of the middle frame are generated using a patch-based scheme, with the estimated motion vectors and the current and previous frames. Finally, a middle mask, which identifies the mismatch areas of the two middle frames, is reconstructed. Using these three masks, the best candidates for interpolation are selected and fused to obtain the final middle frame. Due to the patch-based nature of our interpolation scheme, most of the holes and cracks will be filled. Although there is always a probability of having holes, the size and number of such holes are much smaller than those that would be generated using pixel-based mapping. The rare holes are filled using existing hole-filling algorithms. With fewer and smaller holes, simpler hole-filling algorithms can be applied to the image, and the overall complexity of the required post-processing decreases. View synthesis is the process of generating a new (virtual) view using available ones.
Depending on the amount of available geometric information, view synthesis techniques can be divided into three categories: Image Based Rendering (IBR), Depth Image Based Rendering (DIBR), and Model Based Rendering (MBR). We introduce an adaptive patch-based scheme for IBR. This patch-based scheme reduces the size and number of holes during reconstruction. The patch size is determined in response to edge information for better reconstruction, especially near the boundaries. In the first stage of the algorithm, disparity is obtained using optical flow estimation. Then, reconstructed versions of the left and right views are generated using our adaptive patch-based algorithm. The mismatches between each view and its reconstructed version are obtained in the mismatch detection steps. This stage results in two masks as outputs, which help with the refinement of disparities and the selection of the best patches for final synthesis. Finally, the remaining holes are filled using our simple hole-filling scheme and the refined disparities. The adaptive version still benefits from the overlapping effect of the patches for hole reduction. However, compared with our fixed-size version, it results in better reconstruction near the edges, near object boundaries, and inside highly textured areas. We also propose an adaptive patch-based scheme for DIBR. The proposed method avoids unnecessary warping, which is a computationally expensive step in DIBR. We divide nearby views into blocks and only warp the center of each block. To obtain a better reconstruction near the edges and depth discontinuities, the block size is selected adaptively. In the blending step, an approach is introduced to calculate and refine the blending weights. Many of the existing DIBR schemes warp all pixels of nearby views during interpolation, which is unnecessary.
We show that, using our adaptive patch-based scheme, it is possible to reduce the number of required warping operations without degrading the overall quality compared with existing schemes. / Thesis / Doctor of Philosophy (PhD)
17

Image Quality Assessment of 3D Synthesized Views / Évaluation de la qualité des images obtenues par synthèse de vues 3D

Tian, Shishun 22 March 2019 (has links)
Depth-Image-Based Rendering (DIBR) is a fundamental technology in several 3D-related applications, such as free-viewpoint video (FVV), Virtual Reality (VR) and Augmented Reality (AR). However, assessing the quality of DIBR-synthesized views has also brought new challenges, since this process induces new types of distortions that are inherently different from the distortions caused by video coding. This work is dedicated to better evaluating the quality of DIBR-synthesized views in immersive multimedia. In chapter 2, we propose two completely no-reference (NR) metrics. The principle of the first NR metric, NIQSV, is to use a couple of opening and closing morphological operations to detect and measure distortions such as “blurry regions” and “crumbling”. In the second NR metric, NIQSV+, we improve NIQSV by adding “black hole” and “stretching” detection. In chapter 3, we propose two full-reference metrics that handle geometric distortions by using a dis-occlusion mask and a multi-resolution block matching method. In chapter 4, we present a new DIBR-synthesized image database with its associated subjective scores. This work focuses on the distortions induced only by the different DIBR synthesis methods, which determine the quality of experience (QoE) of these DIBR-related applications. In addition, we conduct a benchmark of the state-of-the-art objective quality assessment metrics for DIBR-synthesized views on this database. Chapter 5 summarizes the contributions of this thesis and gives some directions for future work.
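The morphological opening and closing operations named in the abstract can be illustrated on a one-dimensional scanline. The sketch below is not the NIQSV metric. It only shows the two operators, with a flat window whose `radius` is an assumption chosen for this example and with all function names being illustrative.

```python
def erode(signal, radius=1):
    """Grayscale erosion: each sample becomes the minimum of its window."""
    n = len(signal)
    return [min(signal[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def dilate(signal, radius=1):
    """Grayscale dilation: each sample becomes the maximum of its window."""
    n = len(signal)
    return [max(signal[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def opening(signal, radius=1):
    """Erosion followed by dilation: removes bright details thinner than the window."""
    return dilate(erode(signal, radius), radius)


def closing(signal, radius=1):
    """Dilation followed by erosion: fills dark details thinner than the window."""
    return erode(dilate(signal, radius), radius)
```

On the scanline `[5, 5, 9, 5, 5, 0, 5, 5]`, opening suppresses the thin bright spike and closing fills the thin dark gap. Comparing an image with its opened and closed versions is therefore one way to expose thin structures, which is the general idea a morphology-based distortion detector can build on.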
18

Cartographie RGB-D dense pour la localisation visuelle temps-réel et la navigation autonome / Dense RGB-D mapping for real-time localisation and autonomous navigation

Meilland, Maxime 28 March 2012 (has links)
In an autonomous navigation context, precise localisation of the vehicle is important to ensure reliable navigation. Low-cost sensors such as GPS systems are inaccurate and inefficient in urban areas, and therefore the use of such sensors alone is not well suited to autonomous navigation. On the other hand, camera sensors provide a dense photometric measure that can be processed to obtain both localisation and mapping information. In the robotics community, this problem is well known as Simultaneous Localisation and Mapping (SLAM), and it has been studied for the last thirty years. In general, SLAM algorithms are incremental and prone to drift, so such methods may not be efficient in large-scale environments for real-time localisation. Clearly, an a-priori 3D model simplifies the localisation and navigation tasks, since it allows the structure and motion estimation problems to be decoupled. Indeed, the map can be computed beforehand during a learning phase, whilst the localisation can be handled in real time using a single camera and the pre-computed model. Classic global 3D model representations are usually inaccurate and photometrically inconsistent. Alternatively, it is proposed to use an ego-centric model that represents, as closely as possible, real sensor measurements. This representation is composed of a graph of locally accurate spherical panoramas augmented with dense depth information. These augmented panoramas make it possible to generate varying viewpoints through novel view synthesis. To localise a camera navigating locally inside the graph, we use the panoramas together with a direct registration technique. The proposed localisation method is accurate, robust to outliers, and can handle large illumination changes.
Finally, autonomous navigation in urban environments is performed using the learnt model, with only a single camera to compute localisation.
20

Multi-scale Methods for Omnidirectional Stereo with Application to Real-time Virtual Walkthroughs

Brunton, Alan P 28 November 2012 (has links)
This thesis addresses a number of problems in computer vision, image processing, and geometry processing, and presents novel solutions to these problems. The overarching theme of the techniques presented here is a multi-scale approach, leveraging mathematical tools to represent images and surfaces at different scales, and methods that can be adapted from one type of domain (e.g., the plane) to another (e.g., the sphere). The main problem addressed in this thesis is known as stereo reconstruction: reconstructing the geometry of a scene or object from two or more images of that scene. We develop novel algorithms to do this, which work for both planar and spherical images. By developing a novel way to formulate the notion of disparity for spherical images, we are able to effectively adapt our algorithms from planar to spherical images. Our stereo reconstruction algorithm is based on a novel application of distance transforms to multi-scale matching. We use matching information aggregated over multiple scales, and enforce consistency between these scales using distance transforms. We then show how multiple spherical disparity maps can be efficiently and robustly fused using visibility and other geometric constraints. We then show how the reconstructed point clouds can be used to synthesize, in real time, a realistic sequence of novel views, that is, images from points of view not captured in the input images. Along the way to this result, we address some related problems. For example, multi-scale features can be detected in spherical images by convolving those images with a filterbank, generating an overcomplete spherical wavelet representation of the image from which the multi-scale features can be extracted. Convolution of spherical images is much more efficient in the spherical harmonic domain than in the spatial domain. Thus, we develop a GPU implementation for fast spherical harmonic transforms and frequency-domain convolutions of spherical images.
This tool can also be used to detect multi-scale features on geometric surfaces. When we have a point cloud of a surface of a particular class of object, whether generated by stereo reconstruction or by some other modality, we can use statistics and machine learning to more robustly estimate the surface. If we have at our disposal a database of surfaces of a particular type of object, such as the human face, we can compute statistics over this database to constrain the possible shape a new surface of this type can take. We show how a statistical spherical wavelet shape prior can be used to efficiently and robustly reconstruct a face shape from noisy point cloud data, including stereo data.
