281 |
Contribution à la cartographie 3D des parois internes de la vessie par cystoscopie à vision active / Contribution to the 3D mapping of internal walls of the bladder by active vision cystoscopy
Ben Hamadou, Achraf 19 September 2011
La cystoscopie est actuellement l'examen clinique de référence permettant l'exploration visuelle des parois internes de la vessie. Le cystoscope (instrument utilisé pour cet examen) permet d'acquérir une séquence vidéo des parois épithéliales de la vessie. Cependant, chaque image de la séquence vidéo ne visualise qu'une surface réduite de quelques centimètres carrés de la paroi. Les travaux réalisés dans le cadre de cette thèse ont pour objectif de construire une carte 3D reproduisant d'une manière fidèle les formes et les textures des parois internes de la vessie. Une telle représentation de l'intérieur de la vessie permettrait d'améliorer l'interprétation des données acquises lors d'un examen cystoscopique. Pour atteindre cet objectif, un nouvel algorithme flexible est proposé pour le calibrage de systèmes cystoscopiques à vision active. Cet algorithme fournit les paramètres nécessaires à la reconstruction précise de points 3D sur la portion de surface imagée à chaque instant donné de la séquence vidéo cystoscopique. Ainsi, pour chaque acquisition de la séquence vidéo, un ensemble de quelques points 3D/2D et une image 2D est disponible. L'objectif du deuxième algorithme proposé dans cette thèse est de ramener l'ensemble des données obtenues pour une séquence dans un repère global pour générer un nuage de points 3D et une image panoramique 2D représentant respectivement la forme 3D et la texture de la totalité de la paroi imagée dans la séquence vidéo. Cette méthode de cartographie 3D permet l'estimation simultanée des transformations 3D rigides et 2D perspectives liant respectivement les positions du cystoscope et les images de paires d'acquisitions consécutives. Les résultats obtenus sur des fantômes réalistes de vessie montrent que ces algorithmes permettent de calculer des surfaces 3D reproduisant les formes à retrouver / Cystoscopy is currently the reference clinical examination for visual exploration of the inner walls of the bladder. 
A cystoscope (the instrument used in this examination) acquires a video sequence of the bladder epithelium. However, each frame of the video shows only a small area of a few square centimeters. This work aims to build 3D maps representing the 3D shape and the texture of the inner walls of the bladder. Such maps should improve and facilitate the interpretation of cystoscopic data. To this end, a new flexible algorithm is proposed for the calibration of cystoscopic active vision systems. This algorithm provides the parameters required for accurate reconstruction of 3D points on the portion of the surface imaged at each moment of the cystoscopic video. Thus, the data available for each acquisition are a small set of 3D points (with their corresponding 2D projections) and a 2D image. The second algorithm described in this work places all the data obtained for a sequence in a global coordinate system to generate a 3D point cloud and a 2D panoramic image representing, respectively, the 3D shape and the texture of the bladder wall imaged in the video. This 3D mapping method simultaneously estimates the 3D rigid transformations linking the cystoscope positions and the 2D perspective transformations linking the images of consecutive acquisitions. The results obtained on realistic bladder phantoms show that the proposed method generates 3D surfaces that recover the ground-truth shapes.
|
282 |
Craniogenetische Untersuchungen an Macroscelidea (Butler, 1956) (Mammalia: Afrotheria)
Ihlau, Jan 29 June 2011
An Hand einer Reihe Ontogenesestadien konnte die Entwicklung ausgewählter Bereiche des Schädels von Macroscelides proboscideus dargestellt werden. Die Befunde der Craniogenese vom Macroscelides proboscideus wurden mit Elephantulus spec., anderen Macroscelidea und Orycteropus afer verglichen. In der vorliegenden Arbeit wurden bisher nicht beschriebene Synapomorpien der Macroscelidea dargestellt, die einen Beitrag zur Erstellung des Grundmusters dieser Gruppe leisten. Es konnten folgende Befunde erhoben werden: Bei Macroscelides proboscideus wurde ein bisher nur bei Elephantulus brachyrhynchus und Elephantulus myurus beschriebener Knorpel gefunden: diese Cartilago lateralis dient bei Macroscelides proboscideus als rostraler Ursprung des Musculus maxillo-labialis ventralis. Wird dieser Befund für weitere Vertreter der Sengis bestätigt, könnte es sich um eine Autapomorphie der Macroscelidea handeln. Ein fehlender Processus postorbitalis des Frontale scheint für Macroscelides proboscideus abgeleitet zu sein. Das Foramen subopticum, das bei den Orycteropodidae fehlt, ist eine Apomorphie der Sengis. Die eindeutig basale Position der Macroscelidea innerhalb des Stammbaums der Afrotheria, die auf wenigen Fossilien gestützte phylogenetische Entwicklung, die bis heute nicht geklärten intrafamiliären Beziehungen innerhalb der Macroscelididae und die fehlenden cranialen Apomorphien der Afrotheria lassen viel Raum für weitergehende vergleichende Untersuchungen. Es wurden histologische Schnittserien digital fotografiert und als Grundlage für eine computergestützte 3D-Rekonstruktion verwendet, so dass ein digitales 3D-Modell eines Stadiums der Sengis für Vergleichszwecke vorliegt. / Sengis (formerly known as elephant shrews) represent a monophyletic group within Afrotheria. An ontogenetic study on cranial development was conducted by histological slicing 7 individuals of Macroscelides proboscideus and Elephantulus spec. 
By means of a computer-aided 3D model, the following findings were obtained: The Cartilago lateralis, a separate cartilage that serves as the rostral origin of the Musculus maxillo-labialis ventralis, could be described for two species of Sengis and might turn out to be a cranial autapomorphy of Macroscelidea. The Foramen subopticum, which is absent in Orycteropodidae, is present in all Sengis. The basal position of Sengis within Afrotheria and the still-missing cranial apomorphies of Afrotheria leave much room for further discussion.
|
283 |
Rastreamento de componentes conexas em vídeo 3D para obtenção de estruturas tridimensionais / Tracking of connected components from 3D video in order to obtain tridimensional structures
Pires, David da Silva 17 August 2007
Este documento apresenta uma dissertação sobre o desenvolvimento de um sistema de integração de dados para geração de estruturas tridimensionais a partir de vídeo 3D. O trabalho envolve a extensão de um sistema de vídeo 3D em tempo real proposto recentemente. Esse sistema, constituído por projetor e câmera, obtém imagens de profundidade de objetos por meio da projeção de slides com um padrão de faixas coloridas. Tal procedimento permite a obtenção, em tempo real, tanto do modelo 2,5 D dos objetos quanto da textura dos mesmos, segundo uma técnica denominada luz estruturada. Os dados são capturados a uma taxa de 30 quadros por segundo e possuem alta qualidade: resoluções de 640 x 480 pixeis para a textura e de 90 x 240 pontos (em média) para a geometria. A extensão que essa dissertação propõe visa obter o modelo tridimensional dos objetos presentes em uma cena por meio do registro dos dados (textura e geometria) dos diversos quadros amostrados. Assim, o presente trabalho é um passo intermediário de um projeto maior, no qual pretende-se fazer a reconstrução dos modelos por completo, bastando para isso apenas algumas imagens obtidas a partir de diferentes pontos de observação. Tal reconstrução deverá diminuir a incidência de pontos de oclusão (bastante comuns nos resultados originais) de modo a permitir a adaptação de todo o sistema para objetos móveis e deformáveis, uma vez que, no estado atual, o sistema é robusto apenas para objetos estáticos e rígidos. Até onde pudemos averiguar, nenhuma técnica já foi aplicada com este propósito. Este texto descreve o trabalho já desenvolvido, o qual consiste em um método para detecção, rastreamento e casamento espacial de componentes conexas presentes em um vídeo 3D. A informação de imagem do vídeo (textura) é combinada com posições tridimensionais (geometria) a fim de alinhar partes de superfícies que são vistas em quadros subseqüentes. 
Esta é uma questão chave no vídeo 3D, a qual pode ser explorada em diversas aplicações tais como compressão, integração geométrica e reconstrução de cenas, dentre outras. A abordagem que adotamos consiste na detecção de características salientes no espaço do mundo, provendo um alinhamento de geometria mais completo. O processo de registro é feito segundo a aplicação do algoritmo ICP (Iterative Closest Point), introduzido por Besl e McKay em 1992. Resultados experimentais bem sucedidos corroborando nosso método são apresentados. / This document presents an MSc thesis on the development of a data integration system that generates three-dimensional structures from 3D video. The work extends a recently proposed real-time 3D video system. This system, composed of a video camera and a projector, obtains range images of recorded objects by projecting slides with a coloured stripe pattern. This procedure captures, in real time, both the texture and the 2.5D model of the objects, using a technique called structured light. The data are acquired at 30 frames per second and are of high quality: the resolutions are 640 x 480 pixels for texture and 90 x 240 points (on average) for geometry. The extension proposed in this thesis aims to obtain the three-dimensional model of the objects present in a scene by registering the data (texture and geometry) of the various sampled frames. The current work is thus an intermediate step of a larger project that aims to achieve a complete reconstruction from only a few images obtained from different viewpoints. Such a reconstruction should reduce the incidence of occlusion points (very common in the original results) and make it possible to adapt the whole system to moving and deformable objects, since in its current state the system is robust only for static and rigid objects. To the best of our knowledge, no method has fully solved this problem.
This text describes the work developed so far, which consists of a method for the detection, tracking and spatial matching of connected components present in a 3D video. The video image information (texture) is combined with three-dimensional positions (geometry) in order to align surface portions seen in subsequent frames. This is a key step in 3D video that may be exploited in several applications such as compression, geometric integration and scene reconstruction, to name but a few. Our approach consists of detecting salient features in both image and world spaces, for further alignment of texture and geometry. The registration is performed with the ICP (Iterative Closest Point) algorithm, introduced by Besl and McKay in 1992. Successful experimental results corroborating our method are shown.
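To illustrate the registration step named in the abstract above, here is a minimal point-to-point ICP sketch in Python with NumPy. It is not the thesis's implementation: the function names (`best_rigid_transform`, `icp`), the brute-force nearest-neighbour search and the SVD-based (Kabsch) pose update are assumptions chosen for brevity, and the texture and salient-feature cues described in the abstract are not modelled.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src[i] + t is close to dst[i] (Kabsch/SVD)."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def icp(src, dst, max_iterations=30, tol=1e-12):
    """Align src (N x 3) to dst (M x 3); returns accumulated R, t and the moved points."""
    R_total = np.eye(3)
    t_total = np.zeros(3)
    moved = src.copy()
    prev_err = np.inf
    for _ in range(max_iterations):
        # Brute-force closest point in dst for every moved source point.
        d2 = ((moved[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
        nearest = d2.argmin(axis=1)
        # Best rigid motion for the current correspondences.
        R, t = best_rigid_transform(moved, dst[nearest])
        moved = moved @ R.T + t
        # Compose with the motion found so far.
        R_total = R @ R_total
        t_total = R @ t_total + t
        err = np.linalg.norm(moved - dst[nearest], axis=1).mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R_total, t_total, moved
```

The brute-force search costs O(N·M) per iteration; a spatial index such as a k-d tree would be the usual replacement for larger point sets, and ICP in general only converges to the correct pose when the initial misalignment is small.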
|
284 |
Modélisation 3D à partir d'images : contributions en reconstruction photométrique à l'aide de maillages déformables / Multi-view Shape Modeling from Images : Contributions to Photometric-based Reconstruction using Deformable Meshes
Delaunoy, Amaël 02 December 2011
Comprendre, analyser et modéliser l'environnement 3D à partir d'images provenant de caméras et d'appareils photos est l'un des défis majeurs actuels de recherche en vision par ordinateur. Cette thèse s'intéresse à plusieurs aspects géométriques et photométriques liés à la reconstruction de surface à partir de plusieurs caméras calibrées. La reconstruction 3D est vue comme un problème de rendu inverse, et vise à minimiser une fonctionnelle d'énergie afin d'optimiser un maillage triangulaire représentant la surface à reconstruire. L'énergie est définie via un modèle génératif faisant naturellement apparaître des attributs tels que la visibilité ou la photométrie. Ainsi, l'approche présentée peut indifféremment s'adapter à divers cas d'application tels que la stéréovision multi-vues, la stéréo photométrique multi-vues ou encore le “shape from shading” multi-vues. Plusieurs approches sont proposées afin de résoudre les problèmes de correspondances de l'apparence pour des scènes non Lambertiennes, dont l'apparence varie en fonction du point de vue. La segmentation, la stéréo photométrique ou encore la réciprocité d'Helmholtz sont des éléments étudiés afin de contraindre la reconstruction. L'exploitation de ces contraintes dans le cadre de reconstruction multi-vues permet de reconstruire des modèles complets 3D avec une meilleure qualité. / Understanding, analyzing and modeling the 3D world from 2D pictures and videos is probably one of the most exciting and challenging problems in computer vision. In this thesis, we address several geometric and photometric aspects of 3D surface reconstruction from multi-view calibrated images. We first formulate multi-view shape reconstruction as an inverse rendering problem. Using generative models, we cast the problem as an energy minimization that leads to the non-linear surface optimization of a deformable mesh.
Particular attention is paid to the computation of the discrete gradient flow, which leads to coherent vertex displacements. We focus in particular on models and energy functionals that depend on visibility and photometry. The same framework can then be used equally for multi-view stereo, multi-view shape from shading or multi-view photometric stereo. We then propose to exploit additional information to constrain the problem in the non-Lambertian case, where the appearance of the scene depends on the viewpoint direction. Segmentation, for instance, can be used to group surface regions sharing similar appearance or reflectance. Helmholtz reciprocity can also be applied to reconstruct the 3D shapes of objects with arbitrary reflectance properties. By taking multiple image-light pairs around an object, multi-view Helmholtz stereo can be performed. Using this constrained acquisition scenario and our deformable mesh framework, it is possible to reconstruct high-quality 3D models.
|
285 |
Méthodes d’analyse de mouvement en vision 3D : invariance aux délais temporels entre des caméras non synchronisées et flux optique par isocontours
Benrhaiem, Rania 12 1900
Cette thèse porte sur deux sujets de vision par ordinateur axés sur l’analyse de mouvement dans une scène dynamique vue par une ou plusieurs caméras. En premier lieu, nous avons travaillé sur le problème de la capture de mouvement avec des caméras non synchronisées. Ceci entraîne généralement des erreurs de mise en correspondance 2D et par la suite des erreurs de reconstruction 3D. En contraste avec les solutions matérielles déjà existantes qui essaient de minimiser voire annuler le délai temporel entre les caméras, nous avons proposé une solution qui assure une invariance aux délais. En d’autres termes, nous avons développé une méthode qui permet de trouver la bonne mise en correspondance entre les points à reconstruire indépendamment du délai temporel. En second lieu, nous nous sommes intéressés au problème du flux optique avec une approche différente des méthodes proposées dans l’état de l’art. Le flux optique est utilisé pour l’analyse de mouvement en temps réel. Il est donc important qu’il soit calculé rapidement. Généralement, les méthodes existantes de flux optique sont classées en deux principales catégories: ou bien à la fois denses et précises mais très exigeantes en calcul, ou bien rapides mais moins denses et moins précises. Nous avons proposé une alternative qui tient compte à la fois du temps de calcul et de la précision du résultat. Nous avons proposé d’utiliser les isocontours d’intensité et de les mettre en correspondance afin de retrouver le flux optique en question. Ces travaux ont amené à deux principales contributions intégrées dans les chapitres de la thèse. / In this thesis we focused on two computer vision subjects. Both of them concern motion analysis in a dynamic scene seen by one or more cameras.
The first subject concerns motion capture using unsynchronised cameras. The lack of synchronisation causes many correspondence errors and, in turn, 3D reconstruction errors. In contrast with existing hardware solutions that try to minimize the temporal delay between the cameras, we propose a software solution that is invariant to the existing temporal delay. We developed a method that finds the correct correspondence between points regardless of the temporal delay. It resolves the resulting spatial shift and finds the correct position of the shifted points.
In the second subject, we focused on the optical flow problem using a different approach from those in the state of the art. In most applications, optical flow is used for real-time motion analysis. It is therefore important that it be computed quickly. In general, existing optical flow methods fall into two main categories: either precise and dense but computationally intensive, or fast but less precise and less dense. In this work, we propose an alternative that is both fast and precise. To do this, we extract intensity isocontours and match them to find corresponding points that yield the optical flow.
By addressing these problems we made two major contributions.
|
286 |
A three-dimensional representation method for noisy point clouds based on growing self-organizing maps accelerated on GPUs
Orts-Escolano, Sergio 21 January 2014
The research described in this thesis was motivated by the need for a robust model capable of representing 3D data obtained with 3D sensors, which are inherently noisy. In addition, time constraints have to be considered, as these sensors are capable of providing a 3D data stream in real time. This thesis proposes the use of Self-Organizing Maps (SOMs) as a 3D representation model. In particular, we propose the use of the Growing Neural Gas (GNG) network, which has been successfully used for clustering, pattern recognition and topology representation of multi-dimensional data. Until now, Self-Organizing Maps have been primarily computed offline, and their application to 3D data has mainly focused on noise-free models, without considering time constraints. A hardware implementation is proposed that leverages the computing power of modern GPUs, taking advantage of the paradigm known as General-Purpose Computing on Graphics Processing Units (GPGPU). The proposed methods were applied to different problems and applications in the area of computer vision, such as the recognition and localization of objects, visual surveillance and 3D reconstruction.
|
287 |
Immersive Dynamic Scenes for Virtual Reality from a Single RGB-D Camera
Lai, Po Kong 26 September 2019
In this thesis we explore the concepts and components which can be used as individual building blocks for producing immersive virtual reality (VR) content from a single RGB-D sensor. We identify the properties of immersive VR videos and propose a system composed of a foreground/background separator, a dynamic scene re-constructor and a shape completer.
We first explore the foreground/background separator component in the context of video summarization. More specifically, we examine how to extract trajectories of moving objects from video sequences captured with a static camera. We then present a new approach for video summarization via minimization of the spatial-temporal projections of the extracted object trajectories. New evaluation criteria are also presented for video summarization. These concepts of foreground/background separation can then be applied to VR scene creation by extracting relevant objects of interest.
We present an approach for the dynamic scene re-constructor component using a single moving RGB-D sensor. By tracking the foreground objects and removing them from the input RGB-D frames, we can feed the background-only data into existing RGB-D SLAM systems. The result is a static 3D background model onto which the foreground frames are then superimposed to produce a coherent scene with dynamic moving foreground objects. We also present a specific method for extracting moving foreground objects from a moving RGB-D camera, along with an evaluation dataset with benchmarks.
Lastly, the shape completer component takes a single-view depth map of an object as input and "fills in" the occluded portions to produce a complete 3D shape. We present an approach that utilizes a new data-minimal representation, the additive depth map, which allows traditional 2D convolutional neural networks to accomplish the task. The additive depth map represents the amount of depth required to transform the input into the "back depth map" that would exist if there were a sensor exactly opposite the input. We train and benchmark our approach using existing synthetic datasets and also show that it can perform shape completion on real-world data without fine-tuning. Our experiments show that our data-minimal representation can achieve results comparable to existing state-of-the-art 3D networks while also producing higher-resolution outputs.
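The additive depth map defined in the abstract above can be made concrete with a toy example. The sketch below is only an illustration under stated assumptions: it uses an orthographic projection, expresses both front and back depth in the frame of the input sensor, and builds synthetic data from a sphere. The function names (`sphere_depth_maps`, `additive_depth_map`, `complete_back_surface`) are hypothetical, and the network that predicts the additive map in the thesis is not shown.

```python
import numpy as np


def sphere_depth_maps(size=65, radius=0.4, center_depth=2.0):
    """Toy front/back depth maps of a sphere under orthographic projection.

    Background pixels (outside the silhouette) are set to 0 in both maps.
    """
    coords = np.linspace(-0.5, 0.5, size)
    x, y = np.meshgrid(coords, coords)
    rho2 = x ** 2 + y ** 2
    mask = rho2 <= radius ** 2
    # Half-thickness of the sphere along the viewing ray at each pixel.
    half = np.sqrt(np.where(mask, radius ** 2 - rho2, 0.0))
    front = np.where(mask, center_depth - half, 0.0)
    back = np.where(mask, center_depth + half, 0.0)
    return front, back, mask


def additive_depth_map(front, back):
    """Depth that must be added to the input (front) map to reach the back map."""
    return back - front


def complete_back_surface(front, additive):
    """Recover the occluded back surface from the input map and an additive map."""
    return front + additive
```

In this toy setting the additive map is simply the object's thickness along each viewing ray, which is why it is non-negative and lives on the same 2D pixel grid as the input depth map.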
|
288 |
Cellular GPU Models to Euclidean Optimization Problems : Applications from Stereo Matching to Structured Adaptive Meshing and Traveling Salesman Problem / Modèles cellulaires GPU appliquès à des problèmes d'optimisation euclidiennes : applications à l'appariement d'images stéréo, à la génération de maillages et au voyageur de commerce
Zhang, Naiyu 02 December 2013
Le travail présenté dans ce mémoire étudie et propose des modèles de calcul parallèles de type cellulaire pour traiter différents problèmes d’optimisation NP-durs définis dans l’espace euclidien, et leur implantation sur des processeurs graphiques multi-fonction (Graphics Processing Unit; GPU). Le but est de pouvoir traiter des problèmes de grande taille tout en permettant des facteurs d’accélération substantiels à l’aide du parallélisme massif. Les champs d’application visés concernent les systèmes embarqués pour la stéréovision de même que les problèmes de transports définis dans le plan, tels que les problèmes de tournées de véhicules. La principale caractéristique du modèle cellulaire est qu’il est fondé sur une décomposition du plan en un nombre approprié de cellules, chacune comportant une part constante de la donnée, et chacune correspondant à une unité de calcul (processus). Ainsi, le nombre de processus parallèles et la taille mémoire nécessaire sont en relation linéaire avec la taille du problème d’optimisation, ce qui permet de traiter des instances de très grandes tailles.L’efficacité des modèles cellulaires proposés a été testée sur plateforme parallèle GPU sur quatre applications. La première application est un problème d’appariement d’images stéréo. Elle concerne la stéréovision couleur. L’entrée du problème est une paire d’images stéréo, et la sortie une carte de disparités représentant les profondeurs dans la scène 3D. Le but est de comparer des méthodes d’appariement local selon l’approche winner-takes-all et appliquées à des paires d’images CFA (color filter array). La deuxième application concerne la recherche d’améliorations de l’implantation GPU permettant de réaliser un calcul quasi temps-réel de l’appariement. Les troisième et quatrième applications ont trait à l’implantation cellulaire GPU des réseaux neuronaux de type carte auto-organisatrice dans le plan. 
La troisième application concerne la génération de maillages structurés appliquée aux cartes de disparité afin de produire des représentations compressées des surfaces 3D. Enfin, la quatrième application concerne le traitement d’instances de grandes tailles du problème du voyageur de commerce euclidien comportant jusqu’à 33708 villes. Pour chacune des applications, les implantations GPU permettent une accélération substantielle du calcul par rapport aux versions CPU, pour des tailles croissantes des problèmes et pour une qualité de résultat obtenue similaire ou supérieure. Le facteur d’accélération GPU par rapport à la version CPU est d’environ 20 fois plus vite pour la version GPU sur le traitement des images CFA, cependant que le temps de traitement GPU est d’environ de 0,2s pour une paire d’images de petites tailles de la base Middlebury. L’algorithme amélioré quasi temps-réel nécessite environ 0,017s pour traiter une paire d’images de petites tailles, ce qui correspond aux temps d’exécution parmi les plus rapides de la base Middlebury pour une qualité de résultat modérée. La génération de maillages structurés est évaluée sur la base Middlebury afin de déterminer les facteurs d’accélération et qualité de résultats obtenus. Le facteur d’accélération obtenu pour l’implantation parallèle des cartes auto-organisatrices appliquée au problème du voyageur de commerce et pour l’instance avec 33708 villes est de 30 pour la version parallèle. / The work presented in this PhD thesis studies and proposes cellular parallel computation models able to address different types of NP-hard optimization problems defined in Euclidean space, and their implementation on the Graphics Processing Unit (GPU) platform. The goal is both to handle large problems and to provide substantial acceleration factors through massive parallelism.
The application fields are vehicle embedded systems for stereovision as well as transportation problems in the plane, such as vehicle routing problems. The main characteristic of the cellular model is that it decomposes the plane into an appropriate number of cells, each responsible for a constant part of the input data and each corresponding to a single processing unit. Hence, the number of processing units and the required memory grow linearly with the size of the optimization problem, which makes the model able to deal with very large problems. The effectiveness of the proposed cellular models has been tested on the GPU parallel platform on four applications. The first application is a stereo-matching problem. It concerns color stereovision. The problem input is a stereo image pair, and the output is a disparity map that represents depths in the 3D scene. The goal is to implement and compare GPU/CPU winner-takes-all local dense stereo-matching methods dealing with CFA (color filter array) image pairs. The second application focuses on possible GPU improvements to reach near real-time stereo-matching computation. The third and fourth applications deal with a cellular GPU implementation of the self-organizing map neural network in the plane. The third application concerns structured mesh generation from the disparity map to produce a compressed representation of 3D surfaces. Finally, the fourth application addresses large Euclidean traveling salesman problems (TSP) with up to 33708 cities. In all applications, GPU implementations allow substantial acceleration factors over CPU versions, as the problem size increases and for similar or higher quality results. The GPU speedup factor over the CPU version is about 20 for the CFA image pairs, while the GPU computation time is about 0.2s for a small image pair from the Middlebury database.
The near real-time stereovision algorithm takes about 0.017s for a small image pair, which is among the fastest execution times in the Middlebury benchmark, with moderate result quality. The structured mesh generation is evaluated on the Middlebury data set to gauge the GPU acceleration factor and the quality obtained. The acceleration factor of the GPU parallel self-organizing map over the CPU version, on the largest TSP instance with 33708 cities, is 30.
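The winner-takes-all local stereo matching mentioned in the abstract above can be sketched on the CPU as follows. This is a simplified illustration, not the thesis's GPU method: it assumes rectified grayscale images rather than CFA pairs, uses a sum of absolute differences over a square window as the matching cost, and the names `box_sum`, `wta_disparity` and the `penalty` parameter are hypothetical.

```python
import numpy as np


def box_sum(cost, radius):
    """Sum each pixel's (2r+1) x (2r+1) neighbourhood, replicating edge values."""
    padded = np.pad(cost, radius, mode="edge")
    h, w = cost.shape
    out = np.zeros((h, w))
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += padded[dy:dy + h, dx:dx + w]
    return out


def wta_disparity(left, right, max_disp, radius=2, penalty=1.0):
    """Per-pixel disparity for the left image of a rectified pair.

    Left pixel (y, x) is compared with right pixel (y, x - d) for every
    candidate disparity d; the candidate with the lowest aggregated cost wins.
    """
    h, w = left.shape
    volume = np.empty((max_disp + 1, h, w))
    for d in range(max_disp + 1):
        # Columns with no counterpart in the right image get a fixed penalty.
        diff = np.full((h, w), penalty)
        diff[:, d:] = np.abs(left[:, d:] - right[:, :w - d])
        volume[d] = box_sum(diff, radius)
    # Winner-takes-all: pick the disparity with the minimum cost at each pixel.
    return volume.argmin(axis=0)
```

Each pixel's cost column is independent of its neighbours' decisions, which is the property that makes this family of local methods a natural fit for one-thread-per-pixel GPU implementations.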
|
289 |
Design and Calibration of a Network of RGB-D Sensors for Robotic Applications over Large Workspaces
Rizwan, Macknojia 21 March 2013
This thesis presents an approach for configuring and calibrating a network of RGB-D sensors used to guide a robotic arm to interact with objects that get rapidly modeled in 3D. The system is based on Microsoft Kinect sensors for 3D data acquisition. The work presented here also details an analysis and experimental study of the Kinect’s depth sensor capabilities and performance. The study comprises examination of the resolution, quantization error, and random distribution of depth data. In addition, the effects of color and reflectance characteristics of an object are also analyzed. The study examines two versions of Kinect sensors, one dedicated to operate with the Xbox 360 video game console and the more recent Microsoft Kinect for Windows version.
The study of the Kinect sensor is extended to the design of a rapid acquisition system dedicated to large workspaces by the linkage of multiple Kinect units to collect 3D data over a large object, such as an automotive vehicle. A customized calibration method for this large workspace is proposed which takes advantage of the rapid 3D measurement technology embedded in the Kinect sensor and provides registration accuracy between local sections of point clouds that is within the range of the depth measurements accuracy permitted by the Kinect technology. The method is developed to calibrate all Kinect units with respect to a reference Kinect. The internal calibration of the sensor in between the color and depth measurements is also performed to optimize the alignment between the modalities. The calibration of the 3D vision system is also extended to formally estimate its configuration with respect to the base of a manipulator robot, therefore allowing for seamless integration between the proposed vision platform and the kinematic control of the robot. The resulting vision-robotic system defines the comprehensive calibration of reference Kinect with the robot. The latter can then be used to interact under visual guidance with large objects, such as vehicles, that are positioned within a significantly enlarged field of view created by the network of RGB-D sensors.
The proposed design and calibration method is validated in a real-world scenario where five Kinect sensors operate collaboratively to rapidly and accurately reconstruct a 180-degree coverage of the surface shape of various types of vehicles from a set of individual acquisitions performed in a semi-controlled environment, namely an underground parking garage. The geometrical properties of the vehicles generated from the acquired 3D data are compared with the original dimensions of the vehicles.
|
290 |
Contour Based 3D Biological Image Reconstruction and Partial Retrieval
Li, Yong 28 November 2007
Image segmentation is one of the most difficult tasks in image processing. Segmentation algorithms are generally based on searching for a region where pixels share similar gray-level intensity and satisfy a set of defined criteria. However, the segmented region cannot be used directly for partial image retrieval. In this dissertation, a Contour Based Image Structure (CBIS) model is introduced. In this model, images are divided into several objects defined by their bounding contours. The bounding contour structure allows individual object extraction, and partial object matching and retrieval from a standard CBIS image structure. The CBIS model allows the representation of 3D objects by their bounding contours, which is suitable for parallel implementation, particularly because extracting contour features and matching them for 3D images requires heavy computation. This computational burden becomes worse for images with high resolution and large contour density. To this end, we designed two parallel algorithms: the Contour Parallelization Algorithm (CPA) and the Partial Retrieval Parallelization Algorithm (PRPA). Both algorithms considerably improve the performance of CBIS for both contour shape matching and partial image retrieval. To improve the effectiveness of CBIS in segmenting images with inhomogeneous backgrounds, we used the phase congruency invariant features of Fourier transform components to highlight the boundaries of objects prior to extracting their contours. The contour matching process has also been improved by constructing a fuzzy contour matching system that allows unbiased matching decisions. Further improvements have been achieved through the use of a contour-tailored Fourier descriptor that provides translation and rotation invariance. It proved suitable for general contour shape matching where translation, rotation, and scaling invariance are required.
For images that are hard to classify by object contours, such as bacterial images, we define a multi-level cosine transform to extract their texture features for image classification. The low-frequency Discrete Cosine Transform coefficients and Zernike moments derived from the images are used to train Support Vector Machines (SVMs) to generate multiple classifiers.
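The invariance properties claimed for the Fourier descriptor in the abstract above can be illustrated with the classic textbook construction. The sketch below is not the dissertation's contour-tailored descriptor: it assumes a closed contour sampled as an ordered list of (x, y) points, and the function name `fourier_descriptor` and the `n_coeffs` parameter are hypothetical.

```python
import numpy as np


def fourier_descriptor(contour_xy, n_coeffs=16):
    """Translation-, rotation- and scale-invariant descriptor of a closed contour.

    contour_xy is an (N, 2) array of ordered boundary points with N > n_coeffs.
    """
    # Treat each boundary point as a complex number x + iy.
    z = contour_xy[:, 0] + 1j * contour_xy[:, 1]
    F = np.fft.fft(z)
    # Skipping F[0] removes the centroid term (translation invariance);
    # taking magnitudes removes the phase shift caused by rotation or by a
    # different starting point along the contour.
    mags = np.abs(F[1:n_coeffs + 1])
    # Dividing by the first harmonic removes the overall scale.
    return mags / mags[0]
```

Two contours can then be compared by the distance between their descriptor vectors. This version keeps only the first few positive-frequency coefficients, so it is a coarse shape signature rather than a complete representation.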
|