Global ETD Search

1	Implicit shape representation for 2D/3D tracking and reconstruction Ren, Yuheng January 2014 This thesis develops and describes methods for real-time tracking, segmentation and 3-dimensional (3D) model acquisition, in the context of developing games for stroke patients who are rehabilitating at home. Real-time tracking and reconstruction of a stroke patient's feet, hands and the control objects that they are touching not only enables the graphical visualization of the virtual avatar in the rehabilitation games, but also permits measurement of the patient's performance. Depth or combined colour and depth imagery from a Kinect sensor is used as input data. The 3D signed distance function (SDF) is used as the implicit shape representation, and a series of probabilistic graphical models are developed for the problems of model-based 3D tracking, simultaneous 3D tracking and reconstruction, and 3D tracking of multiple objects with identical appearance. The work is based on the assumption that the observed imagery is generated jointly by the pose(s) and the shape(s). The depth of each pixel is randomly and independently sampled from the likelihood of the pose(s) and the shape(s). The pose tracking and 3D shape reconstruction problems are then cast as the maximum likelihood (ML) or maximum a posteriori (MAP) estimate of the pose(s) or 3D shape. This methodology first leads to a novel probabilistic model for tracking rigid 3D objects with only depth data. For a known 3D shape, optimization aims to find the optimal pose that back-projects all object region pixels onto the zero level set of the 3D shape, thus effectively maximising the likelihood of the pose. The method is extended to consider colour information for more robust tracking in the presence of outliers and occlusions. Initialised with a coarse 3D model, the extended method is also able to simultaneously reconstruct and track an unknown 3D object in real time. 
Finally, the concept of 'shape union' is introduced to solve the problem of tracking multiple 3D objects with identical appearance. This is formulated as the minimum value of all SDFs in camera coordinates, which (i) leads to a per-pixel soft membership weight for each object, thus providing an elegant solution for the data association in multi-target tracking, and (ii) allows probabilistic physical constraints that avoid collisions between objects to be naturally enforced. The thesis also explores the possibility of using implicit shape representation for online shape learning. We use the harmonics of the 2D discrete cosine transform (DCT) to represent 2D shapes. The low-frequency harmonics, which carry the coarse shape, are decoupled from the high-frequency harmonics, which carry the details. A regression model is learnt online to model the relationship between the high- and low-frequency harmonics using Locally Weighted Projection Regression (LWPR). We demonstrate that the learned regression model is able to detect occlusions and recover the complete shape. 629.8
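The 'shape union' idea in the abstract above (the minimum over all object SDFs) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the objects are spheres, the names `sphere_sdf`, `union_sdf` and `soft_membership` are hypothetical, and the softmax weighting is only one plausible way to obtain a soft membership weight, not the formula used in the thesis.

```python
import math

def sphere_sdf(point, centre, radius):
    """Signed distance to a sphere: negative inside, zero on the surface."""
    return math.dist(point, centre) - radius

def union_sdf(point, spheres):
    """Shape union: the minimum of the individual SDF values at this point."""
    return min(sphere_sdf(point, c, r) for c, r in spheres)

def soft_membership(point, spheres, sharpness=10.0):
    """Illustrative soft assignment (an assumption, not the thesis's formula):
    a softmax over negated SDF values, so nearer objects get larger weights."""
    dists = [sphere_sdf(point, c, r) for c, r in spheres]
    lowest = min(dists)  # subtract the minimum for numerical stability
    scores = [math.exp(-sharpness * (d - lowest)) for d in dists]
    total = sum(scores)
    return [s / total for s in scores]

if __name__ == "__main__":
    # Two identical unit spheres, 3 units apart along the x axis.
    spheres = [((0.0, 0.0, 0.0), 1.0), ((3.0, 0.0, 0.0), 1.0)]
    p = (1.0, 0.0, 0.0)  # lies on the surface of the first sphere
    print(union_sdf(p, spheres))        # 0.0
    print(soft_membership(p, spheres))  # first weight dominates
```

Because the union is a pointwise minimum, each point is explained mostly by its nearest object, which is what makes a per-point association weight a natural by-product.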
2	Kinematics of cricket phonotaxis Petrou, Georgios January 2012 Male crickets produce a species-specific song to attract females, which respond by moving towards the sound source. This behaviour, termed phonotaxis, has been the subject of many morphological, neurophysiological and behavioural studies, making it one of the best-studied examples of acoustic communication in the animal kingdom. Despite this, the precise leg movements during this behaviour are unknown. This is of specific interest because a cricket’s ears are located on its front legs, meaning that the perception of the sound input might change as the insect moves. This dissertation describes a methodology and an analysis that fill this knowledge gap. I developed a semi-automated tracking system for insect motion based on commercially available high-speed video cameras and freely available software. I used it to collect detailed three-dimensional kinematic information from female crickets performing free-walking phonotaxis towards a calling song stimulus. I marked the insect’s joints with small dots of paint and recorded the movements from underneath with a pair of cameras following the insect as it walked on the transparent floor of an arena. Tracking is done offline, utilizing a kinematic model to constrain the processing. I obtained, for the first time, the positions and angles of all joints of all legs and six additional body joints, synchronised with stance-swing transitions and the sound pattern, at a 300 Hz frame rate. I then analysed this data in four categories. The single leg motion analysis revealed the importance of the thoraco-coxal (ThC) and body joints in the movement of the insect. Furthermore, the inside middle leg’s tibio-tarsal (TiTa) joint was the centre of rotation during turning. Certain joints appear to be the most crucial ones for the transition from straight walking to turning. 
The leg coordination analysis revealed the patterns followed during straight walking and turning. Furthermore, some leg combinations cannot be explained by current coordination rules. The angles relative to the active speaker revealed the deviation of the crickets as they followed a meandering course towards it. The estimation of the ears’ input, made with a simple algorithm, revealed the differences between the two sides as the insect performed phonotaxis. In general, the results reveal both similarities and differences with other cricket studies and with other insects such as cockroaches and stick insects. The work presented herein advances the current knowledge of cricket phonotactic behaviour and will be used in the further development of models of neural control of phonotaxis.
3	Détection et suivi de personnes par vision omnidirectionnelle : approche 2D et 3D / Detection and tracking of persons by omnidirectional vision : 2D and 3D approaches Boui, Marouane 14 May 2018 In this thesis, we address the problem of 3D detection and tracking of people in omnidirectional image sequences, with the aim of building applications that estimate 3D pose. This requires stable and accurate tracking of the person in a real environment. For this study, we use a catadioptric camera composed of a spherical mirror and a perspective camera. This type of sensor is commonly used in computer vision and robotics. Its main advantage is its wide field of view, which allows it to acquire a 360-degree view of the scene with a single sensor and in a single image. However, this sensor introduces significant distortions in the images, which prevent the direct application of the methods conventionally used in perspective vision. The thesis presents two tracking approaches, developed during this work, that take these distortions into account. They trace the progression of our work from person detection to the 3D estimation of the person's pose. The first step of this work consisted in setting up a person detection algorithm for omnidirectional images. We proposed to extend the conventional approach to human detection in perspective images, based on the Histogram of Oriented Gradients (HOG), to adapt it to spherical images. Our approach uses Riemannian manifolds to adapt the gradient computation to omnidirectional images, and uses the spherical gradient for spherical images, in order to generate our omnidirectional image descriptor. 
Subsequently, we focused on setting up a 3D person tracking system with omnidirectional cameras. We chose model-based 3D tracking, using a model of the person with 30 degrees of freedom, because we imposed on ourselves the constraint of using a single catadioptric camera.
Omnidirectional camera, Spherical image, 3D tracking
4	Shape knowledge for segmentation and tracking Prisacariu, Victor Adrian January 2012 The aim of this thesis is to provide methods for 2D segmentation and 2D/3D tracking that are both fast and robust to imperfect image information, as caused for example by occlusions, motion blur and cluttered background. We do this by combining high-level shape information with simultaneous segmentation and tracking. We base our work on the assumption that the space of possible 2D object shapes can be either generated by projecting down known rigid 3D shapes or learned from 2D shape examples. We maximise the discrimination between statistical foreground and background appearance models with respect to the parameters governing the shape generative process (the 6 degree-of-freedom 3D pose of the 3D shape or the parameters of the learned space). The foreground region is delineated by the zero level set of a signed distance function, and we define an energy over this region and its immediate background surroundings based on pixel-wise posterior membership probabilities. We obtain the differentials of this energy with respect to the parameters governing shape and conduct searches for the correct shape using standard non-linear minimisation techniques. This methodology first leads to a novel rigid 3D object tracker. For a known 3D shape, our optimisation here aims to find the 3D pose that leads to the 2D projection that best segments a given image. We extend our approach to track multiple objects from multiple views and propose novel enhancements at the pixel level based on temporal consistency. Finally, owing to the per-pixel nature of much of the algorithm, we support our theoretical approach with a real-time GPU-based implementation. 
We next use our rigid 3D tracker in two applications: (i) a driver assistance system, where the tracker is augmented with 2D traffic sign detections, which, unlike previous work, allows for the relevance of the traffic signs to the driver to be gauged and (ii) a robust, real time 3D hand tracker that uses data from an off-the-shelf accelerometer and articulated pose classification results from a multiclass SVM classifier. Finally, we explore deformable 2D/3D object tracking. Unlike previous works, we use a non-linear and probabilistic dimensionality reduction, called Gaussian Process Latent Variable Models, to learn spaces of shape. Segmentation becomes a minimisation of an image-driven energy function in the learned space. We can represent both 2D and 3D shapes which we compress with Fourier-based transforms, to keep inference tractable. We extend this method by learning joint shape-parameter spaces, which, novel to the literature, enable simultaneous segmentation and generic parameter recovery. These can describe anything from 3D articulated pose to eye gaze. We also propose two novel extensions to standard GP-LVM: a method to explore the multimodality in the joint space efficiently, by learning a mapping from the latent space to a space that encodes the similarity between shapes and a method for obtaining faster convergence and greater accuracy by use of a hierarchy of latent embeddings. 621.3994
5	Intelligent 3D seam tracking and adaptable weld process control for robotic TIG welding Manorathna, Prasad January 2015 Tungsten Inert Gas (TIG) welding is extensively used in aerospace applications, due to its unique ability to produce higher quality welds than other shielded arc welding types. However, most TIG welding is performed manually and has not achieved the levels of automation that other welding techniques have. This is mostly attributed to the lack of process knowledge and of adaptability to complexities, such as mismatches due to part fit-up. Recent advances in automation have enabled the use of industrial robots for complex tasks that require intelligent decision making, predominantly through sensors. Applications such as TIG welding of aerospace components require tight tolerances and need intelligent decision-making capability to accommodate any unexpected variation and to carry out welding of complex geometries. Such decision-making procedures must be based on feedback about the weld profile geometry. In this thesis, a real-time position-based closed-loop system was developed with a six-axis industrial robot (KUKA KR 16) and a laser-triangulation-based sensor (Micro-Epsilon scanCONTROL 2900-25).
6	Entwicklung einer Methode zur Identifikation dreidimensionaler Blickbewegungen in realer und virtueller Umgebung Weber, Sascha 07 July 2016 Understanding visual attention processes is of great interest not only to cognitive research. In everyday areas of life, too, the question arises of how we visually perceive our environment in different situations. Such investigations can be carried out in real scenarios and, thanks to new, innovative 3D techniques, also in virtual reality (VR) environments. Among other methods, eye-movement measurement (eye tracking) is used to study attention processes, since vision is the most important sensory modality for humans. Conventional eye-movement measurements, however, mostly refer to two-dimensional measurement planes such as a monitor, projection screen or scene video. This thesis presents a method with which three-dimensional gaze points and eye movements can be determined, using modern eye-tracking technologies, both in a real environment and in a stereoscopically projected VR environment. To this end, Study I first examined whether eye movements can be measured through the 3D glasses required for stereoscopic image separation, and to what extent this experimental setup affects the quality of the recorded eye-tracking data. In the next step, the requirements profile for a universal algorithm for computing three-dimensional gaze points was drawn up and implemented with a vector-based approach. Its distinctive feature is that the gaze vectors are computed from the eye or fovea position and binocular eye-tracking data. How accurately three-dimensional gaze points can be computed with this algorithm was then investigated in a real environment (Study II) and in a stereoscopically projected VR environment (Study III). 
Next, three-dimensional eye movements were determined from the computed 3D gaze points. For this purpose, an ellipsoidal fixation detection algorithm was designed and implemented. The dispersion-based eye-movement detection required both a temporal and a spatial parameter to identify a fixation. Since no findings yet existed for the three-dimensional case, the 3D gaze points obtained in Studies II and III were passed to the ellipsoidal fixation detection, and the resulting fixation parameters were analysed. The developed method of spatial eye-movement measurement makes it possible to determine in three dimensions gaze patterns that have so far been studied only in two dimensions, and to analyse fundamental relationships between eye movements and cognitive processes three-dimensionally in both real and virtual environments. info:eu-repo/classification/ddc/153 ddc:153
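A vector-based 3D gaze point of the kind described in entry 6 can be sketched as the point where the two eyes' gaze rays come closest. This is only an illustrative assumption about what such an approach might look like: the function name `gaze_point`, the midpoint rule and the coordinate conventions are hypothetical and are not taken from the thesis.

```python
def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gaze_point(left_eye, left_dir, right_eye, right_dir, eps=1e-12):
    """Midpoint of the shortest segment between two gaze lines.

    Each line is given by an eye position and a gaze direction (3-tuples).
    Returns None when the lines are (nearly) parallel, since no unique
    closest point exists in that case.
    """
    w = tuple(p - q for p, q in zip(left_eye, right_eye))
    a = _dot(left_dir, left_dir)
    b = _dot(left_dir, right_dir)
    c = _dot(right_dir, right_dir)
    d = _dot(left_dir, w)
    e = _dot(right_dir, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None
    s = (b * e - c * d) / denom  # parameter along the left gaze line
    t = (a * e - b * d) / denom  # parameter along the right gaze line
    p_left = tuple(o + s * v for o, v in zip(left_eye, left_dir))
    p_right = tuple(o + t * v for o, v in zip(right_eye, right_dir))
    return tuple((x + y) / 2.0 for x, y in zip(p_left, p_right))

if __name__ == "__main__":
    # Eyes 6 cm apart, both looking at a target 1 m straight ahead.
    print(gaze_point((-0.03, 0.0, 0.0), (0.03, 0.0, 1.0),
                     (0.03, 0.0, 0.0), (-0.03, 0.0, 1.0)))
```

Measured gaze rays rarely intersect exactly, which is why the sketch takes the midpoint of the shortest connecting segment rather than solving for a true intersection.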
