251 |
Volume Estimation of Airbags: A Visual Hull Approach
Anliot, Manne (January 2005)
<p>This thesis presents a complete and fully automatic method for estimating the volume of an airbag, through all stages of its inflation, with multiple synchronized high-speed cameras.</p><p>Using recorded contours of the inflating airbag, its visual hull is reconstructed with a novel method: The intersections of all back-projected contours are first identified with an accelerated epipolar algorithm. These intersections, together with additional points sampled from concave surface regions of the visual hull, are then Delaunay triangulated to a connected set of tetrahedra. Finally, the visual hull is extracted by carving away the tetrahedra that are classified as inconsistent with the contours, according to a voting procedure.</p><p>The volume of an airbag's visual hull is always larger than the airbag's real volume. By projecting a known synthetic model of the airbag into the cameras, this volume offset is computed, and an accurate estimate of the real airbag volume is extracted. </p><p>Even though volume estimates can be computed for all camera setups, the cameras should be specially posed to achieve optimal results. Such poses are uniquely found for different airbag models with a separate, fully automatic, simulated annealing algorithm.</p><p>Satisfying results are presented for both synthetic and real-world data.</p>
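The final step described above, summing the tetrahedra that survive carving, can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function names `tetra_volume` and `hull_volume` and the boolean `keep` flags (standing in for the result of the voting procedure) are assumptions made for this example.

```python
def tetra_volume(a, b, c, d):
    """Volume of a tetrahedron given its four vertices as 3-tuples."""
    # Edge vectors from vertex a.
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    # Scalar triple product u . (v x w).
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return abs(det) / 6.0


def hull_volume(tetrahedra, keep):
    """Sum the volumes of the tetrahedra that were not carved away.

    `keep[i]` is a hypothetical flag: True if tetrahedron i was judged
    consistent with the contours, False if it was carved.
    """
    return sum(tetra_volume(*t) for t, k in zip(tetrahedra, keep) if k)


# Two tetrahedra; the second one is treated as carved away.
tets = [
    ((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)),
    ((0, 0, 0), (2, 0, 0), (0, 2, 0), (0, 0, 2)),
]
volume = hull_volume(tets, [True, False])
```

The visual-hull volume obtained this way would still need the offset correction the abstract describes before it approximates the real airbag volume.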
|
252 |
Bringing the avatar to life : Studies and developments in facial communication for virtual agents and robots
Al Moubayed, Samer (January 2012)
The work presented in this thesis comes in pursuit of the ultimate goal of building spoken and embodied human-like interfaces that are able to interact with humans on human terms. Such interfaces need to employ the subtle, rich and multidimensional signals of communicative and social value that complement the stream of words – signals humans typically use when interacting with each other. The studies presented in the thesis concern facial signals used in spoken communication, and can be divided into two connected groups. The first is targeted towards exploring and verifying models of facial signals that come in synchrony with speech and its intonation. We refer to this as visual-prosody, and as part of visual-prosody, we take prominence as a case study. We show that the use of prosodically relevant gestures in animated faces results in a more expressive and human-like behaviour. We also show that animated faces supported with these gestures result in more intelligible speech, which in turn can be used to aid communication, for example in noisy environments. The other group of studies targets facial signals that complement speech. Spoken language is a relatively poor system for communicating spatial information, since such information is visual in nature. Hence, the use of visual movements of spatial value, such as gaze and head movements, is important for an efficient interaction. The use of such signals is especially important when the interaction between the human and the embodied agent is situated – that is, when they share the same physical space and this space is taken into account in the interaction. We study the perception, the modelling, and the interaction effects of gaze and head pose in regulating situated and multiparty spoken dialogues in two conditions. The first is the typical case where the animated face is displayed on flat surfaces, and the second is where it is displayed on a physical three-dimensional model of a face.
The results from the studies show that projecting the animated face onto a face-shaped mask results in an accurate perception of the direction of gaze that is generated by the avatar, and hence can allow for the use of these movements in multiparty spoken dialogue. Driven by these findings, the Furhat back-projected robot head is developed. Furhat employs state-of-the-art facial animation that is projected on a 3D printout of that face, and a neck to allow for head movements. Although the mask in Furhat is static, the fact that the animated face matches the design of the mask results in a physical face that is perceived to “move”. We present studies that show how this technique renders a more intelligible, human-like and expressive face. We further present experiments in which Furhat is used as a tool to investigate properties of facial signals in situated interaction. Furhat is built to study, implement, and verify models of situated and multiparty, multimodal Human-Machine spoken dialogue, a study that requires that the face is physically situated in the interaction environment rather than on a two-dimensional screen. It has also received much interest from several communities, and has been showcased at several venues, including a robot exhibition at the London Science Museum. We present an evaluation study of Furhat at the exhibition, where it interacted with several thousand persons in a multiparty conversation. The analysis of the data from the setup further shows that Furhat can accurately regulate multiparty interaction using gaze and head movements.
|
253 |
Approaches to Mobile Robot Localization in Indoor Environments
Jensfelt, Patric (January 2001)
|
254 |
Video See-Through Augmented Reality Application on a Mobile Computing Platform Using Position Based Visual POSE Estimation
Fischer, Daniel (22 August 2013)
A technique for real-time object tracking in a mobile computing environment and its application to video see-through Augmented Reality (AR) has been designed, verified through simulation, and implemented and validated on a mobile computing device. Using position-based visual position and orientation (POSE) methods and the Extended Kalman Filter (EKF), it is shown that this technique can flexibly track multiple objects and multiple object models using a single monocular camera on different mobile computing devices. Using the monocular camera of the mobile computing device, feature points of the object(s) are located through image processing on the display. The relative position and orientation between the device and the object(s) is determined recursively by an EKF process. Once the relative position and orientation is determined for each object, three-dimensional AR image(s) are rendered onto the display as if the device were looking at the virtual object(s) in the real world. This application and the framework presented could be used in the future to overlay additional information onto displays in mobile computing devices. Example applications include robot-aided surgery, where animations could be overlaid to assist the surgeon; training applications that could aid in the operation of equipment; and search and rescue operations, where critical information such as floor plans and directions could be virtually placed onto the display.
Current approaches in the field of real-time object tracking are discussed along with the methods used for video see-through AR applications on mobile computing devices. The mathematical framework for real-time object tracking and video see-through AR rendering is discussed in detail, along with some consideration of its extension to handling multiple AR objects. A physical implementation for a mobile computing device is proposed, detailing the algorithmic approach along with design decisions.
The proposed real-time object tracking and video see-through AR system is verified through simulation, and details of its accuracy, robustness, constraints, and an extension to multiple object tracking are presented. The system is then validated using a ground truth measurement system, and its accuracy, robustness, and limitations are reviewed. A detailed validation analysis is also presented, showing the feasibility of extending this approach to multiple objects. Finally, conclusions from this research are presented based on the findings of this work, and further areas of study are proposed.
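The recursive EKF estimation mentioned in the abstract can be illustrated with a deliberately small sketch. It is not the thesis's filter: it assumes a scalar state (the depth `z` of an object), a constant-depth motion model, and a hypothetical pinhole-style measurement `h(z) = fW / z` (apparent image width of an object whose focal length times real width is `fW`). The values of `fW`, `Q` and `R` are made up for the example.

```python
def ekf_depth_step(z, P, meas, fW=100.0, Q=1e-3, R=1.0):
    """One EKF predict/update cycle for a scalar depth z with variance P."""
    # Predict: constant-depth model, so only the variance grows.
    z_pred, P_pred = z, P + Q
    # Nonlinear measurement model h(z) = fW / z and its Jacobian dh/dz.
    h = fW / z_pred
    H = -fW / (z_pred ** 2)
    # Kalman gain computed from the linearized measurement model.
    S = H * P_pred * H + R
    K = P_pred * H / S
    # Correct the state with the innovation and shrink the variance.
    z_new = z_pred + K * (meas - h)
    P_new = (1.0 - K * H) * P_pred
    return z_new, P_new


def run_demo(true_depth=2.0, z0=3.0, steps=20):
    """Feed noise-free synthetic measurements and return the final estimate."""
    z, P = z0, 1.0
    for _ in range(steps):
        meas = 100.0 / true_depth  # synthetic measurement, matches fW=100
        z, P = ekf_depth_step(z, P, meas)
    return z
```

A full pose filter would carry a multi-dimensional state (position and orientation) and matrix-valued `P`, `H` and `K`, but the predict, linearize, and correct structure is the same.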
|
256 |
Fusion de données visuo-inertielles pour l'estimation de pose et l'autocalibrage / Visuo-Inertial Data Fusion for Pose Estimation and Self-Calibration
Glauco Garcia, Scandaroli (14 June 2013)
Multi-sensor systems exploit the complementarity of different sensing sources. For example, a visuo-inertial sensor makes it possible to estimate pose at high frequency and with high accuracy. Vision methods measure pose at low frequency but limit the drift caused by integrating inertial data. Inertial measurement units measure displacement increments at high frequency, which makes it possible to initialize the vision process and to compensate for its momentary loss. This thesis analyzes two aspects of the problem. First, we study direct visual methods for pose estimation and propose a new technique based on correlation between images and on the weighting of regions and pixels, with an optimization inspired by Newton's method. Our technique estimates pose even under extreme illumination changes. Second, we study data fusion from the standpoint of control theory. Our main results concern the development of observers for estimating pose and IMU biases and for self-calibration. We analyze the rotational dynamics from a nonlinear point of view and provide stable observers on the group of rotation matrices. In addition, we analyze the translational dynamics as a linear time-varying system and propose uniform observability conditions. The observability analyses allow us to prove the uniform stability of the proposed observers. The visual method and the observers are tested and compared with classical methods using simulations and real visuo-inertial data.
|
257 |
Repousser les limites de l'identification faciale en contexte de vidéo-surveillance / Pushing the Limits of Face Identification in the Context of Video Surveillance
Fiche, Cecile (31 January 2012)
Face-based person identification systems are becoming increasingly widespread and are finding a wide variety of applications, particularly in video surveillance. In this context, however, the performance of face recognition algorithms depends heavily on image acquisition conditions, in particular when the pose varies, but also because the acquisition methods themselves can introduce artifacts. These are mainly focusing errors, which can blur the image, and compression errors, which produce blocking effects. The work carried out in this thesis therefore addresses face recognition from images acquired with video surveillance cameras that exhibit blur or blocking artifacts, or that show faces in varying poses. We first propose a new approach that significantly improves the recognition of faces with a high level of blur or strong blocking effects. Using dedicated metrics, the method evaluates the quality of the input image and adapts the training set of the recognition algorithms accordingly. Second, we focus on estimating the pose of the face. It is generally very difficult to recognize a face that is not frontal, and most face identification algorithms considered relatively insensitive to this parameter need to know the pose in order to reach a useful recognition rate in a relatively short time. We therefore developed a pose estimation method based on recent recognition methods in order to obtain a fast and sufficient estimate of this parameter.
|
258 |
Vérification automatique des montages d'usinage par vision : application à la sécurisation de l'usinage / Automatic Vision-Based Verification of Machining Setups: Application to Machining Safety
Karabagli, Bilal (06 November 2013)
The term "closed-door machining", frequently used by small and medium-sized enterprises in the aeronautics and automotive sectors, refers to the secure automation of the machining process for mechanical parts. In our work, we focus on verifying the machining setup before the machining phase itself is launched. We propose a contactless solution based on monocular vision (a single camera) that automatically recognizes the elements of the setup (raw workpiece, locating pins, clamping rods, etc.) and verifies that their actual placement (carried out by the operator) conforms to the desired 3D digital model of the setup (CAD model), in order to prevent any risk of collision with the machining tool.
|
259 |
Objektų Pozicijos ir Orientacijos Nustatymo Metodų Mobiliam Robotui Efektyvumo Tyrimas / Efficiency Analysis of Object Position and Orientation Detection Algorithms for Mobile Robot
Uktveris, Tomas (18 August 2014)
This work studies algorithmic solutions that allow a mobile robot to detect a target object and estimate its position and orientation in space. An initial survey of the field found many methods suitable for implementation, but a combined comparison of their efficiency was lacking. To fill this gap, a software and hardware solution was built and used to evaluate the methods best suited to robot systems.
The analysis evaluates the accuracy and runtime performance of the algorithms using simple and robust techniques. Object pose is estimated from Kinect depth data with the ICP algorithm. A comparison of two depth systems showed that the Kinect is faster and 2-5 times more accurate than a conventional stereo camera setup. Object detection experiments reached a maximum detection accuracy of about 90% and a maximum speed of 15 fps on standard VGA 640x480 images. The evaluation of ICP-based position and orientation estimation showed an average absolute error of about 3.4 cm in position and about 30 degrees in orientation, at a runtime speed of about 2 fps. Further optimization or a reduction in data volume is necessary to achieve better performance on a resource-limited mobile robot platform. The robot hardware system was also successfully implemented and tested for object position and orientation detection.
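To make the ICP step more concrete, here is a simplified two-dimensional sketch of the idea: alternate between nearest-neighbour matching and a closed-form rigid alignment. It is an illustration only, under clear assumptions: the thesis works on 3D Kinect depth data, whereas this example uses a handful of 2D points, and the function names `best_rigid_2d`, `apply_rigid` and `icp_2d` are invented here.

```python
import math


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src[i] onto dst[i]."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(q[0] for q in dst) / n
    cdy = sum(q[1] for q in dst) / n
    # Accumulate dot and cross products of the centred point pairs.
    dot = cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - csx, sy - csy
        bx, by = dx - cdx, dy - cdy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation moves the rotated source centroid onto the target centroid.
    t = (cdx - (c * csx - s * csy), cdy - (s * csx + c * csy))
    return theta, t


def apply_rigid(theta, t, pts):
    """Rotate points by theta and translate them by t."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in pts]


def icp_2d(src, dst, iterations=10):
    """Estimate the rigid transform that aligns src with dst."""
    theta, t = 0.0, (0.0, 0.0)
    for _ in range(iterations):
        moved = apply_rigid(theta, t, src)
        # Match each moved source point to its nearest target point.
        matched = [min(dst, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
                   for p in moved]
        theta, t = best_rigid_2d(src, matched)
    return theta, t


model = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (1.0, 5.0)]
scene = apply_rigid(math.radians(5.0), (0.2, -0.1), model)
est_theta, est_t = icp_2d(model, scene)
```

With a small initial misalignment, as here, the nearest-neighbour matches are already correct in the first iteration; with larger misalignments ICP can settle in a wrong local minimum, which is one plausible contributor to orientation errors of the size reported above.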
|
260 |
Theory and Practice of Globally Optimal Deformation Estimation
Tian, Yuandong (01 September 2013)
Nonrigid deformation modeling and estimation from images is a technically challenging task due to its nonlinear, nonconvex and high-dimensional nature. Traditional optimization procedures often rely on good initializations and give locally optimal solutions. On the other hand, learning-based methods that directly model the relationship between deformed images and their parameters either cannot handle complicated forms of mapping, or suffer from the Nyquist Limit and the curse of dimensionality due to high degrees of freedom in the deformation space. In particular, to achieve a worst-case guarantee of ε error for a deformation with d degrees of freedom, the sample complexity required is O(1/ε^d).
In this thesis, a generative model for deformation is established and analyzed using a unified theoretical framework. Based on the framework, three algorithms, Data-Driven Descent, Top-down and Bottom-up Hierarchical Models, are designed and constructed to solve the generative model. Under Lipschitz conditions that rule out unsolvable cases (e.g., deformation of a blank image), all algorithms achieve globally optimal solutions to the specific generative model. The sample complexity of these methods is substantially lower than that of learning-based approaches, which are agnostic to deformation modeling.
To achieve global optimality guarantees with lower sample complexity, the structure embedded in the deformation model is exploited. In particular, Data-Driven Descent relates two deformed images that are far away in the parameter space by compositional structures of deformation and reduces the sample complexity to O(C^d log 1/ε). The Top-down Hierarchical Model factorizes the local deformation into patches once the global deformation has been estimated approximately, and further reduces the sample complexity to O(C^(d/(1+C_2)) log 1/ε). Finally, the Bottom-up Hierarchical Model builds representations that are invariant to local deformation. With these representations, the global deformation can be estimated independently of local deformation, reducing the sample complexity to O((C/ε)^(d_0)) (d_0 ≪ d). From the analysis, this thesis shows the connections between approaches that are traditionally considered to be of very different nature. New theoretical conjectures on approaches like Deep Learning are also provided.
In practice, broad applications of the proposed approaches have also been demonstrated to estimate water distortion, air turbulence, cloth deformation and human pose with state-of-the-art results. Some approaches even achieve near real-time performance. Finally, application-dependent physics-based models are built with good performance in document rectification and scene depth recovery in turbulent media.
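To give a feel for the gap between the sample-complexity bounds quoted in this abstract, the short sketch below evaluates the leading terms of the O(1/ε^d) bound for learning-based methods and the O(C^d log 1/ε) bound for Data-Driven Descent. It is illustrative arithmetic only: big-O notation hides constant factors, and the value C = 2 is a hypothetical choice, since the abstract does not state the constant.

```python
import math


def naive_samples(eps, d):
    """Leading term of the O(1/eps^d) bound (hidden constants ignored)."""
    return (1.0 / eps) ** d


def data_driven_samples(eps, d, C):
    """Leading term of the O(C^d log(1/eps)) bound (hidden constants ignored)."""
    return (C ** d) * math.log(1.0 / eps)


# Hypothetical setting: 6 degrees of freedom, 1% error, C = 2 (assumed).
eps, d, C = 0.01, 6, 2.0
naive = naive_samples(eps, d)
structured = data_driven_samples(eps, d, C)
```

Under these assumed values the first term is on the order of 10^12 while the second is a few hundred, which is the kind of reduction the abstract attributes to exploiting the compositional structure of deformation.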
|