Global ETD Search

1	Human Activity Recognition and Prediction using RGBD Data Coen, Paul Dixon 01 August 2019 (has links) Being able to predict and recognize human activities is an essential element for us to effectively communicate with other humans during our day to day activities. A system that is able to do this has a number of appealing applications, from assistive robotics to health care and preventative medicine. Previous work in supervised video-based human activity prediction and detection fails to capture the richness of spatiotemporal data that these activities generate. Convolutional Long short-term memory (Convolutional LSTM) networks are a useful tool in analyzing this type of data, showing good results in many other areas. This thesis’ focus is on utilizing RGB-D Data to improve human activity prediction and recognition. A modified Convolutional LSTM network is introduced to do so. Experiments are performed on the network and are compared to other models in-use as well as the current state-of-the-art system. We show that our proposed model for human activity prediction and recognition outperforms the current state-of-the-art models in the CAD-120 dataset without giving bounding frames or ground-truths about objects. Artificial Intelligence CAD-120 Convolutional LSTM Human Activity Neural Networks RGBD Data
2	La réalité augmentée : fusion de vision et navigation / Augmented reality : the fusion of vision and navigation Zarrouati-Vissière, Nadège 20 December 2013 (has links) Cette thèse a pour objet l'étude d'algorithmes pour des applications de réalité visuellement augmentée. Plusieurs besoins existent pour de telles applications, qui sont traités en tenant compte de la contrainte d'indistinguabilité de la profondeur et du mouvement linéaire dans le cas de l'utilisation de systèmes monoculaires. Pour insérer en temps réel de manière réaliste des objets virtuels dans des images acquises dans un environnement arbitraire et inconnu, il est non seulement nécessaire d'avoir une perception 3D de cet environnement à chaque instant, mais également d'y localiser précisément la caméra. Pour le premier besoin, on fait l'hypothèse d'une dynamique de la caméra connue, pour le second on suppose que la profondeur est donnée en entrée: ces deux hypothèses sont réalisables en pratique. Les deux problèmes sont posés dans lecontexte d'un modèle de caméra sphérique, ce qui permet d'obtenir des équations de mouvement invariantes par rotation pour l'intensité lumineuse comme pour la profondeur. L'observabilité théorique de ces problèmes est étudiée à l'aide d'outils de géométrie différentielle sur la sphère unité Riemanienne. Une implémentation pratique est présentée: les résultats expérimentauxmontrent qu'il est possible de localiser une caméra dans un environnement inconnu tout en cartographiant précisément cet environnement. / The purpose of this thesis is to study algorithms for visual augmented reality. Different requirements of such an application are addressed, with the constraint that the use of a monocular system makes depth and linear motion indistinguishable. The real-time realistic insertion of virtual objects in images of a real arbitrary environment yields the need for a dense Threedimensional (3D) perception of this environment on one hand, and a precise localization of the camera on the other hand. The first requirement is studied under an assumption of known dynamics, and the second under the assumption of known depth: both assumptions are practically realizable. Both problems are posed in the context of a spherical camera model, which yields SO(3)-invariant dynamical equations for light intensity and depth. The study of theoreticalobservability requires differential geometry tools for the Riemannian unit sphere. Practical implementation on a system is presented and experimental results demonstrate the ability to localize a camera in a unknown environment while precisely mapping this environment. Observateur asymptotique Realité augmentée Systèmes de navigation Données RGBD Algorithmes SLAM Asymptotic observer Augmented reality Navigation systems RGBD data SLAM algorithms
3	3D real time object recognition Amplianitis, Konstantinos 01 March 2017 (has links) Die Objekterkennung ist ein natürlicher Prozess im Menschlichen Gehirn. Sie ndet im visuellen Kortex statt und nutzt die binokulare Eigenschaft der Augen, die eine drei- dimensionale Interpretation von Objekten in einer Szene erlaubt. Kameras ahmen das menschliche Auge nach. Bilder von zwei Kameras, in einem Stereokamerasystem, werden von Algorithmen für eine automatische, dreidimensionale Interpretation von Objekten in einer Szene benutzt. Die Entwicklung von Hard- und Software verbessern den maschinellen Prozess der Objek- terkennung und erreicht qualitativ immer mehr die Fähigkeiten des menschlichen Gehirns. Das Hauptziel dieses Forschungsfeldes ist die Entwicklung von robusten Algorithmen für die Szeneninterpretation. Sehr viel Aufwand wurde in den letzten Jahren in der zweidimen- sionale Objekterkennung betrieben, im Gegensatz zur Forschung zur dreidimensionalen Erkennung. Im Rahmen dieser Arbeit soll demnach die dreidimensionale Objekterkennung weiterent- wickelt werden: hin zu einer besseren Interpretation und einem besseren Verstehen von sichtbarer Realität wie auch der Beziehung zwischen Objekten in einer Szene. In den letzten Jahren aufkommende low-cost Verbrauchersensoren, wie die Microsoft Kinect, generieren Farb- und Tiefendaten einer Szene, um menschenähnliche visuelle Daten zu generieren. Das Ziel hier ist zu zeigen, wie diese Daten benutzt werden können, um eine neue Klasse von dreidimensionalen Objekterkennungsalgorithmen zu entwickeln - analog zur Verarbeitung im menschlichen Gehirn. / Object recognition is a natural process of the human brain performed in the visual cor- tex and relies on a binocular depth perception system that renders a three-dimensional representation of the objects in a scene. Hitherto, computer and software systems are been used to simulate the perception of three-dimensional environments with the aid of sensors to capture real-time images. In the process, such images are used as input data for further analysis and development of algorithms, an essential ingredient for simulating the complexity of human vision, so as to achieve scene interpretation for object recognition, similar to the way the human brain perceives it. The rapid pace of technological advancements in hardware and software, are continuously bringing the machine-based process for object recognition nearer to the inhuman vision prototype. The key in this eld, is the development of algorithms in order to achieve robust scene interpretation. A lot of recognisable and signi cant e ort has been successfully carried out over the years in 2D object recognition, as opposed to 3D. It is therefore, within this context and scope of this dissertation, to contribute towards the enhancement of 3D object recognition; a better interpretation and understanding of reality and the relationship between objects in a scene. Through the use and application of low-cost commodity sensors, such as Microsoft Kinect, RGB and depth data of a scene have been retrieved and manipulated in order to generate human-like visual perception data. The goal herein is to show how RGB and depth information can be utilised in order to develop a new class of 3D object recognition algorithms, analogous to the perception processed by the human brain. 3D Objekt Erkennung 3D Mensch Segmentierung Objekt Erkennung Conditional Random Fields Kinect-Sensor RGBD-Daten 3D Rekonstruktionen Bundle Adjustment ICP Registrierung 3D Object Recognition 3D Human Segmentation Object Detection Conditional Random Fields Kinect Sensor RGBD Data 3D Reconstructions Bundle Adjustment ICP registration 004 Informatik 28 Informatik, Datenverarbeitung ST 330 ddc:004

1

Page generated in 0.024 seconds