Global ETD Search

1	An Obstacle Avoidance System for the Visually Impaired Using 3-D Point Cloud Processing Taylor, Evan Justin 01 December 2017 The long white cane offers many benefits for the blind and visually impaired. Still, many users report being injured both indoors and outdoors while using it. One frequent cause of injury is that the long white cane cannot detect obstacles above the user's waist. This thesis presents a system that attempts to augment the capabilities of the long white cane by sensing the environment around the user, creating a map of obstacles within the environment, and providing simple haptic feedback to the user. The proposed augmented cane system uses the Asus Xtion Pro Live infrared depth sensor to capture the user's environment as a point cloud. The open-source Point Cloud Library (PCL) and Robot Operating System (ROS) are used to process the point cloud. The points representing the ground plane are extracted to more clearly define potential obstacles. The system determines the nearest point for each 1 degree across the horizontal view. These nearest points are recorded as a ROS Laser Scan message and used in a simple haptic feedback system where the rumble feedback is based on two different cost functions. Twenty-two volunteers participated in a user demonstration that showed the augmented cane system can successfully communicate the presence of obstacles to blindfolded users. The users reported experiencing a sense of safety and confidence in the system's abilities. Obstacles above waist height are detected and communicated to the user. The system requires additional development before it could be considered a viable product for the visually impaired. mobility aid RGB-D camera point cloud processing obstacle avoidance visually impaired Mechanical Engineering
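The per-degree nearest-point step described in entry 1 might look roughly like the sketch below. It is an illustration only, not the thesis's implementation: the function name, the camera-frame convention (x to the right, z forward), and the 60-degree horizontal field of view are all assumptions, and ground-plane points are assumed to have been removed beforehand.

```python
import numpy as np

def nearest_range_per_degree(points, fov_deg=60.0):
    """Return the nearest horizontal range in each 1-degree bearing bin.

    points: (N, 3) array in an assumed camera frame (x right, y down, z forward).
    fov_deg: assumed horizontal field of view, centred on the optical axis.
    Bins with no points are left at +inf.
    """
    x, z = points[:, 0], points[:, 2]
    bearings = np.degrees(np.arctan2(x, z))   # 0 degrees = straight ahead
    ranges = np.hypot(x, z)                   # distance in the horizontal plane

    n_bins = int(np.ceil(fov_deg))
    scan = np.full(n_bins, np.inf)

    # Shift bearings so the left edge of the view maps to bin 0.
    idx = np.floor(bearings + fov_deg / 2.0).astype(int)
    valid = (idx >= 0) & (idx < n_bins) & (z > 0)

    # Unbuffered minimum: several points may fall into the same bin.
    np.minimum.at(scan, idx[valid], ranges[valid])
    return scan
```

The resulting array plays the role of the range list in a laser-scan-style message; how those ranges are mapped to rumble intensity is left open here, since the abstract does not describe the two cost functions.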
2	Vers un système de capture du mouvement humain en 3D pour un robot mobile évoluant dans un environnement encombré / Toward a motion capture system in 3D for a mobile robot moving in a cluttered environment Dib, Abdallah 24 May 2016 In this thesis we are interested in designing a mobile robot able to analyze the behavior and movement of a person in an indoor, cluttered environment, for example the home of an elderly person. More precisely, our goal is to equip the robot with visual perception of human posture, so that it can better handle situations that require understanding the intentions of the people it interacts with, detect risky situations such as falls, or analyze the motor skills of the people in its care. Posture tracking in a dynamic and cluttered environment raises several challenges, notably continuously learning the scene background and extracting the silhouette, which may be only partially observable when the person is in occluded places. Most existing methods assume that the scene is static and that the person is always fully visible to the camera, so they are not suited to real conditions. We propose a new tracking system capable of tracking a person's posture in such conditions. Our approach uses a 3D occupancy grid with a hidden Markov model to continuously learn the changing background of the scene and to extract the silhouette of the person; a hierarchical particle filtering algorithm is then used to reconstruct the posture. We also propose a novel occlusion management algorithm able to identify hidden body parts and exclude them from the pose estimation process. Finally, we propose a new database containing RGB-D images with ground truth data in order to establish a new benchmark for the assessment of motion capture systems in a real environment with occlusions. The ground truth is obtained from a high-precision marker-based motion capture system with eight infrared cameras. All data is available online. The second contribution of this thesis is the development of a visual localization (visual odometry) method for an RGB-D camera mounted on a robot moving in a dynamic environment. The motion capture system we developed is meant to equip a robot moving through a scene, so estimating the robot's motion is important to guarantee correct silhouette extraction for tracking. The major difficulty of localizing a camera in a dynamic environment is that moving objects in the scene induce additional motion that generates outlier pixels. These pixels should be excluded from the camera motion estimation process in order to produce accurate and precise localization. We thus propose an extension of the dense localization method based on optical flow that isolates outlier pixels using the RANSAC algorithm. Motion capture Visual odometry Particle filter Hidden Markov Model RGB-D camera Optical flow 681.75 006.33
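The idea of using RANSAC to separate outlier pixels from the camera-induced motion in entry 2 can be illustrated with a deliberately simplified sketch. The thesis estimates full camera motion from dense RGB-D data; the example below instead assumes a single dominant 2D image translation, and the function name, tolerance, and iteration count are hypothetical.

```python
import numpy as np

def ransac_translation(flow, n_iters=100, tol=0.5, seed=0):
    """Estimate the dominant 2D translation in a set of flow vectors.

    flow: (N, 2) array of per-pixel flow vectors.
    Returns (translation, inlier_mask). Vectors farther than `tol` from the
    winning hypothesis are treated as outliers, e.g. pixels on moving objects.
    """
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(flow), dtype=bool)

    for _ in range(n_iters):
        # Minimal sample: one flow vector fully determines a translation.
        candidate = flow[rng.integers(len(flow))]
        inliers = np.linalg.norm(flow - candidate, axis=1) < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers

    # Refit on the consensus set only.
    translation = flow[best_inliers].mean(axis=0)
    return translation, best_inliers
```

The same hypothesize-and-verify loop carries over to richer motion models; only the minimal sample size and the residual change.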
3	Unconstrained Gaze Estimation Using RGB-D Camera. / Estimation du regard avec une caméra RGB-D dans des environnements utilisateur non-contraints Kacete, Amine 15 December 2016 In this thesis, we tackle the automatic gaze estimation problem in unconstrained user environments. This work belongs to computer vision applied to the automatic analysis of human behavior. Several industrial solutions are commercially available today and provide accurate gaze estimates. Some have very complex hardware requirements, such as infrared cameras mounted on a helmet or on glasses that film eye movement, which makes them intrusive, highly constrained by the user's environment, and often inaccessible to the general public. This thesis aims to produce an automatic gaze estimation system that increases the user's freedom of movement relative to the camera (head movement, user-sensor distance) and reduces system complexity by using relatively simple sensors accessible to the general public. We focus on estimating gaze using cheap, low-resolution, non-intrusive devices such as the Kinect sensor, and we develop new methods to address challenging conditions such as head pose changes, varying illumination, and large user-sensor distances. We investigated several gaze estimation paradigms. We first developed two automatic gaze estimation systems following two classical approaches: a feature-based approach and a semi-appearance-based approach. The major limitation of these paradigms is that the resulting systems assume total independence between the eye appearance image and the head pose. To overcome this limitation, we converged on a novel paradigm that unifies the two previous blocks by building a global gaze manifold; we explored two directions, using real and synthetic RGB-D gaze samples respectively. Gaze estimation RGB-D Camera Eye-pupil localization Random Forest Machine learning
4	REAL-TIME CAPTURE AND RENDERING OF PHYSICAL SCENE WITH AN EFFICIENTLY CALIBRATED RGB-D CAMERA NETWORK Su, Po-Chang 01 January 2017 From object tracking to 3D reconstruction, RGB-Depth (RGB-D) camera networks play an increasingly important role in many vision and graphics applications. With the recent explosive growth of Augmented Reality (AR) and Virtual Reality (VR) platforms, utilizing RGB-D camera networks to capture and render dynamic physical space can enhance immersive experiences for users. To maximize coverage and minimize costs, practical applications often use a small number of RGB-D cameras and sparsely place them around the environment for data capturing. While sparse color camera networks have been studied for decades, the problems of extrinsic calibration of and rendering with sparse RGB-D camera networks are less well understood. Extrinsic calibration is difficult because of inappropriate RGB-D camera models and lack of shared scene features. Due to the significant camera noise and sparse coverage of the scene, the quality of rendering 3D point clouds is much lower compared with synthetic models. Adding virtual objects whose rendering depends on the physical environment, such as those with reflective surfaces, further complicates the rendering pipeline. In this dissertation, I propose novel solutions to tackle these challenges faced by RGB-D camera systems. First, I propose a novel extrinsic calibration algorithm that can accurately and rapidly calibrate the geometric relationships across an arbitrary number of RGB-D cameras on a network. Second, I propose a novel rendering pipeline that can capture and render, in real-time, dynamic scenes in the presence of arbitrary-shaped reflective virtual objects. Third, I demonstrate a teleportation application that uses the proposed system to merge two geographically separated 3D captured scenes into the same reconstructed environment. 
To provide a fast and robust calibration for a sparse RGB-D camera network, first, the correspondences between different camera views are established by using a spherical calibration object. We show that this approach outperforms other techniques based on planar calibration objects. Second, instead of modeling camera extrinsics using a rigid transformation that is optimal only for pinhole cameras, different view transformation functions, including rigid transformation, polynomial transformation, and manifold regression, are systematically tested to determine the most robust mapping that generalizes well to unseen data. Third, the celebrated bundle adjustment procedure is reformulated to minimize the global 3D projection error so as to fine-tune the initial estimates. To achieve a realistic mirror rendering, a robust eye detector is used to identify the viewer's 3D location and render the reflective scene accordingly. The limited field of view obtained from a single camera is overcome by our calibrated RGB-D camera network system, which is scalable to capture an arbitrarily large environment. The rendering is accomplished by tracing light rays from the viewpoint to the scene reflected by the virtual curved surface. To the best of our knowledge, the proposed system is the first to render reflective dynamic scenes from real 3D data in large environments. Our scalable client-server architecture is computationally efficient: the calibration of a camera network system, including data capture, can be done in minutes using only commodity PCs. RGB-D Camera Network Real-time Capture and Rendering Virtual Curved Mirror 3D Telepresence 3D Interaction Computer Sciences Electrical and Computer Engineering Graphics and Human Computer Interfaces
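As a rough illustration of the rigid-transformation baseline mentioned in entry 4, the sketch below fits a rotation and translation between two cameras from corresponding 3D points, such as sphere centers observed by both cameras. It uses the standard Kabsch (SVD-based) least-squares solution, which is an assumption here: the abstract does not say how the rigid case is solved, and the polynomial and manifold-regression mappings it also tests are not shown. The function name is hypothetical.

```python
import numpy as np

def fit_rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~= src @ R.T + t.

    src, dst: (N, 3) arrays of corresponding 3D points, N >= 3 and not collinear,
    e.g. centers of a calibration sphere seen by two cameras.
    """
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)

    # Cross-covariance of the centred point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)

    # Guard against a reflection: force det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    t = dst_c - R @ src_c
    return R, t
```

An estimate of this kind would typically serve only as an initial guess that a global refinement step, such as the bundle adjustment described above, then fine-tunes.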
5	Visual object perception in unstructured environments Choi, Changhyun 12 January 2015 As robotic systems move from well-controlled settings to increasingly unstructured environments, they are required to operate in highly dynamic and cluttered scenarios. Finding an object, estimating its pose, and tracking its pose over time within such scenarios are challenging problems. Although various approaches have been developed to tackle these problems, the scope of objects addressed and the robustness of solutions remain limited. In this thesis, we target robust object perception in unstructured environments using visual sensory information, ranging from the traditional monocular camera to the more recently emerged RGB-D sensor. Toward this goal, we address four critical challenges to robust 6-DOF object pose estimation and tracking that current state-of-the-art approaches have, as yet, failed to solve. The first challenge is how to increase the scope of objects by allowing visual perception to handle both textured and textureless objects. A large number of 3D object models are widely available in online object model databases, and these object models provide significant prior information including geometric shapes and photometric appearances. We note that using both geometric and photometric attributes available from these models enables us to handle both textured and textureless objects. This thesis presents our efforts to broaden the spectrum of objects to be handled by combining geometric and photometric features. The second challenge is how to dependably estimate and track the pose of an object despite the clutter in backgrounds. Difficulties in object perception rise with the degree of clutter. Background clutter is likely to lead to false measurements, and false measurements tend to result in inaccurate pose estimates. 
To tackle significant clutter in backgrounds, we present two multiple pose hypotheses frameworks: a particle filtering framework for tracking and a voting framework for pose estimation. Handling of object discontinuities during tracking, such as severe occlusions, disappearances, and blurring, presents another important challenge. In an ideal scenario, a tracked object is visible throughout the entirety of tracking. However, when an object happens to be occluded by other objects or disappears due to the motions of the object or the camera, difficulties ensue. Because the continuous tracking of an object is critical to robotic manipulation, we propose to devise a method to measure tracking quality and to re-initialize tracking as necessary. The final challenge we address is performing these tasks within real-time constraints. Our particle filtering and voting frameworks, while time-consuming, are composed of repetitive, simple and independent computations. Inspired by that observation, we propose to run massively parallelized frameworks on a GPU for those robotic perception tasks which must operate within strict time constraints. Computer vision Robotic perception Visual tracking Object recognition Pose estimation Particle filtering Voting process RGB-D camera Monocular Geometric feature Photometric feature Unstructured environments GPU Real-time
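The multiple-hypothesis tracking idea behind the particle filtering framework in entry 5 can be shown with a minimal bootstrap particle filter. This is a toy sketch under strong assumptions (a scalar state, random-walk motion, Gaussian measurement noise, hypothetical function and parameter names); the thesis tracks 6-DOF object poses and runs its computations in parallel on a GPU, neither of which is reproduced here.

```python
import numpy as np

def particle_filter_step(particles, measurement, rng,
                         motion_std=0.1, meas_std=0.2):
    """One predict / weight / resample cycle of a bootstrap particle filter.

    particles: (N,) array of scalar state hypotheses.
    measurement: scalar observation of the state.
    """
    # Predict: diffuse each hypothesis with random-walk motion noise.
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)

    # Weight: Gaussian likelihood of the measurement under each hypothesis.
    weights = np.exp(-0.5 * ((measurement - particles) / meas_std) ** 2)
    weights += 1e-300                      # avoid an all-zero weight vector
    weights /= weights.sum()

    # Resample: draw hypotheses in proportion to their weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx]
```

Each particle is updated independently in the predict and weight steps, which is the property that makes this family of methods a natural fit for massively parallel hardware.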
7	RGB-D Deep Learning keypoints and descriptors extraction Network for feature-based Visual Odometry systems / RGB-D Deep Learning-nätverk för utvinning av nyckelpunkter och deskriptorer för nyckelpunktsbaserad Visuella Odometri. Bennasciutti, Federico January 2022 Feature extractors in Visual Odometry pipelines rarely exploit depth signals, even though depth sensors and RGB-D cameras are commonly used in later stages of Visual Odometry systems. Yet depth sensors in RGB-D cameras function even with no external light and can provide feature extractors with additional structural information that is otherwise invisible in RGB images. Deep Learning feature extractors, which have recently been shown to outperform their classical counterparts, still only exploit RGB information. Against this background, this thesis presents a Self-Supervised Deep Learning feature extraction algorithm that employs both RGB and depth signals as input. The proposed approach builds upon existing deep learning feature extractors, adapting the architecture and training procedure to introduce the depth signal. The developed RGB-D system is compared with an RGB-only feature extractor in a qualitative study of keypoint locations and a quantitative evaluation of pose estimation. The qualitative evaluation demonstrates that the proposed system exploits information from both the RGB and depth domains, and that it adapts robustly to the degradation of either input signal. The pose estimation results indicate that the RGB-D system performs comparably to the RGB-only one in normal and low-light conditions. Thanks to the use of depth information, the RGB-D feature extractor can still operate, with only limited performance degradation, even in completely dark environments, where RGB methods fail due to a lack of input information. The combined qualitative and quantitative results suggest that the proposed system extracts features based on both RGB and depth input domains and can autonomously transition from normal brightness to a no-light environment by exploiting the depth signal to compensate for the degraded RGB information. Deep Learning Visual Odometry Computer Vision RGB-D Camera Feature Extraction Interest Point Extraction Computer and Information Sciences
8	Acquisition et rendu 3D réaliste à partir de périphériques "grand public" / Capture and Realistic 3D rendering from consumer grade devices Chakib, Reda 14 December 2018 Digital imaging, from image synthesis to computer vision, is undergoing rapid change, due among other factors to the democratization and commercial success of 3D cameras. In the same context, consumer 3D printing, which is growing rapidly, contributes to the strong demand for this type of camera for 3D scanning. The objective of this thesis is to acquire and master know-how in the capture and acquisition of 3D models, in particular with regard to realistic rendering. Building a 3D scanner from an RGB-D camera is part of that objective. During the acquisition phase, especially with a portable device, two main problems arise: the reference frame of each capture and the final rendering of the reconstructed object. 3D scanning 3D point cloud Structured-light camera Camera calibration RGB-D camera Pinhole model Intrinsic calibration Point cloud registration BRDF 006.696
