411

Reconstruction 3D de l'environnement dynamique d'un véhicule à l'aide d'un système multi-caméras hétérogène en stéréo wide-baseline / 3D reconstruction of the dynamic environment surrounding a vehicle using a heterogeneous multi-camera system in wide-baseline stereo

Mennillo, Laurent 05 June 2019
This Ph.D. thesis, carried out in the automotive industry in association with Renault Group, focuses on the development of advanced driver-assistance systems and autonomous vehicles. The progress made by the scientific community during the last decades in the fields of computer science and robotics has been so significant that it now enables the implementation of complex embedded systems in vehicles. These systems, primarily designed to provide assistance in simple driving scenarios and emergencies, now aim to offer fully autonomous transport. Multibody SLAM methods currently used in autonomous vehicles often rely on high-performance and expensive onboard sensors such as LIDAR systems. Digital video cameras, on the other hand, are much cheaper, which has led to their increased use in newer vehicles to provide driving-assistance functions such as parking assistance or emergency braking. Furthermore, this relatively common implementation now makes it possible to consider using them to reconstruct the dynamic environment surrounding a vehicle in three dimensions. From a scientific point of view, existing multibody visual SLAM techniques can be divided into two categories. The first and older category concerns stereo methods, which use several cameras with overlapping fields of view to reconstruct the observed dynamic scene. Most of these methods use identical short-baseline stereo pairs, which allow dense matching of feature points to estimate disparity maps that are then used to segment the motions of the scene. The other category concerns monocular methods, which use only one camera during the reconstruction process and must therefore compensate for the ego-motion of the acquisition system in order to estimate the motion of other objects independently. These methods are more difficult in that they must address several additional problems, such as motion segmentation, which consists in clustering the initial data into separate subspaces representing the individual motion of each object, but also the estimation of the relative scale of these objects before their aggregation within the static scene. The industrial motive for this work lies in reusing the multi-camera systems already present in production vehicles, mostly composed of a front camera accompanied by several surround fisheye cameras in wide-baseline stereo, which has led to the development of a multibody reconstruction method dedicated to such heterogeneous systems. The proposed method is incremental and allows the reconstruction of sparse mobile points as well as their trajectories, using several geometric constraints to segment the reconstructed points and their motion. Finally, a quantitative and qualitative evaluation conducted on two separate datasets, one of which was developed during this work to present characteristics similar to existing heterogeneous systems, is provided.
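
A minimal sketch of the basic operation underlying sparse wide-baseline reconstruction, triangulating one point from two calibrated views by the classical linear (DLT) method, is given below. The camera matrices, pixel values, and the assumption that fisheye observations have already been undistorted to a pinhole model are illustrative, not taken from the thesis.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one scene point from two views.

    P1, P2 : 3x4 projection matrices of two calibrated cameras (fisheye
             observations assumed already undistorted to a pinhole model).
    x1, x2 : (u, v) pixel coordinates of the same point in each image.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)   # null vector of A minimizes |A X|
    X = Vt[-1]
    return X[:3] / X[3]           # dehomogenize

# Toy usage: two identical cameras one metre apart (wide baseline).
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
Xw = np.array([0.3, -0.2, 5.0, 1.0])
x1 = (P1 @ Xw)[:2] / (P1 @ Xw)[2]
x2 = (P2 @ Xw)[:2] / (P2 @ Xw)[2]
print(triangulate_dlt(P1, P2, x1, x2))   # ~ [0.3, -0.2, 5.0]
```
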
412

Mesure de vibrations par vision 3D / 3D vision vibration measurement

Durand-Texte, Thomas 11 January 2019
The objective of this Ph.D. thesis is to study the relevance and limits of 3D vision methods coupled with high-speed cameras and applied to non-contact, synchronous vibration measurement in the vibro-acoustic range of frequencies. A first pseudo-stereoscopic set-up, taken from robotics, using a four-mirror adapter to generate two virtual viewpoints from a single real camera, was tested on a plate and a loudspeaker. The results, validated by comparison with those obtained with a laser vibrometer, prove the relevance of the approach despite some constraints related to the optical elements. In pursuit of simplification, three other set-ups were then proposed and tested, leading to two full-field vibration measurement techniques and a method for the Iterative Rectification of Images (IRIs), adapted to the context. The mirror-less method uses a mathematical line for triangulation and is fundamentally suited to measuring the single-axis vibrations of globally planar objects displaying non-repeatable displacements along the surface normal or along a known axis. The asynchronous-cameras technique uses a high-speed camera and an industrial camera simultaneously to measure the multi-axis displacements of 3D vibratory phenomena. The results obtained on a car bonnet and a loudspeaker demonstrate its potential for characterising large panels or carrying out end-of-line testing of loudspeakers, for example. In conclusion, the three measurement protocols and the associated results are compared in order to assess their respective potential and limits in the context of vibration measurement.
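
Once a surface point has been triangulated in each high-speed frame, its vibration content is typically read off the displacement time series. The sketch below shows one standard way to do this, a windowed FFT amplitude spectrum; the function name, windowing choice, and toy signal are assumptions for illustration, not taken from the thesis.

```python
import numpy as np

def vibration_spectrum(displacement, fs):
    """Single-sided amplitude spectrum of a tracked displacement signal.

    displacement : 1D array of positions (e.g. metres) for one surface
                   point, one sample per high-speed-camera frame.
    fs           : camera frame rate in Hz.
    """
    n = len(displacement)
    window = np.hanning(n)                       # reduce spectral leakage
    spec = np.fft.rfft((displacement - displacement.mean()) * window)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    amp = 2.0 * np.abs(spec) / window.sum()      # window-corrected amplitude
    return freqs, amp

# Toy usage: a 0.1 mm, 250 Hz tone sampled at 4 kHz.
fs = 4000.0
t = np.arange(2048) / fs
freqs, amp = vibration_spectrum(1e-4 * np.sin(2 * np.pi * 250 * t), fs)
print(freqs[np.argmax(amp)])   # ~ 250 Hz
```
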
413

強健式視覺追蹤應用於擴增實境之研究 / Robust visual tracking for augmented reality

王瑞鴻, Wang, Ruei Hong Unknown Date
Visual tracking is one of the most important research topics in traditional computer vision, and many computer vision applications cannot be realized without it. The rapid growth of augmented reality in recent years has relied on improvements in visual tracking technology: by tracking objects in the real scene, augmented reality can render virtual objects on top of them. External factors such as object displacement, rotation, and scaling, as well as changing illumination conditions, affect the accuracy of visual tracking. In this thesis, we design a new set of marker patterns to serve as reference points for visual tracking. These markers reduce the errors induced by displacement, rotation, and illumination changes, and can be located correctly against complicated backgrounds, increasing tracking accuracy. Instead of tracking with a single camera in 2D image space, we use stereo vision to track objects with 3D geometric information. We then exploit rigid-body properties to find objects sharing the same rotation and translation and, together with random sample consensus (RANSAC), estimate the best rigid motion model to achieve robust tracking. Moreover, from user-supplied video we can extract particular information and, through modelling, render the resulting virtual objects on the user's interface (or on the tracked objects), providing information beyond the real world for navigation guidance (or augmented reality). Experimental results show that our method offers fast recognition, strong resistance to illumination changes, and high positioning accuracy, making it suitable for augmented reality; the small size of our marker patterns also makes them convenient for guidance applications.
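
The rigid-motion-plus-RANSAC step mentioned in the abstract can be sketched as follows: fit a rotation and translation to minimal 3-point samples of matched 3D marker positions (the Kabsch solution) and keep the model with the most inliers. This is a generic illustration under assumed names and thresholds, not the thesis's exact estimator.

```python
import numpy as np

def rigid_fit(src, dst):
    """Least-squares R, t with dst ~ src @ R.T + t (Kabsch algorithm)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])  # guard against reflection
    R = Vt.T @ D @ U.T
    return R, cd - R @ cs

def ransac_rigid(src, dst, iters=200, tol=0.01, seed=0):
    """Dominant rigid motion between matched 3D point sets, robust to outliers."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), size=3, replace=False)   # minimal sample
        R, t = rigid_fit(src[idx], dst[idx])
        err = np.linalg.norm(dst - (src @ R.T + t), axis=1)
        inliers = err < tol
        if inliers.sum() > best.sum():
            best = inliers
    return rigid_fit(src[best], dst[best]), best

# Toy usage: rotate 50 markers, corrupt 20% of the correspondences.
rng = np.random.default_rng(1)
src = rng.normal(size=(50, 3))
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])
dst = src @ R_true.T + np.array([1.0, 0.0, 0.0])
dst[:10] += rng.normal(scale=0.5, size=(10, 3))
(R, t), inliers = ransac_rigid(src, dst)
print(np.allclose(R, R_true, atol=1e-6), inliers.sum())
```
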
414

Multi-scale Methods for Omnidirectional Stereo with Application to Real-time Virtual Walkthroughs

Brunton, Alan P 28 November 2012
This thesis addresses a number of problems in computer vision, image processing, and geometry processing, and presents novel solutions to these problems. The overarching theme of the techniques presented here is a multi-scale approach, leveraging mathematical tools to represent images and surfaces at different scales, and methods that can be adapted from one type of domain (e.g., the plane) to another (e.g., the sphere). The main problem addressed in this thesis is known as stereo reconstruction: reconstructing the geometry of a scene or object from two or more images of that scene. We develop novel algorithms to do this, which work for both planar and spherical images. By developing a novel way to formulate the notion of disparity for spherical images, we are able to effectively adapt our algorithms from planar to spherical images. Our stereo reconstruction algorithm is based on a novel application of distance transforms to multi-scale matching. We use matching information aggregated over multiple scales, and enforce consistency between these scales using distance transforms. We then show how multiple spherical disparity maps can be efficiently and robustly fused using visibility and other geometric constraints. We then show how the reconstructed point clouds can be used to synthesize, in real time, a realistic sequence of novel views: images from points of view not captured in the input images. Along the way to this result, we address some related problems. For example, multi-scale features can be detected in spherical images by convolving those images with a filterbank, generating an overcomplete spherical wavelet representation of the image from which the multi-scale features can be extracted. Convolution of spherical images is much more efficient in the spherical harmonic domain than in the spatial domain. Thus, we develop a GPU implementation for fast spherical harmonic transforms and frequency-domain convolutions of spherical images. This tool can also be used to detect multi-scale features on geometric surfaces. When we have a point cloud of a surface of a particular class of object, whether generated by stereo reconstruction or by some other modality, we can use statistics and machine learning to more robustly estimate the surface. If we have at our disposal a database of surfaces of a particular type of object, such as the human face, we can compute statistics over this database to constrain the possible shapes a new surface of this type can take. We show how a statistical spherical wavelet shape prior can be used to efficiently and robustly reconstruct a face shape from noisy point cloud data, including stereo data.
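
The distance-transform idea used above for cross-scale consistency can be illustrated with the generalized (quadratic) distance transform over a 1D cost array: a low matching cost at one disparity then lowers the cost of nearby disparities at the next scale. The weight, toy array, and O(n²) formulation below are illustrative; an O(n) lower-envelope algorithm exists (Felzenszwalb and Huttenlocher).

```python
import numpy as np

def dt_quadratic(cost, w=1.0):
    """Generalized distance transform of a 1D cost array:

        out[p] = min_q cost[q] + w * (p - q)**2

    A strong (low-cost) match at disparity q 'supports' nearby disparities
    p when costs are propagated between scales. O(n^2) for clarity.
    """
    n = len(cost)
    q = np.arange(n)
    return np.array([np.min(cost + w * (p - q) ** 2) for p in range(n)])

# A coarse-scale cost with one confident match, softened before being
# added to the fine-scale costs.
coarse = np.array([5.0, 4.0, 0.5, 4.0, 5.0])
print(dt_quadratic(coarse))   # [4.5, 1.5, 0.5, 1.5, 4.5]
```
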
415

Variable-aperture Photography

Hasinoff, Samuel William 19 January 2009
While modern digital cameras incorporate sophisticated engineering, in terms of their core functionality, cameras have changed remarkably little in more than a hundred years. In particular, from a given viewpoint, conventional photography essentially remains limited to manipulating a basic set of controls: exposure time, focus setting, and aperture setting. In this dissertation we present three new methods in this domain, each based on capturing multiple photos with different camera settings. In each case, we show how defocus can be exploited to achieve different goals, extending what is possible with conventional photography. These methods are closely connected, in that all rely on analyzing changes in aperture. First, we present a 3D reconstruction method especially suited for scenes with high geometric complexity, for which obtaining a detailed model is difficult using previous approaches. We show that by controlling both the focus and aperture setting, it is possible to compute depth for each pixel independently. To achieve this, we introduce the "confocal constancy" property, which states that as aperture setting varies, the pixel intensity of an in-focus scene point will vary in a scene-independent way that can be predicted by prior calibration. Second, we describe a method for synthesizing photos with adjusted camera settings in post-capture, to achieve changes in exposure, focus setting, etc. from very few input photos. To do this, we capture photos with varying aperture and other settings fixed, then recover the underlying scene representation best reproducing the input. The key to the approach is our layered formulation, which handles occlusion effects but is tractable to invert. This method works with the built-in "aperture bracketing" mode found on most digital cameras. Finally, we develop a "light-efficient" method for capturing an in-focus photograph in the shortest time, or with the highest quality for a given time budget. While the standard approach involves reducing the aperture until the desired region is in focus, we show that by "spanning" the region with multiple large-aperture photos, we can reduce the total capture time and generate the in-focus photo synthetically. Beyond more efficient capture, our method provides 3D shape at no additional cost.
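
A toy reading of the "confocal constancy" test described above: for each candidate focus setting, check how well pixel intensities across apertures follow the calibrated exposure ratios, and pick the setting where the constancy holds best. The array layout and the way the prediction is formed from the first aperture sample are assumptions for illustration, not the dissertation's actual pipeline.

```python
import numpy as np

def confocal_depth(I, ratios):
    """Per-pixel focus-setting index chosen by the confocal-constancy test.

    I      : array (F focus settings, A apertures, H, W) of aligned,
             linearized intensities for an aperture-focus sweep.
    ratios : array (A,) of calibrated relative exposures for an in-focus
             point, with ratios[0] == 1.
    Returns an (H, W) index map of the best focus setting per pixel.
    """
    # Predict in-focus intensities at every aperture from the first
    # aperture sample, then measure the deviation from that prediction.
    pred = I[:, :1] * ratios[None, :, None, None]
    err = ((I - pred) ** 2).sum(axis=1)   # constancy violation, per focus
    return err.argmin(axis=0)
```
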
417

Three-Dimensional Hand Tracking and Surface-Geometry Measurement for a Robot-Vision System

Liu, Chris Yu-Liang 17 January 2009
Tracking of human motion and object identification and recognition are important in many applications, including motion capture for human-machine interaction systems. This research is part of a larger project to enable a service robot to recognize new objects and perform different object-related tasks based on task guidance and demonstration provided by a general user. It consists of the calibration and testing of two vision systems that are part of a robot-vision system. First, real-time tracking of a human hand is achieved using images acquired from three calibrated, synchronized cameras. Hand pose is determined from the positions of physical markers and input to the robot system in real time. Second, a multi-line laser-camera range sensor is designed, calibrated, and mounted on a robot end-effector to provide three-dimensional (3D) geometry information about objects in the robot environment. The laser-camera sensor includes two cameras to provide stereo vision. For the 3D hand tracking, a novel score-based hand-tracking scheme is presented, employing dynamic multi-threshold marker detection, a stereo camera-pair utilization scheme, and marker matching and labeling using epipolar geometry and hand-pose axis analysis, to enable real-time hand tracking under occlusion and non-uniform lighting. For surface-geometry measurement using the multi-line laser range sensor, two approaches to two-dimensional (2D) to 3D coordinate mapping are analyzed, using Bezier surface fitting and neural networks, respectively. The neural-network approach was found to be the more viable one, worth future exploration for its lower 3D reconstruction error and its consistency across different regions of the object space.
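
The epipolar-geometry marker matching mentioned above can be sketched as a simple gating step: each marker detected in the first image induces an epipolar line in the second, and candidate markers are accepted by their point-to-line distance. The fundamental-matrix convention, greedy assignment, and tolerance below are illustrative; the thesis additionally uses scoring and hand-pose axis analysis.

```python
import numpy as np

def match_markers_epipolar(F, pts1, pts2, tol=2.0):
    """Greedily pair markers across a calibrated stereo pair.

    F          : 3x3 fundamental matrix with x2^T @ F @ x1 == 0.
    pts1, pts2 : (N, 2) and (M, 2) marker centroids in pixels.
    Returns a list of (i, j) index pairs within tol pixels of the line.
    """
    h2 = np.hstack([pts2, np.ones((len(pts2), 1))])   # homogeneous points
    matches, used = [], set()
    for i, p in enumerate(pts1):
        l = F @ np.array([p[0], p[1], 1.0])           # epipolar line in image 2
        d = np.abs(h2 @ l) / np.hypot(l[0], l[1])     # point-line distances
        for j in np.argsort(d):
            if d[j] > tol:
                break                                  # no candidate close enough
            if j not in used:
                used.add(j)
                matches.append((i, int(j)))
                break
    return matches
```
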
419

Strukturelle Ansätze für die Stereorekonstruktion / Structural approaches for stereo reconstruction

Shlezinger, Dmytro 15 August 2005
This dissertation studies the class of labeling problems, an important part of structural pattern recognition, in which the structure of the object to be recognized is explicitly taken into account. The developed theory is applied to the practical problem of stereo reconstruction.
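
As a concrete instance of a labeling problem applied to stereo, the sketch below solves the one-scanline special case exactly by dynamic programming: each pixel receives a disparity label, with a pairwise term penalizing label changes. This chain-structured toy stands in for, and is much simpler than, the structural methods developed in the dissertation.

```python
import numpy as np

def scanline_labeling(costs, smooth=1.0):
    """Exact MAP disparity labeling on one scanline (Viterbi on a chain).

    costs  : (W, D) unary matching costs, costs[x, d] for disparity d at
             pixel x; smooth * |d - d'| penalizes label changes.
    Returns the (W,) optimal disparity labels.
    """
    W, D = costs.shape
    labels = np.arange(D)
    pen = smooth * np.abs(labels[:, None] - labels[None, :])  # pairwise term
    M = costs[0].copy()                 # best cost so far, per label
    back = np.zeros((W, D), dtype=int)  # backpointers
    for x in range(1, W):
        tot = M[:, None] + pen          # tot[d_prev, d]
        back[x] = tot.argmin(axis=0)
        M = costs[x] + tot.min(axis=0)
    out = np.empty(W, dtype=int)
    out[-1] = M.argmin()
    for x in range(W - 1, 0, -1):       # trace the optimal path back
        out[x - 1] = back[x, out[x]]
    return out

rng = np.random.default_rng(0)
print(scanline_labeling(rng.random((8, 4)), smooth=0.2))
```
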
420

Étude des antineutrinos de réacteurs : mesure de l'angle de mélange leptonique θ₁₃ et recherche d'éventuels neutrinos stériles / Study of reactor antineutrinos: measurement of the leptonic mixing angle θ₁₃ and search for possible sterile neutrinos

Collin, Antoine 07 January 2014
The Double Chooz experiment aims at a precise measurement of the mixing angle θ₁₃. Its evaluation relies on the study of the disappearance, due to neutrino oscillation, of the antineutrinos produced by the reactors of the Chooz power plant. Two identical liquid-scintillator detectors allow a relative measurement, thereby reducing systematic uncertainties. The near detector, which provides the normalization of the emitted neutrino flux, is being installed, with completion expected in spring 2014. The far detector, sensitive to the effect of θ₁₃, is located about one kilometre away and has been taking data since 2011. In this first phase of the experiment, the data acquired by the far detector are compared with a prediction of the neutrino flux emitted by the reactors to estimate the parameter θ₁₃. This thesis presents the Double Chooz experiment and its analysis. Particular attention is paid to the study of backgrounds and to the rejection of spurious signals consisting of light flashes emitted by the photomultipliers. Neutron fluxes at the interfaces between the different detector volumes affect the definition of the interaction volume and hence the detection efficiency; a detailed study of these edge effects is presented. Within the Double Chooz experiment, studies were carried out to improve the prediction of the neutrino fluxes emitted by the reactors. This work revealed a deficit in the neutrino rates observed by past experiments at short distances from reactors, a deficit that could be explained by an oscillation towards a sterile flavour. The Stereo project aims to observe the distortion, characteristic of the oscillation, of the neutrino spectrum in energy and propagation distance. This thesis presents the detector concept, the simulations performed, and the sensitivity studies. Finally, the various backgrounds and the shielding envisaged to guard against them are discussed.
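
For context on the oscillation signal both experiments look for, the standard two-flavour survival probability can be evaluated directly. The parameter values below (sin²2θ₁₃ ≈ 0.09, Δm² ≈ 2.4×10⁻³ eV², L ≈ 1050 m) are typical published figures used here only for illustration, not results from this thesis.

```python
import numpy as np

def survival_probability(L_m, E_MeV, sin2_2theta, dm2_eV2):
    """Two-flavour antineutrino survival probability:

        P = 1 - sin^2(2*theta) * sin^2(1.267 * dm^2 * L / E)

    with L in metres, E in MeV, and dm^2 in eV^2.
    """
    return 1.0 - sin2_2theta * np.sin(1.267 * dm2_eV2 * L_m / E_MeV) ** 2

# Double-Chooz-like far detector: L ~ 1050 m, E ~ 4 MeV.
print(survival_probability(1050.0, 4.0, 0.09, 2.4e-3))   # ~ 0.95
```
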
