
Contributions to 3D Data Registration and Representation

Morell, Vicente 02 October 2014
Nowadays, the latest generation of computers provides enough performance to build computationally expensive computer vision applications for mobile robotics. Building a map of the environment is a common robotic task and an essential prerequisite for robots to move through their environments. Traditionally, mobile robots have used a combination of sensors based on different technologies: lasers, sonars, and contact sensors have typically appeared in mobile robotic architectures. Color cameras are nevertheless an important sensor, because we want robots to sense and move through environments using the same information that humans use. Color cameras are cheap and flexible, but a lot of work is needed to give robots enough visual understanding of the scenes they observe. Computer vision algorithms are computationally complex, but robots now have access to different powerful architectures that can be exploited for mobile robotics purposes. The advent of low-cost RGB-D sensors like the Microsoft Kinect, which provide colored 3D point clouds at high frame rates, has made computer vision even more relevant to mobile robotics. The combination of visual and 3D data allows systems to use both computer vision and 3D processing, and therefore to be aware of more details of the surrounding environment.

The research described in this thesis was motivated by the need for scene mapping. Being aware of the surrounding environment is a key feature in many mobile robotics applications, from simple robotic navigation to complex surveillance. In addition, acquiring a 3D model of a scene is useful in many areas, such as video game scene modeling, where well-known places are reconstructed and added to game systems, or advertising, where, once the 3D model of a room is available, the system can add furniture pieces using augmented reality techniques.

In this thesis we perform an experimental study of state-of-the-art registration methods to find which one best fits our scene mapping purposes. Different methods are tested and analyzed on scenes with different distributions of visual and geometric features. In addition, this thesis proposes two methods for the compression and representation of 3D maps. Our 3D representation proposal is based on the Growing Neural Gas (GNG) method. This kind of self-organizing map (SOM) has been successfully used for clustering, pattern recognition, and topology representation of various kinds of data. Until now, self-organizing maps have been computed primarily offline, and their application to 3D data has mainly focused on noise-free models without considering time constraints. Self-organizing neural models have the ability to provide a good representation of the input space; in particular, GNG is a suitable model because of its flexibility, rapid adaptation, and excellent quality of representation. However, this type of learning is time consuming, especially for high-dimensional input data. Since real applications often work under time constraints, it is necessary to adapt the learning process so that it completes within a predefined time. This thesis therefore proposes a hardware implementation that leverages the computing power of modern GPUs under the paradigm known as General-Purpose Computing on Graphics Processing Units (GPGPU). Our proposed geometric 3D compression method seeks to reduce the 3D information by using detected planes as the basic structure for compressing the data. This is because our target environments are man-made, so a large proportion of the points belong to planar surfaces. The proposed method achieves good compression results in such man-made scenarios, and the detected, compressed planes can also be used in other applications such as surface reconstruction or plane-based registration. Finally, we also demonstrate the benefits of GPU technologies by obtaining a high-performance implementation of Virtual Digitizing, a common CAD/CAM technique.
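The GNG representation described above can be made concrete with a minimal sketch of a single adaptation step on 3D input points. This is a generic, simplified GNG update (winner motion, neighbour motion, edge aging), not the thesis's GPU implementation; the parameter values and the omission of node insertion and error accumulation are assumptions made for brevity.

```python
import numpy as np

def gng_adapt_step(nodes, edges, ages, x, eps_b=0.2, eps_n=0.006, max_age=50):
    """One simplified GNG adaptation step for a single 3D sample x.

    nodes: (N, 3) array of node positions
    edges: set of frozenset({i, j}) node-index pairs
    ages:  dict mapping each edge to its age
    """
    # Find the two nodes nearest to the input sample.
    d = np.linalg.norm(nodes - x, axis=1)
    s1, s2 = np.argsort(d)[:2]

    # Move the winner toward x; age its edges and drag its neighbours along.
    nodes[s1] += eps_b * (x - nodes[s1])
    for e in list(edges):
        if s1 in e:
            ages[e] += 1
            n = next(iter(e - {s1}))          # the neighbour on this edge
            nodes[n] += eps_n * (x - nodes[n])

    # Create (or refresh) the edge between the two winners.
    e12 = frozenset((int(s1), int(s2)))
    edges.add(e12)
    ages[e12] = 0

    # Remove edges that exceeded the maximum age.
    for e in [e for e in edges if ages[e] > max_age]:
        edges.discard(e)
        del ages[e]
    return nodes, edges, ages
```

In the full algorithm, per-node error counters additionally trigger periodic insertion of new nodes; that learning loop is the part whose GPU acceleration the thesis addresses.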

High Speed, Micron Precision Scanning Technology for 3D Printing Applications

Emord, Nicholas 01 January 2018
Modern 3D printing technology is becoming a more viable option for use in industrial manufacturing. As the speed and precision of rapid prototyping technology improve, so too must 3D scanning and verification technology. Current 3D scanning technology (such as CT scanners) produces the resolution needed for micron-precision inspection, but it lacks speed: some scans are multiple gigabytes in size and take several minutes to acquire and process. Especially in high-volume manufacturing of 3D printed parts, such delays prohibit the widespread adoption of 3D scanning for quality control. The limiting factors of current technology boil down to computational and processing power, along with available sensor resolution and operational frequency. Realizing a 3D scanning system that produces micron-precision results within a single minute promises to revolutionize the quality control industry. The specific 3D scanning method considered in this thesis utilizes a line-profile triangulation sensor with a high operational frequency and a high-precision mechanical actuation apparatus for controlling the scan. By syncing the operational frequency of the sensor to the actuation velocity of the apparatus, a 3D point cloud is rapidly acquired. The data are then processed in MATLAB on contemporary computing hardware, including proper point cloud formatting and an implementation of the Iterative Closest Point (ICP) algorithm for point cloud stitching. Theoretical and physical experiments demonstrate the validity of the method. The prototyped system produces multiple loosely-registered micron-precision point clouds of a 3D printed object that are then stitched together to form a full point cloud representative of the original part. The prototype produces micron-precision results in approximately 130 seconds, and the experiments point to the additional investments by which this time could be further reduced to approach the revolutionizing one-minute milestone.
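The stitching stage can be sketched with a textbook point-to-point ICP iteration: find nearest-neighbour correspondences, solve for the best rigid transform via SVD (the Kabsch method), apply it, and repeat. This NumPy/SciPy sketch only illustrates the algorithm named in the abstract; the thesis's MATLAB implementation and its convergence criteria are not reproduced here.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_point_to_point(src, dst, iters=30):
    """Align src (N, 3) to dst (M, 3); returns the 4x4 rigid transform."""
    T = np.eye(4)
    cur = src.copy()
    tree = cKDTree(dst)
    for _ in range(iters):
        # Nearest-neighbour correspondences.
        _, idx = tree.query(cur)
        matched = dst[idx]
        # Best rigid transform for these correspondences (Kabsch / SVD).
        mu_s, mu_d = cur.mean(0), matched.mean(0)
        H = (cur - mu_s).T @ (matched - mu_d)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:          # guard against reflections
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_d - R @ mu_s
        cur = cur @ R.T + t
        step = np.eye(4); step[:3, :3] = R; step[:3, 3] = t
        T = step @ T
    return T
```

Each loosely-registered scan would then be merged into the growing reference cloud with a call like `T = icp_point_to_point(scan, reference)`.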

Accurate image matching based on multiple view geometry

Hsieh, Ming Lung (date unknown)
Recently, many researchers have worked on obtaining accurate point cloud data from multi-view images and using these data for 3D model reconstruction. However, the accuracy of 3D information derived from multi-view images still needs improvement. Among these efforts, the methods for extracting corresponding points and computing 3D point information are the most critical ones, since they determine how the point cloud is formed and how good it is. In this thesis, we propose new approaches, based on the geometric relationships among multi-view images, to improve the accuracy of corresponding points and 3D points. First, for multi-view correspondence extraction, mutual support transformation, dynamic Gaussian filtering, and a composite similarity evaluation function are used to improve patch-based matching; these mechanisms raise the discriminative power and reliability of the similarity measure and hence yield accurate corresponding points from multi-view images. Second, for 3D point reconstruction, we use the K-means algorithm and linear interpolation to discover better 3D point candidates, so that the computed 3D points lie closer to the surface of the actual object. Experimental results show that our approach substantially improves the accuracy of the corresponding points as well as the 3D point cloud data: the resulting point clouds contain tens of thousands of accurate 3D points with only a few outliers.
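The K-means refinement of 3D point candidates mentioned above can be sketched as follows: triangulated candidates from different view pairs that describe the same surface point are clustered, and each cluster is replaced by its centroid. The cluster count and the plain-NumPy K-means loop are illustrative assumptions, not the thesis's exact procedure.

```python
import numpy as np

def refine_candidates(cands, k=8, iters=20, seed=0):
    """Cluster candidate 3D points (M, 3) with K-means and return the
    k cluster centroids as the refined 3D points."""
    rng = np.random.default_rng(seed)
    centers = cands[rng.choice(len(cands), k, replace=False)]
    for _ in range(iters):
        # Assign each candidate to its nearest centroid.
        labels = np.argmin(
            np.linalg.norm(cands[:, None, :] - centers[None, :, :], axis=2),
            axis=1)
        # Recompute centroids (keep the old one if a cluster went empty).
        for j in range(k):
            if np.any(labels == j):
                centers[j] = cands[labels == j].mean(axis=0)
    return centers
```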

Examination of airborne discrete-return lidar in prediction and identification of unique forest attributes

Wing, Brian M. 08 June 2012
Airborne discrete-return lidar is an active remote sensing technology capable of obtaining accurate, fine-resolution three-dimensional measurements over large areas. Discrete-return lidar data produce three-dimensional object characterizations in the form of point clouds defined by precise x, y and z coordinates. The data also provide intensity values for each point that help quantify the reflectance and surface properties of intersected objects. These features have proven useful for characterizing many important forest attributes, such as standing tree biomass, height, density, and canopy cover, and new applications for the data are currently accelerating. This dissertation explores three new applications for airborne discrete-return lidar data.

The first application uses lidar-derived metrics to predict understory vegetation cover, which has been difficult to predict using traditional explanatory variables. A new airborne lidar-derived metric, understory lidar cover density, created by filtering understory lidar points using intensity values, increased the coefficient of determination (R²) of understory vegetation cover estimation models from 0.2-0.45 (without lidar) to 0.7-0.8. The method presented in this chapter makes it possible to accurately quantify understory vegetation cover (± 22%) at fine spatial resolutions over entire landscapes within the interior ponderosa pine forest type.

In the second application, a new method for quantifying and locating snags using airborne discrete-return lidar is presented. The importance of snags in forest ecosystems and the inherent difficulties associated with their quantification have been well documented. A new semi-automated method using both 2D and 3D local-area lidar point filters, based on individual point spatial location and intensity information, identifies points associated with snags and eliminates points associated with live trees. The end result is a stem map of individual snags across the landscape with a height estimate for each snag. The overall detection rate for snags with DBH ≥ 38 cm was 70.6% (standard error: ± 2.7%), with low commission error rates. This information can be used to analyze the spatial distribution of snags over entire landscapes, better understand wildlife snag-use dynamics, create accurate snag density estimates, and assess the achievement and usefulness of snag stocking standard requirements.

In the third application, live above-ground biomass prediction models are created using three separate sets of lidar-derived metrics, and the models are compared using both model selection statistics and cross-validation. The three sets were: 1) a 'traditional' set created using the entire plot point cloud, 2) a 'live-tree' set created using a plot point cloud with points associated with dead trees removed, and 3) a 'vegetation-intensity' set created using a plot point cloud containing only points meeting predetermined intensity criteria. The models using the live-tree metrics produced the best results, reducing prediction variability by 4.3% relative to the traditional set in plots containing filtered dead-tree points.

The methods developed for all three applications show promise for predicting or identifying unique forest attributes, improving our ability to quantify and characterize understory vegetation cover, snags, and live above-ground biomass. This information can inform forest management decisions and improve our understanding of forest ecosystem dynamics. Intensity information was useful for filtering point clouds and identifying lidar points associated with unique forest attributes (e.g., understory components, live and dead trees). These intensity filtering methods provide an enhanced framework for analyzing airborne lidar data in forest ecosystem applications. / Graduation date: 2013
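As a rough illustration of the intensity-filtering idea behind the understory lidar cover density metric, the sketch below keeps returns in an understory height stratum whose intensity falls in a vegetation-like band and reports their proportion. The stratum limits and intensity band are placeholders, not the values used in the dissertation.

```python
import numpy as np

def understory_cover_density(hag, intensity, h_lo=0.5, h_hi=4.0,
                             i_lo=60, i_hi=200):
    """Proportion of understory-stratum lidar returns whose intensity falls
    in a vegetation-like band -- a stand-in for the 'understory lidar cover
    density' metric (stratum limits and intensity band are illustrative).

    hag:       (N,) height above ground of each return, in meters
    intensity: (N,) return intensity values
    """
    stratum = (hag >= h_lo) & (hag <= h_hi)              # understory layer
    veg = stratum & (intensity >= i_lo) & (intensity <= i_hi)
    return veg.sum() / max(stratum.sum(), 1)
```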

Construction of digital surface models using near-IR close-range photogrammetry

Liao, Chen Ting (date unknown)
A point cloud describes the surface as a mass of 3D coordinates and attributes. These data are usually collected by LiDAR (Light Detection And Ranging), which acquires data by single-band laser scanning. Point clouds collected by LiDAR, however, face several problems: the scanning process is not instantaneous, and the data lack multispectral information, reliable breaklines and corner points, semantic information, and redundant observations. Photogrammetry, by contrast, records the electromagnetic energy reflected or emitted from the surface as 2D multispectral images; because ground features differ in their spectral characteristics, the data can be classified more efficiently and precisely. By matching multiple highly overlapping multispectral images, a point cloud that includes multispectral information and lies on distinct feature points can be acquired, providing a point cloud source apart from LiDAR. Most studies use visible-light (VIS) images when calculating ground point coordinates and generating digital surface models (DSMs) through aerial triangulation. Although today's high-spatial-resolution digital aerial cameras acquire not only the visible channels but a near-infrared (NIR) channel as well, little research has used NIR images for these procedures. This research therefore exploits the rich spectral information in multispectral images through a close-range photogrammetry approach with simple image collection and few restrictions. It matches several VIS, NIR, and color-infrared (CIR) images and generates the corresponding DSMs. The purpose is to analyze the differences between the VIS, NIR, and CIR results and whether adding the NIR channel to DSM generation better emphasizes vegetated areas. For evaluation, a relative check was performed: the VIS point cloud was taken as check data, and the RMSE (root mean square error) of the NIR and CIR point clouds against it was used as a distance threshold, yielding a point cloud increase rate of about 21%. However, even though matching NIR and CIR images can increase the amount of point cloud data, their effect in adding data that could not be matched from VIS images remains limited.
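The relative check described above might look like the following sketch: VIS points serve as check data, the RMSE of NIR-to-VIS nearest-neighbour distances becomes the distance threshold, and NIR points farther than that threshold from every VIS point count as added data. The exact normalization behind the reported ~21% increase rate is an assumption here.

```python
import numpy as np
from scipy.spatial import cKDTree

def point_increment_rate(vis, nir):
    """Fraction of 'new' NIR points relative to the VIS cloud, using the
    RMSE of NIR->VIS nearest-neighbour distances as the distance threshold.

    vis: (N, 3) visible-light point cloud (check data)
    nir: (M, 3) near-infrared point cloud
    """
    d, _ = cKDTree(vis).query(nir)        # distance to nearest VIS point
    thresh = np.sqrt(np.mean(d ** 2))     # RMSE as the distance threshold
    new_points = np.count_nonzero(d > thresh)
    return new_points / len(vis)
```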

Design and Calibration of a Network of RGB-D Sensors for Robotic Applications over Large Workspaces

Macknojia, Rizwan 21 March 2013
This thesis presents an approach for configuring and calibrating a network of RGB-D sensors used to guide a robotic arm to interact with objects that get rapidly modeled in 3D. The system is based on Microsoft Kinect sensors for 3D data acquisition. The work also details an analysis and experimental study of the Kinect's depth sensor capabilities and performance, comprising an examination of the resolution, quantization error, and random distribution of depth data, as well as the effects of the color and reflectance characteristics of an object. The study examines two versions of the Kinect sensor: the one designed to operate with the Xbox 360 video game console and the more recent Microsoft Kinect for Windows. The study of the Kinect sensor is extended to the design of a rapid acquisition system dedicated to large workspaces, in which multiple Kinect units are linked to collect 3D data over a large object such as an automotive vehicle. A customized calibration method for this large workspace is proposed which takes advantage of the rapid 3D measurement technology embedded in the Kinect sensor and provides registration accuracy between local sections of point clouds within the range of the depth measurement accuracy permitted by the Kinect technology. The method calibrates all Kinect units with respect to a reference Kinect. The internal calibration of each sensor between its color and depth measurements is also performed to optimize the alignment between the modalities. The calibration of the 3D vision system is further extended to formally estimate its configuration with respect to the base of a manipulator robot, allowing for seamless integration between the proposed vision platform and the kinematic control of the robot. The resulting vision-robotic system thus provides a comprehensive calibration of the reference Kinect with respect to the robot, which can then be used to interact under visual guidance with large objects, such as vehicles, positioned within the significantly enlarged field of view created by the network of RGB-D sensors. The proposed design and calibration method is validated in a real-world scenario in which five Kinect sensors operate collaboratively to rapidly and accurately reconstruct 180-degree coverage of the surface shape of various types of vehicles from a set of individual acquisitions performed in a semi-controlled environment, namely an underground parking garage. The vehicle geometrical properties generated from the acquired 3D data are compared with the original dimensions of the vehicles.
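The calibration chain described above composes rigid transforms: each Kinect is expressed relative to the reference Kinect, and the reference Kinect relative to the robot base. A minimal sketch with 4×4 homogeneous matrices follows; the variable names are illustrative, not taken from the thesis.

```python
import numpy as np

def to_robot_frame(p_k, T_ref_from_k, T_robot_from_ref):
    """Map a 3D point measured by Kinect k into the robot-base frame by
    composing the network calibration (Kinect k -> reference Kinect) with
    the reference-to-robot calibration. Both transforms are 4x4 homogeneous."""
    p_h = np.append(p_k, 1.0)                      # homogeneous coordinates
    return (T_robot_from_ref @ T_ref_from_k @ p_h)[:3]
```

Calibrating every unit against a single reference keeps the error chain short: any local point cloud section passes through only two estimated transforms before reaching the robot frame.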

DEM generation with full-waveform LiDAR data in vegetated areas

Liao, Sui Jui (date unknown)
In vegetation-covered mountain areas, airborne laser scanning is an appropriate data source for producing digital elevation models (DEMs), because its pulses can penetrate gaps in the canopy and thus have a relatively high chance of reaching the ground beneath the vegetation. Conventional point cloud filtering relies only on the 3D geometric relationships between points, whereas full-waveform airborne laser scanning additionally provides waveform attributes for each point, such as echo width, amplitude, backscatter cross-section, and backscatter cross-section coefficient. This research uses these waveform attributes for point cloud filtering. After an initial lowest-point sampling, Bayes' theorem is applied to automatically derive the characteristic value ranges of the waveform attributes of ground points; the ranges obtained from the amplitude, backscatter cross-section, and backscatter cross-section coefficient drive a first, waveform-based filtering stage, after which a second, conventional geometric filtering stage removes the remaining non-ground points. The resulting DEM is compared against photogrammetric results and against filtering with geometry alone. The results show that single-echo waveform attributes differ markedly between vegetation types while last echoes are similar, and that single and last echoes respond differently under the same vegetation cover. The final DEM is broadly comparable to the one obtained with geometric filtering alone and slightly worse in some vegetated regions, but the waveform-based stage greatly reduces the number of points the geometric filter must process, improving the overall filtering efficiency. Compared with the photogrammetric DEM, the result follows the actual terrain relief in vegetated areas more closely and is therefore more accurate.
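The automatic derivation of ground-point attribute ranges can be sketched with histograms and Bayes' rule: given the amplitudes of the lowest-point sample (treated as ground evidence) and of all points, the probability that a point in a given amplitude bin is ground is estimated, and bins above a probability threshold define the filtering interval. The bin count and threshold below are assumptions, not the thesis's settings.

```python
import numpy as np

def ground_amplitude_interval(amp_ground, amp_all, bins=64, p_min=0.5):
    """Estimate the amplitude interval of ground points. amp_ground are
    amplitudes of the lowest-point sample, amp_all those of the full cloud;
    since the sample is a subset of the cloud, the per-bin posterior is
    approximately P(ground | amplitude bin) = count_ground / count_all."""
    lo, hi = float(amp_all.min()), float(amp_all.max())
    h_g, edges = np.histogram(amp_ground, bins=bins, range=(lo, hi))
    h_a, _ = np.histogram(amp_all, bins=bins, range=(lo, hi))
    post = np.where(h_a > 0, h_g / np.maximum(h_a, 1), 0.0)
    ok = post >= p_min
    if not ok.any():                 # fall back to the full range
        return lo, hi
    return edges[:-1][ok].min(), edges[1:][ok].max()
```

Repeating the same interval test for the backscatter cross-section and its coefficient gives the first-stage filter; only points passing all intervals proceed to geometric filtering.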

Room layout estimation on mobile devices

Angladon, Vincent 27 April 2018
Room layout generation is the problem of producing a drawing or a digital model of an existing room from a set of measurements such as laser data or images. The generation of floor plans can find application in the building industry, to assess the quality and correctness of an ongoing construction with respect to the initial model, or to quickly sketch the renovation of an apartment. The real estate industry can rely on automatic generation of floor plans to ease the process of checking the livable surface and to propose virtual visits to prospective customers. As for the general public, the room layout can be integrated into mixed reality games to provide a more immersive experience, or used in related augmented reality applications such as room redecoration.

The goal of this industrial thesis (CIFRE) is to investigate and take advantage of state-of-the-art mobile devices in order to automate the process of generating room layouts. Nowadays, modern mobile devices usually come with a wide range of sensors, such as an inertial measurement unit (IMU), RGB cameras and, more recently, depth cameras. Moreover, tactile touchscreens offer a natural and simple way to interact with the user, favoring the development of interactive applications in which the user is part of the processing loop. This work aims at exploiting the richness of such devices to address the room layout generation problem.

The thesis has three major contributions. We first show how the classic problem of detecting vanishing points in an image can benefit from a prior given by the IMU sensor: we propose a simple and effective algorithm for detecting vanishing points relying on the gravity vector estimated by the IMU, and we introduce a new public dataset containing images with the relevant IMU data to help assess vanishing point algorithms and foster further studies in the field. As a second contribution, we explore the state of the art of real-time localization and map optimization algorithms for RGB-D sensors. Real-time localization is a fundamental task for augmented reality applications, and thus a critical component when designing interactive applications; we evaluate existing algorithms, developed largely for desktop set-ups, for use on a mobile device, assessing for each method the localization accuracy and the computational performance when ported to the device. Finally, we present a proof-of-concept application able to generate the room layout relying on a Project Tango tablet equipped with an RGB-D sensor. In particular, we propose an algorithm that incrementally processes and fuses the 3D data provided by the sensor in order to obtain the layout of the room, and we show how it can rely on user interactions to correct the generated 3D model during the acquisition process.
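The gravity prior for vanishing point detection is easy to make concrete: the vanishing point of the world-vertical direction is the projection of the gravity direction through the camera intrinsics. A minimal sketch, assuming an undistorted pinhole camera and the IMU gravity vector already rotated into the camera frame:

```python
import numpy as np

def vertical_vanishing_point(K, g_cam):
    """Pixel coordinates of the vertical vanishing point: project the unit
    gravity direction g_cam (camera frame, from the IMU) with intrinsics K."""
    v = K @ g_cam            # homogeneous image point of the 3D direction
    return v[:2] / v[2]      # degenerates if the optical axis is horizontal
```

Image line segments whose extensions pass near this point can then be labelled vertical, which constrains the search for the remaining horizontal vanishing points.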

3D Semantic SLAM of Indoor Environments with a Single Depth Sensor

Ghorpade, Vijaya Kumar 20 December 2017
Intelligent autonomous action in an ordinary environment by a mobile robot requires maps. A map holds the spatial information about the environment and gives the 3D geometry of the robot's surroundings, used not only to avoid collisions with complex obstacles but also for self-localization and task planning. In the future, however, service and personal robots will prevail, and the robot will need to interact with the environment in addition to localizing and navigating in it. This interaction demands that next-generation robots understand and interpret their environment and perform tasks in a human-centric manner; a simple map of the environment is far from sufficient for robots to co-exist with and assist humans. Human beings effortlessly build maps and interact with their environment, yet for robots these seemingly frivolous tasks are complex conundrums. Layering semantic information on regular geometric maps is the leap that turns an ordinary mobile robot into a more intelligent autonomous system. A semantic map augments a general map with information about entities, i.e., objects, functionalities, or events, that are located in the space; including semantics enhances the robot's spatial knowledge representation and improves its performance in managing complex tasks and human interaction. Many approaches have been proposed to address the semantic SLAM problem with laser scanners and RGB-D time-of-flight sensors, but the field is still in its nascent phase.

In this thesis, an endeavour to solve semantic SLAM using a time-of-flight sensor that delivers only depth information is proposed. Time-of-flight cameras have dramatically changed the field of range imaging and surpassed traditional scanners in terms of rapid data acquisition, simplicity, and price, and such depth sensors are expected to be ubiquitous in future robotic applications. After a brief motivation for adding semantics to ordinary maps in the first chapter, the state-of-the-art methods are discussed in the second. Before the camera was used for data acquisition, its noise characteristics were studied meticulously and it was properly calibrated; the novel noise filtering algorithm developed in the process helps obtain clean data for better scan matching and SLAM. The quality of the SLAM process is evaluated using a context-based similarity score metric specifically designed for the acquisition parameters and data used. Abstracting a semantic layer on the point cloud reconstructed by SLAM is done in two stages. In large-scale, higher-level semantic interpretation, the prominent planar surfaces in the indoor environment, such as walls, doors, ceiling, and clutter, are extracted using the Hough transform and classified. In single-scene, object-level semantic interpretation, a single 2.5D scene from the camera is parsed and its objects and surfaces are recognized; object recognition is achieved using a novel shape signature based on the probability distribution of the most stable and repeatable 3D keypoints. The classification of prominent surfaces and the single-scene semantic interpretation are performed with supervised machine learning and deep learning systems. To this end, the object dataset and SLAM data have also been made publicly available for academic research.
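The plane-extraction stage can be sketched as a Hough voting scheme over plane parameters: each point votes for the (θ, φ, ρ) cells of the planes passing through it, where n = (sin θ cos φ, sin θ sin φ, cos θ) is the plane normal and ρ = n·x the plane's distance from the origin. The grid resolutions are arbitrary, and this brute-force accumulator is a sketch of the technique named above, not the thesis's implementation.

```python
import numpy as np

def hough_planes(pts, n_theta=18, n_phi=36, n_rho=100, top=5):
    """Detect dominant planes in pts (N, 3) with a Hough accumulator over
    (theta, phi, rho); returns the `top` (unit normal, distance) pairs."""
    t = np.linspace(0, np.pi, n_theta)
    p = np.linspace(0, 2 * np.pi, n_phi, endpoint=False)
    tt, pp = np.meshgrid(t, p, indexing="ij")
    normals = np.stack([np.sin(tt) * np.cos(pp),
                        np.sin(tt) * np.sin(pp),
                        np.cos(tt)], axis=-1).reshape(-1, 3)   # (T*P, 3)
    rho = pts @ normals.T                                      # (N, T*P)
    r_lo, r_hi = rho.min(), rho.max()
    bins = np.clip(((rho - r_lo) / (r_hi - r_lo + 1e-9) * n_rho).astype(int),
                   0, n_rho - 1)
    acc = np.zeros((normals.shape[0], n_rho), dtype=int)
    for j in range(normals.shape[0]):                          # accumulate votes
        acc[j] += np.bincount(bins[:, j], minlength=n_rho)
    best = np.argsort(acc.ravel())[::-1][:top]
    return [(normals[i // n_rho],
             r_lo + (i % n_rho + 0.5) * (r_hi - r_lo) / n_rho)
            for i in best]
```

In practice the top accumulator cells are further pruned (e.g., by non-maximum suppression and inlier counts) before the surviving planes are classified as wall, door, or ceiling.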
