Global ETD Search

451	Robust visual detection and tracking of complex objects : applications to space autonomous rendez-vous and proximity operations Petit, Antoine 19 December 2013 (has links) (PDF) In this thesis, we address the issue of fully localizing a known object through computer vision, using a monocular camera, what is a central problem in robotics. A particular attention is here paid on space robotics applications, with the aims of providing a unified visual localization system for autonomous navigation purposes for space rendezvous and proximity operations. Two main challenges of the problem are tackled: initially detecting the targeted object and then tracking it frame-by-frame, providing the complete pose between the camera and the object, knowing the 3D CAD model of the object. For detection, the pose estimation process is based on the segmentation of the moving object and on an efficient probabilistic edge-based matching and alignment procedure of a set of synthetic views of the object with a sequence of initial images. For the tracking phase, pose estimation is handled through a 3D model-based tracking algorithm, for which we propose three different types of visual features, pertinently representing the object with its edges, its silhouette and with a set of interest points. The reliability of the localization process is evaluated by propagating the uncertainty from the errors of the visual features. This uncertainty besides feeds a linear Kalman filter on the camera velocity parameters. Qualitative and quantitative experiments have been performed on various synthetic and real data, with challenging imaging conditions, showing the efficiency and the benefits of the different contributions, and their compliance with space rendezvous applications. [INFO:INFO_RB] Computer Science/Robotics [INFO:INFO_RB] Informatique/Robotique Visual tracking Object detection Moving object segmentation Space robotics
452	An intuitive motion-based input model for mobile devices Richards, Mark Andrew January 2006 (has links) Traditional methods of input on mobile devices are cumbersome and difficult to use. Devices have become smaller, while their operating systems have become more complex, to the extent that they are approaching the level of functionality found on desktop computer operating systems. The buttons and toggle-sticks currently employed by mobile devices are a relatively poor replacement for the keyboard and mouse style user interfaces used on their desktop computer counterparts. For example, when looking at a screen image on a device, we should be able to move the device to the left to indicate we wish the image to be panned in the same direction. This research investigates a new input model based on the natural hand motions and reactions of users. The model developed by this work uses the generic embedded video cameras available on almost all current-generation mobile devices to determine how the device is being moved and maps this movement to an appropriate action. Surveys using mobile devices were undertaken to determine both the appropriateness and efficacy of such a model as well as to collect the foundational data with which to build the model. Direct mappings between motions and inputs were achieved by analysing users' motions and reactions in response to different tasks. Upon the framework being completed, a proof of concept was created upon the Windows Mobile Platform. This proof of concept leverages both DirectShow and Direct3D to track objects in the video stream, maps these objects to a three-dimensional plane, and determines device movements from this data. This input model holds the promise of being a simpler and more intuitive method for users to interact with their mobile devices, and has the added advantage that no hardware additions or modifications are required the existing mobile devices. Input Model HumanComputer Interface Mobile Device User Interface Input Devices Interaction Styles Automated Survey DirectX Mobile DirectShow Windows Mobile Computer Vision Image Processing Edge Detection Object Detection Motion Tracking Scene Analysis Human Movement ARToolkit Augmented Reality
453	Novel image processing algorithms and methods for improving their robustness and operational performance Romanenko, Ilya January 2014 (has links) Image processing algorithms have developed rapidly in recent years. Imaging functions are becoming more common in electronic devices, demanding better image quality, and more robust image capture in challenging conditions. Increasingly more complicated algorithms are being developed in order to achieve better signal to noise characteristics, more accurate colours, and wider dynamic range, in order to approach the human visual system performance levels.
454	Robot Goalkeeper : A robotic goalkeeper based on machine vision and motor control Adeboye, Taiyelolu January 2018 (has links) This report shows a robust and efficient implementation of a speed-optimized algorithm for object recognition, 3D real world location and tracking in real time. It details a design that was focused on detecting and following objects in flight as applied to a football in motion. An overall goal of the design was to develop a system capable of recognizing an object and its present and near future location while also actuating a robotic arm in response to the motion of the ball in flight. The implementation made use of image processing functions in C++, NVIDIA Jetson TX1, Sterolabs’ ZED stereoscopic camera setup in connection to an embedded system controller for the robot arm. The image processing was done with a textured background and the 3D location coordinates were applied to the correction of a Kalman filter model that was used for estimating and predicting the ball location. A capture and processing speed of 59.4 frames per second was obtained with good accuracy in depth detection while the ball was well tracked in the tests carried out. Object detection 3D reconstruction Object tracking Robot goalkeeper C++ OpenCV CUDA MATLAB Image processing Machine vision Linux OS GPU NVIDIA TX1 Stereolabs ZED Robotics Robotteknik och automation Embedded Systems Inbäddad systemteknik
455	Efficient multi-class objet detection with a hierarchy of classes / Détection efficace des objets multi-classes avec une hiérarchie des classes Odabai Fard, Seyed Hamidreza 20 November 2015 (has links) Dans cet article, nous présentons une nouvelle approche de détection multi-classes basée sur un parcours hiérarchique de classifieurs appris simultanément. Pour plus de robustesse et de rapidité, nous proposons d’utiliser un arbre de classes d’objets. Notre modèle de détection est appris en combinant les contraintes de tri et de classification dans un seul problème d’optimisation. Notre formulation convexe permet d’utiliser un algorithme de recherche pour accélérer le temps d’exécution. Nous avons mené des évaluations de notre algorithme sur les benchmarks PASCAL VOC (2007 et 2010). Comparé à l’approche un-contre-tous, notre méthode améliore les performances pour 20 classes et gagne 10x en vitesse. / Recent years have witnessed a competition in autonomous navigation for vehicles boosted by the advances in computer vision. The on-board cameras are capable of understanding the semantic content of the environment. A core component of this system is to localize and classify objects in urban scenes. There is a need to have multi-class object detection systems. Designing such an efficient system is a challenging and active research area. The algorithms can be found for applications in autonomous driving, object searches in images or video surveillance. The scale of object classes varies depending on the tasks. The datasets for object detection started with containing one class only e.g. the popular INRIA Person dataset. Nowadays, we witness an expansion of the datasets consisting of more training data or number of object classes. This thesis proposes a solution to efficiently learn a multi-class object detector. The task of such a system is to localize all instances of target object classes in an input image. We distinguish between three major efficiency criteria. First, the detection performance measures the accuracy of detection. Second, we strive low execution times during run-time. Third, we address the scalability of our novel detection framework. The two previous criteria should scale suitably with the number of input classes and the training algorithm has to take a reasonable amount of time when learning with these larger datasets. Although single-class object detection has seen a considerable improvement over the years, it still remains a challenge to create algorithms that work well with any number of classes. Most works on this subject extent these single-class detectors to work accordingly with multiple classes but remain hardly flexible to new object descriptors. Moreover, they do not consider all these three criteria at the same time. Others use a more traditional approach by iteratively executing a single-class detector for each target class which scales linearly in training time and run-time. To tackle the challenges, we present a novel framework where for an input patch during detection the closest class is ranked highest. Background labels are rejected as negative samples. The detection goal is to find the highest scoring class. To this end, we derive a convex problem formulation that combines ranking and classification constraints. The accuracy of the system is improved by hierarchically arranging the classes into a tree of classifiers. The leaf nodes represent the individual classes and the intermediate nodes called super-classes group recursively these classes together. The super-classes benefit from the shared knowledge of their descending classes. All these classifiers are learned in a joint optimization problem along with the previouslymentioned constraints. The increased number of classifiers are prohibitive to rapid execution times. The formulation of the detection goal naturally allows to use an adapted tree traversal algorithm to progressively search for the best class but reject early in the detection process the background samples and consequently reduce the system’s run-time. Our system balances between detection performance and speed-up. We further experimented with feature reduction to decrease the overhead of applying the high-level classifiers in the tree. The framework is transparent to the used object descriptor where we implemented the histogram of orientated gradients and deformable part model both introduced in [Felzenszwalb et al., 2010a]. The capabilities of our system are demonstrated on two challenging datasets containing different object categories not necessarily semantically related. We evaluate both the detection performance with different number of classes and the scalability with respect to run-time. Our experiments show that this framework fulfills the requirements of a multi-class object detector and highlights the advantages of structuring class-level knowledge. Détection multi-classes d’objets Classification hiérarchique Inférence rapide Arbre de classifieurs Parcours d’arbre Apprentissage hiérarchique SVM structuré Multi-class object detection Hierarchical classification Rapid inference Tree of classifiers Tree traversal Hierarchical learning Structured SVM
456	3D Semantic SLAM of Indoor Environment with Single Depth Sensor / SLAM sémantique 3D de l'environnement intérieur avec capteur de profondeur simple Ghorpade, Vijaya Kumar 20 December 2017 (has links) Pour agir de manière autonome et intelligente dans un environnement, un robot mobile doit disposer de cartes. Une carte contient les informations spatiales sur l’environnement. La géométrie 3D ainsi connue par le robot est utilisée non seulement pour éviter la collision avec des obstacles, mais aussi pour se localiser et pour planifier des déplacements. Les robots de prochaine génération ont besoin de davantage de capacités que de simples cartographies et d’une localisation pour coexister avec nous. La quintessence du robot humanoïde de service devra disposer de la capacité de voir comme les humains, de reconnaître, classer, interpréter la scène et exécuter les tâches de manière quasi-anthropomorphique. Par conséquent, augmenter les caractéristiques des cartes du robot à l’aide d’attributs sémiologiques à la façon des humains, afin de préciser les types de pièces, d’objets et leur aménagement spatial, est considéré comme un plus pour la robotique d’industrie et de services à venir. Une carte sémantique enrichit une carte générale avec les informations sur les entités, les fonctionnalités ou les événements qui sont situés dans l’espace. Quelques approches ont été proposées pour résoudre le problème de la cartographie sémantique en exploitant des scanners lasers ou des capteurs de temps de vol RGB-D, mais ce sujet est encore dans sa phase naissante. Dans cette thèse, une tentative de reconstruction sémantisée d’environnement d’intérieur en utilisant une caméra temps de vol qui ne délivre que des informations de profondeur est proposée. Les caméras temps de vol ont modifié le domaine de l’imagerie tridimensionnelle discrète. Elles ont dépassé les scanners traditionnels en termes de rapidité d’acquisition des données, de simplicité fonctionnement et de prix. Ces capteurs de profondeur sont destinés à occuper plus d’importance dans les futures applications robotiques. Après un bref aperçu des approches les plus récentes pour résoudre le sujet de la cartographie sémantique, en particulier en environnement intérieur. Ensuite, la calibration de la caméra a été étudiée ainsi que la nature de ses bruits. La suppression du bruit dans les données issues du capteur est menée. L’acquisition d’une collection d’images de points 3D en environnement intérieur a été réalisée. La séquence d’images ainsi acquise a alimenté un algorithme de SLAM pour reconstruire l’environnement visité. La performance du système SLAM est évaluée à partir des poses estimées en utilisant une nouvelle métrique qui est basée sur la prise en compte du contexte. L’extraction des surfaces planes est réalisée sur la carte reconstruite à partir des nuages de points en utilisant la transformation de Hough. Une interprétation sémantique de l’environnement reconstruit est réalisée. L’annotation de la scène avec informations sémantiques se déroule sur deux niveaux : l’un effectue la détection de grandes surfaces planes et procède ensuite en les classant en tant que porte, mur ou plafond; l’autre niveau de sémantisation opère au niveau des objets et traite de la reconnaissance des objets dans une scène donnée. A partir de l’élaboration d’une signature de forme invariante à la pose et en passant par une phase d’apprentissage exploitant cette signature, une interprétation de la scène contenant des objets connus et inconnus, en présence ou non d’occultations, est obtenue. Les jeux de données ont été mis à la disposition du public de la recherche universitaire. / Intelligent autonomous actions in an ordinary environment by a mobile robot require maps. A map holds the spatial information about the environment and gives the 3D geometry of the surrounding of the robot to not only avoid collision with complex obstacles, but also selflocalization and for task planning. However, in the future, service and personal robots will prevail and need arises for the robot to interact with the environment in addition to localize and navigate. This interaction demands the next generation robots to understand, interpret its environment and perform tasks in human-centric form. A simple map of the environment is far from being sufficient for the robots to co-exist and assist humans in the future. Human beings effortlessly make map and interact with environment, and it is trivial task for them. However, for robots these frivolous tasks are complex conundrums. Layering the semantic information on regular geometric maps is the leap that helps an ordinary mobile robot to be a more intelligent autonomous system. A semantic map augments a general map with the information about entities, i.e., objects, functionalities, or events, that are located in the space. The inclusion of semantics in the map enhances the robot’s spatial knowledge representation and improves its performance in managing complex tasks and human interaction. Many approaches have been proposed to address the semantic SLAM problem with laser scanners and RGB-D time-of-flight sensors, but it is still in its nascent phase. In this thesis, an endeavour to solve semantic SLAM using one of the time-of-flight sensors which gives only depth information is proposed. Time-of-flight cameras have dramatically changed the field of range imaging, and surpassed the traditional scanners in terms of rapid acquisition of data, simplicity and price. And it is believed that these depth sensors will be ubiquitous in future robotic applications. In this thesis, an endeavour to solve semantic SLAM using one of the time-of-flight sensors which gives only depth information is proposed. Starting with a brief motivation in the first chapter for semantic stance in normal maps, the state-of-the-art methods are discussed in the second chapter. Before using the camera for data acquisition, the noise characteristics of it has been studied meticulously, and properly calibrated. The novel noise filtering algorithm developed in the process, helps to get clean data for better scan matching and SLAM. The quality of the SLAM process is evaluated using a context-based similarity score metric, which has been specifically designed for the type of acquisition parameters and the data which have been used. Abstracting semantic layer on the reconstructed point cloud from SLAM has been done in two stages. In large-scale higher-level semantic interpretation, the prominent surfaces in the indoor environment are extracted and recognized, they include surfaces like walls, door, ceiling, clutter. However, in indoor single scene object-level semantic interpretation, a single 2.5D scene from the camera is parsed and the objects, surfaces are recognized. The object recognition is achieved using a novel shape signature based on probability distribution of 3D keypoints that are most stable and repeatable. The classification of prominent surfaces and single scene semantic interpretation is done using supervised machine learning and deep learning systems. To this end, the object dataset and SLAM data are also made publicly available for academic research. Caméras de temps de vol Nuages de points 3D Filtrage Recalage SLAM Détection de plans Segmentation Reconnaissance Détection Classification d’objets Apprentissage automatique Time-of-flight cameras 3D point cloud processing Noise filters Registration SLAM Plane detection Segmentation Object recognition Object detection Object classification Machine learning
457	[en] POPULATION DISTRIBUTION MAPPING THROUGH THE DETECTION OF BUILDING AREAS IN GOOGLE EARTH IMAGES OF HETEROGENEOUS REGIONS USING DEEP LEARNING / [pt] MAPEAMENTO DA DISTRIBUIÇÃO POPULACIONAL ATRAVÉS DA DETECÇÃO DE ÁREAS EDIFICADAS EM IMAGENS DE REGIÕES HETEROGÊNEAS DO GOOGLE EARTH USANDO DEEP LEARNING CASSIO FREITAS PEREIRA DE ALMEIDA 08 February 2018 (has links) [pt] Informações precisas sobre a distribuição da população são reconhecidamente importantes. A fonte de informação mais completa sobre a população é o censo, cujos os dados são disponibilizados de forma agregada em setores censitários. Esses setores são unidades operacionais de tamanho e formas irregulares, que dificulta a análise espacial dos dados associados. Assim, a mudança de setores censitários para um conjunto de células regulares com estimativas adequadas facilitaria a análise. Uma metodologia a ser utilizada para essa mudança poderia ser baseada na classificação de imagens de sensoriamento remoto para a identificação de domicílios, que é a base das pesquisas envolvendo a população. A detecção de áreas edificadas é uma tarefa complexa devido a grande variabilidade de características de construção e de imagens. Os métodos usuais são complexos e muito dependentes de especialistas. Os processos automáticos dependem de grandes bases de imagens para treinamento e são sensíveis à variação de qualidade de imagens e características das construções e de ambiente. Nesta tese propomos a utilização de um método automatizado para detecção de edificações em imagens Google Earth que mostrou bons resultados utilizando um conjunto de imagens relativamente pequeno e com grande variabilidade, superando as limitações dos processos existentes. Este resultado foi obtido com uma aplicação prática. Foi construído um conjunto de imagens com anotação de áreas construídas para 12 regiões do Brasil. Estas imagens, além de diferentes na qualidade, apresentam grande variabilidade nas características das edificações e no ambiente geográfico. Uma prova de conceito será feita na utilização da classificação de área construída nos métodos dasimétrico para a estimação de população em gride. Ela mostrou um resultado promissor quando comparado com o método usual, possibilitando a melhoria da qualidade das estimativas. / [en] The importance of precise information about the population distribution is widely acknowledged. The census is considered the most reliable and complete source of this information, and its data are delivered in an aggregated form in sectors. These sectors are operational units with irregular shapes, which hinder the spatial analysis of the data. Thus, the transformation of sectors onto a regular grid would facilitate such analysis. A methodology to achieve this transformation could be based on remote sensing image classification to identify building where the population lives. The building detection is considered a complex task since there is a great variability of building characteristics and on the images quality themselves. The majority of methods are complex and very specialist dependent. The automatic methods require a large annotated dataset for training and they are sensitive to the image quality, to the building characteristics, and to the environment. In this thesis, we propose an automatic method for building detection based on a deep learning architecture that uses a relative small dataset with a large variability. The proposed method shows good results when compared to the state of the art. An annotated dataset has been built that covers 12 cities distributed in different regions of Brazil. Such images not only have different qualities, but also shows a large variability on the building characteristics and geographic environments. A very important application of this method is the use of the building area classification in the dasimetric methods for the population estimation into grid. The concept proof in this application showed a promising result when compared to the usual method allowing the improvement of the quality of the estimates. [pt] APRENDIZADO DE MAQUINA [en] MACHINE LEARNING [pt] CLASSIFICACAO DE IMAGENS [en] IMAGE CLASSIFICATION [pt] DETECCAO DE OBJETOS [en] OBJECT DETECTION [pt] REDE NEURAL CONVOLUCIONAL [en] CONVOLUTIONAL NEURAL NETWORK [pt] SEGMENTACAO DE IMAGENS [en] IMAGE SEGMENTATION [pt] DASIMETRIA [en] DASIMETRY
458	Video Recommendation Based on Object Detection Nyberg, Selma January 2018 (has links) In this thesis, various machine learning domains have been combined in order to build a video recommender system that is based on object detection. The work combines two extensively studied research fields, recommender systems and computer vision, that also are rapidly growing and popular techniques on commercial markets. To investigate the performance of the approach, three different content-based recommender systems have been implemented at Spotify, which are based on the following video features: object detections, titles and descriptions, and user preferences. These systems have then been evaluated and compared against each other together with their hybridized result. Two algorithms have been implemented, the prediction and the top-N algorithm, where the former is the more reliable source for evaluating the system's performance. The evaluation of the system shows that the overall performance scores for predicting values of the users' liked and disliked videos are in the range from about 40 % to 70 % for the prediction algorithm and from about 15 % to 70 % for the top-N algorithm. The approach based on object detection performs worse in comparison to the other approaches. Hence, there seems to be is a low correlation between the user preferences and the video contents in terms of object detection data. Therefore, this data is not very suitable for describing the content of videos and using it in the recommender system. However, the results of this study cannot be generalized to apply for other systems before the approach has been evaluated in other environments and for various data sets. Moreover, there are plenty of room for refinements and improvements to the system, as well as there are many interesting research areas for future work. Spotify Machine Learning Artificial Intelligence Recommender Systems Content-Based Filtering Collaborative Filtering Hybrid Filtering Deep Learning Image Recognition Object Detection Natural Language Processing Paragraph Vectors Doc2Vec TensorFlow Classification K-Nearest Neighbors Cross-Validation Computer and Information Sciences Data- och informationsvetenskap
459	Graph mining for object tracking in videos / Fouille de graphes pour le suivi d’objets dans les vidéos Diot, Fabien 03 June 2014 (has links) Détecter et suivre les objets principaux d’une vidéo est une étape nécessaire en vue d’en décrire le contenu pour, par exemple, permettre une indexation judicieuse des données multimédia par les moteurs de recherche. Les techniques de suivi d’objets actuelles souffrent de défauts majeurs. En effet, soit elles nécessitent que l’utilisateur désigne la cible a suivre, soit il est nécessaire d’utiliser un classifieur pré-entraîné à reconnaitre une classe spécifique d’objets, comme des humains ou des voitures. Puisque ces méthodes requièrent l’intervention de l’utilisateur ou une connaissance a priori du contenu traité, elles ne sont pas suffisamment génériques pour être appliquées aux vidéos amateurs telles qu’on peut en trouver sur YouTube. Pour résoudre ce problème, nous partons de l’hypothèse que, dans le cas de vidéos dont l’arrière-plan n’est pas fixe, celui-ci apparait moins souvent que les objets intéressants. De plus, dans une vidéo, la topologie des différents éléments visuels composant un objet est supposée consistante d’une image a l’autre. Nous représentons chaque image par un graphe plan modélisant sa topologie. Ensuite, nous recherchons des motifs apparaissant fréquemment dans la base de données de graphes plans ainsi créée pour représenter chaque vidéo. Cette approche nous permet de détecter et suivre les objets principaux d’une vidéo de manière non supervisée en nous basant uniquement sur la fréquence des motifs. Nos contributions sont donc réparties entre les domaines de la fouille de graphes et du suivi d’objets. Dans le premier domaine, notre première contribution est de présenter un algorithme de fouille de graphes plans efficace, appelé PLAGRAM. Cet algorithme exploite la planarité des graphes et une nouvelle stratégie d’extension des motifs. Nous introduisons ensuite des contraintes spatio-temporelles au processus de fouille afin d’exploiter le fait que, dans une vidéo, les objets se déplacent peu d’une image a l’autre. Ainsi, nous contraignons les occurrences d’un même motif a être proches dans l’espace et dans le temps en limitant le nombre d’images et la distance spatiale les séparant. Nous présentons deux nouveaux algorithmes, DYPLAGRAM qui utilise la contrainte temporelle pour limiter le nombre de motifs extraits, et DYPLAGRAM_ST qui extrait efficacement des motifs spatio-temporels fréquents depuis les bases de données représentant les vidéos. Dans le domaine du suivi d’objets, nos contributions consistent en deux approches utilisant les motifs spatio-temporels pour suivre les objets principaux dans les vidéos. La première est basée sur une recherche du chemin de poids minimum dans un graphe connectant les motifs spatio-temporels tandis que l’autre est basée sur une méthode de clustering permettant de regrouper les motifs pour suivre les objets plus longtemps. Nous présentons aussi deux applications industrielles de notre méthode / Detecting and following the main objects of a video is necessary to describe its content in order to, for example, allow for a relevant indexation of the multimedia content by the search engines. Current object tracking approaches either require the user to select the targets to follow, or rely on pre-trained classifiers to detect particular classes of objects such as pedestrians or car for example. Since those methods rely on user intervention or prior knowledge of the content to process, they cannot be applied automatically on amateur videos such as the ones found on YouTube. To solve this problem, we build upon the hypothesis that, in videos with a moving background, the main objects should appear more frequently than the background. Moreover, in a video, the topology of the visual elements composing an object is supposed consistent from one frame to another. We represent each image of the videos with plane graphs modeling their topology. Then, we search for substructures appearing frequently in the database of plane graphs thus created to represent each video. Our contributions cover both fields of graph mining and object tracking. In the first field, our first contribution is to present an efficient plane graph mining algorithm, named PLAGRAM. This algorithm exploits the planarity of the graphs and a new strategy to extend the patterns. The next contributions consist in the introduction of spatio-temporal constraints into the mining process to exploit the fact that, in a video, the motion of objects is small from on frame to another. Thus, we constrain the occurrences of a same pattern to be close in space and time by limiting the number of frames and the spatial distance separating them. We present two new algorithms, DYPLAGRAM which makes use of the temporal constraint to limit the number of extracted patterns, and DYPLAGRAM_ST which efficiently mines frequent spatio-temporal patterns from the datasets representing the videos. In the field of object tracking, our contributions consist in two approaches using the spatio-temporal patterns to track the main objects in videos. The first one is based on a search of the shortest path in a graph connecting the spatio-temporal patterns, while the second one uses a clustering approach to regroup them in order to follow the objects for a longer period of time. We also present two industrial applications of our method Fouille de graphes Suivi d'objets Traitement de l'image Fouille de données Détection d'objets Indexation de vidéos Résumé automatique de vidéos Graph mining Objects tracking Image processing Data mining Object detection Indexing video Video summarization
460	Knowledge-based 3D point clouds processing / Traitement 3D de nuages de points basé sur la connaissance Truong, Quoc Hung 15 November 2013 (has links) La modélisation de scènes réelles à travers la capture de données numériques 3D a été prouvée à la fois utile et applicable dans une variété d’applications. Des scènes entières sont généralement numérisées par des scanners laser et représentées par des grands nuages de points non organisés souvent accompagnés de données photogrammétriques. Un problème typique dans le traitement de ces nuages et données réside dans la détection et la classification des objets présents dans la scène. Ces tâches sont souvent entravées par la variabilité des conditions de capture des données, la présence de bruit, les occlusions ainsi que les données manquantes. Compte tenu de la complexité des problèmes sous-jacents, les approches de traitement récentes tentent d’exploiter les connaissances sémantiques pour identifier et classer les objets. Dans cette thèse, nous proposons une nouvelle approche qui fait appel à des stratégies intelligentes de gestion des connaissances pour le traitement des nuages de points 3D ainsi que l’identification et la classification des objets dans les scènes numérisées. Notre approche étend l’utilisation des connaissances sémantiques à toutes les étapes du traitement, y compris le choix et le guidage des algorithmes de traitement axées sur les données individuelles. Notre solution constitue un concept multi-étape itératif sur la base de trois facteurs : la connaissance modélisée, un ensemble d’algorithmes de traitement, et un moteur de classification. L’objectif de ce travail est de sélectionner et d’orienter les algorithmes de manière adaptative et intelligente pour détecter des objets dans les nuages de points. Des expériences avec deux études de cas démontrent l’applicabilité de notre approche. Les études ont été réalisées sur des analyses de la salle d’attente d’un aéroport et le long des voies de chemin de fer. Dans les deux cas, l’objectif était de détecter et d’identifier des objets dans une zone définie. Les résultats montrent que notre approche a réussi à identifier les objets d’intérêt tout en utilisant différents types de données / The modeling of real-world scenes through capturing 3D digital data has proven to be both useful andapplicable in a variety of industrial and surveying applications. Entire scenes are generally capturedby laser scanners and represented by large unorganized point clouds possibly along with additionalphotogrammetric data. A typical challenge in processing such point clouds and data lies in detectingand classifying objects that are present in the scene. In addition to the presence of noise, occlusionsand missing data, such tasks are often hindered by the irregularity of the capturing conditions bothwithin the same dataset and from one data set to another. Given the complexity of the underlyingproblems, recent processing approaches attempt to exploit semantic knowledge for identifying andclassifying objects. In the present thesis, we propose a novel approach that makes use of intelligentknowledge management strategies for processing of 3D point clouds as well as identifying andclassifying objects in digitized scenes. Our approach extends the use of semantic knowledge to allstages of the processing, including the guidance of the individual data-driven processing algorithms.The complete solution consists in a multi-stage iterative concept based on three factors: the modeledknowledge, the package of algorithms, and a classification engine. The goal of the present work isto select and guide algorithms following an adaptive and intelligent strategy for detecting objects inpoint clouds. Experiments with two case studies demonstrate the applicability of our approach. Thestudies were carried out on scans of the waiting area of an airport and along the tracks of a railway.In both cases the goal was to detect and identify objects within a defined area. Results show that ourapproach succeeded in identifying the objects of interest while using various data types Traitement 3D Nuages de points Détection d’objets Segmentation Sélection d’algorithme Systèmes basés connaissance Modélisation des connaissances Ontologies Classification 3D processing Point clouds Object detection Segmentation Algorithm selection Knowledge-based systems Knowledge modeling Ontology Classification 005.1 006.3 516

Search results