Global ETD Search

101	Effective and annotation efficient deep learning for image understanding / Méthodes d'apprentissage profond pour l'analyse efficace d'images en limitant l'annotation humaine Gidaris, Spyridon 11 December 2018 Recent developments in deep learning have achieved impressive results on image understanding tasks. However, designing deep learning architectures that effectively solve the image understanding tasks of interest is far from trivial. Moreover, the success of deep learning approaches relies heavily on the availability of large amounts of manually labeled data. In this context, the objective of this dissertation is to explore deep learning based approaches for core image understanding tasks that increase the effectiveness with which those tasks are performed and make their learning process more annotation efficient, i.e., less dependent on the availability of large amounts of manually labeled training data. We first focus on improving the state of the art in object detection. More specifically, we attempt to boost the ability of object detection systems to recognize (even difficult) object instances by proposing a multi-region and semantic segmentation-aware ConvNet-based representation that is able to capture a diverse set of discriminative appearance factors. We also aim to improve the localization accuracy of object detection systems by proposing iterative detection schemes and a novel localization model for estimating the bounding box of the objects. We demonstrate that the proposed technical novelties lead to significant improvements in object detection performance on the PASCAL and MS COCO benchmarks.
Regarding the pixel-wise image labeling problem, we explore a family of deep neural network architectures that perform structured prediction by learning to (iteratively) improve some initial estimates of the output labels. The goal is to identify the optimal architecture for implementing such deep structured prediction models. In this context, we propose to decompose the label improvement task into three steps: 1) detecting the initial label estimates that are incorrect, 2) replacing the incorrect labels with new ones, and finally 3) refining the renewed labels by predicting residual corrections with respect to them. We evaluate the explored architectures on the disparity estimation task and demonstrate that the proposed architecture achieves state-of-the-art results on the KITTI 2015 benchmark. To accomplish our goal of annotation efficient learning, we propose a self-supervised learning approach that learns ConvNet-based image representations by training the ConvNet to recognize the 2D rotation applied to its input image. We empirically demonstrate that this apparently simple task actually provides a very powerful supervisory signal for semantic feature learning. Specifically, the image features learned from this task exhibit very good results when transferred to the visual tasks of object detection and semantic segmentation, surpassing prior unsupervised learning approaches and thus narrowing the gap with the supervised case. Finally, also in the direction of annotation efficient learning, we propose a novel few-shot object recognition system that, after training, is capable of dynamically learning novel categories from only a few examples (e.g., only one or five training examples) without forgetting the categories on which it was trained.
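The rotation-prediction pretext task described in this abstract can be illustrated with a small sketch. It assumes the rotations are multiples of 90 degrees, which the abstract does not state, and it represents an image as a plain list of rows. The function names `rot90` and `rotation_pretext_batch` are hypothetical. The sketch only builds the (rotated image, rotation label) pairs that a ConvNet would be trained to classify; it does not train a network.

```python
def rot90(image):
    """Rotate a 2D grid (list of rows) by 90 degrees counter-clockwise."""
    # Transpose the grid, then reverse the order of the rows.
    return [list(row) for row in zip(*image)][::-1]


def rotation_pretext_batch(image):
    """Return four (rotated_image, label) pairs.

    Label k means the image was rotated by k * 90 degrees.
    The choice of four rotations is an assumption of this sketch.
    """
    batch = []
    current = [list(row) for row in image]  # copy so the input is not modified
    for label in range(4):
        batch.append((current, label))
        current = rot90(current)
    return batch


# Toy 2x2 "image"; the label is derived from the transformation itself,
# so no human annotation is needed.
toy_image = [[1, 2],
             [3, 4]]
for rotated, label in rotation_pretext_batch(toy_image):
    print(label, rotated)
```

The labels come for free from the transformation, which is what makes the task self-supervised.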
To implement the proposed recognition system, we introduce two technical novelties: an attention-based few-shot classification weight generator, and a ConvNet-based recognition model whose classifier is implemented as a cosine similarity function between feature representations and classification vectors. We demonstrate that the proposed approach achieves state-of-the-art results on relevant few-shot benchmarks. Structured prediction Object recognition
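The cosine-similarity classifier mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: features are plain Python lists, the `scale` factor is a hypothetical parameter, and a novel class weight is taken as the mean of its few example features, which is only a stand-in for the attention-based weight generator that the abstract does not detail.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def classify(feature, class_weights, scale=10.0):
    """Score a feature against every class vector and return the best class.

    `scale` (hypothetical) multiplies the cosine scores, as is common
    before a softmax; it does not change which class wins.
    """
    scores = {name: scale * cosine_similarity(feature, w)
              for name, w in class_weights.items()}
    return max(scores, key=scores.get), scores


# Base classes seen during training (toy 2-D vectors).
weights = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
# Add a novel class from two examples without touching the base classes.
weights["bird"] = mean_vector([[1.0, 1.0], [3.0, 3.0]])
print(classify([5.0, 4.0], weights)[0])
```

Because cosine similarity ignores vector magnitude, weights for newly added classes are directly comparable with those of the base classes even if their norms differ.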
102	Object Recognition with Progressive Refinement for Collaborative Robots Task Allocation Wu, Wenbo 18 December 2020 With the rapid development of deep learning techniques, the application of Convolutional Neural Networks (CNNs) has benefited the task of target object recognition. Several state-of-the-art object detectors have achieved excellent precision in object recognition. When the detection results are applied to real-world collaborative robots, the reliability and robustness of the target object detection stage are essential to support efficient task allocation. In this work, collaborative robot task allocation is based on the assumption that each individual robotic agent possesses specialized capabilities to be matched with detected targets, which represent tasks to be performed in the surrounding environment and impose specific requirements. The goal is to reach a specialized labor distribution among the individual robots by best matching their specialized capabilities with the corresponding requirements imposed by the tasks. To further improve task recognition with convolutional neural networks in the context of robotic task allocation, this thesis proposes an approach for progressively refining the target detection process by taking advantage of the fact that additional images can be collected by mobile cameras installed on robotic vehicles. The proposed methodology combines a CNN-based object detection module with a refinement module. For the detection module, a two-stage object detector, Mask RCNN, for which some adaptations to region proposal generation are introduced, and a one-stage object detector, YOLO, are experimentally investigated in the context considered. The generated recognition scores serve as input for the refinement module.
In the latter, the current detection result is treated as a priori evidence to enhance the next detection of the same target, with the goal of iteratively improving the target recognition scores. Both the Bayesian method and the Dempster-Shafer (D-S) theory are experimentally investigated for the data fusion involved in the refinement process. The experimental validation is conducted on indoor search-and-rescue (SAR) scenarios, and the results presented in this work demonstrate the feasibility and reliability of the proposed progressive refinement framework, especially when the adapted Mask RCNN is combined with D-S theory data fusion. Object recognition Convolutional neural network Deep learning Machine vision
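The idea of fusing successive detections of the same target can be sketched with Dempster's rule of combination, the standard fusion rule of Dempster-Shafer theory. The frame of discernment, the mass values, and the function name `dempster_combine` below are hypothetical; the abstract does not say how recognition scores are mapped to masses, so this is only an illustration of the rule itself.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of hypotheses
    to its mass; the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = (
                    combined.get(intersection, 0.0) + mass_a * mass_b
                )
            else:
                # Mass assigned to contradictory hypotheses.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so the remaining masses sum to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


TARGET = frozenset({"target"})
CLUTTER = frozenset({"clutter"})
UNKNOWN = frozenset({"target", "clutter"})  # ignorance: either hypothesis

# Hypothetical evidence from two successive views of the same object.
first_view = {TARGET: 0.6, UNKNOWN: 0.4}
second_view = {TARGET: 0.7, CLUTTER: 0.1, UNKNOWN: 0.2}

fused = dempster_combine(first_view, second_view)
print(fused[TARGET])
```

In this toy case the fused mass on `target` is higher than in either view alone, which mirrors the progressive refinement described in the abstract.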
103	Contributions to 3D data processing and social robotics Escalona, Félix 30 September 2021 This thesis studies artificial intelligence applied to 3D data and social robotics. The first part of the document is dedicated to 3D object recognition. Object recognition consists of the automatic detection and categorisation of the objects that appear in a scene. This capability is an important need for social robots, as it allows them to understand and interact with their environment. Image-based methods have been widely studied with great results, but they rely only on visual features and can confuse different objects with similar appearances (for example, a real object and a picture that depicts it), so 3D data can help to improve these systems using topological features. For this part, we present different novel techniques that use pure 3D data. The second part of the thesis is about the mapping of the environment, which consists of constructing a map that a robot can use to locate itself. This capability enables a more elaborate navigation strategy, which is highly useful for a social robot that must interact with the different rooms of a house and its objects. In this section, we explore 2D and 3D maps and their refinement with object recognition. Finally, the third part of this work is about social robotics, which focuses on serving people in a caring interaction rather than performing a mechanical task. The previous sections relate to two main capabilities of a social robot, and this final section contains a survey of this kind of robot and of other projects that explore other aspects of them. 3D Object Recognition Mapping Social Robotics
104	Applications of Artificial Neural Networks to Synthetic Aperture Radar for Feature Extraction in Noisy Environments Roberts, David James 01 June 2013 Images generated from Synthetic Aperture Radar (SAR) are often noisy, distorted, or incomplete pictures of a target or target region. As the goal of most SAR research pertains to automatic target recognition (ATR), extensive filtering and image processing are required in order to extract the features necessary to carry out ATR. This thesis investigates the use of Artificial Neural Networks (ANNs) to improve the feature extraction process by laying the foundation for ANN SAR ATR algorithms and programs. The first technique investigated is an ANN edge detector designed to be invariant to multiplicative speckle noise. The algorithm uses Back Propagation (BP) to teach a multi-layer perceptron network to detect edges. To do so, several parameters within a Sliding Window (SW) are calculated as the inputs to the ANN. The ANN then outputs an edge map that includes the outer edge features of the target as well as some internal edge features. The next technique examined is a pattern recognition and target reconstruction algorithm based on the associative memory ANN known as the Hopfield Network (HN). For this version of the HN, the network is trained with a collection of varying geometric shapes. The output of the network is a nearest-fit representation of the incomplete image data input. Because of the versatility of this program, it is also able to reconstruct incomplete 3D models determined from SAR data. The final technique investigated is an automatic rotation procedure to detect the change in perspective relative to the platform. This type of detection can prove useful for target tracking or 3D modeling, where the direction vector or relative angle of the target is a desired piece of information.
Neural Network SAR Radar Feature Extraction Object Recognition Signal Processing
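The Hopfield Network reconstruction described in this abstract can be illustrated with the textbook formulation: Hebbian learning of bipolar (+1/-1) patterns and asynchronous updates until the state stops changing. This is a generic sketch, not the thesis's version of the network; the function names and the two toy patterns are hypothetical stand-ins for the geometric shapes mentioned in the abstract.

```python
def train_hopfield(patterns):
    """Build a weight matrix from bipolar patterns using the Hebbian rule."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights


def recall(weights, state, max_sweeps=10):
    """Update units one at a time until the state is stable."""
    state = list(state)  # work on a copy
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            activation = sum(weights[i][j] * state[j] for j in range(n))
            new_value = 1 if activation >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two toy stored "shapes" (orthogonal bipolar patterns).
shape_a = [1, 1, 1, 1, -1, -1, -1, -1]
shape_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([shape_a, shape_b])

# A damaged copy of shape_a with its first element flipped.
damaged = [-1, 1, 1, 1, -1, -1, -1, -1]
print(recall(weights, damaged) == shape_a)  # prints True
```

The network settles on the stored pattern closest to the damaged input, which is the nearest-fit behaviour the abstract refers to.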
105	Continual Object Learning Erculiani, Luca 10 June 2021 This work focuses on building frameworks to strengthen the relation between human and machine learning. This is achieved by proposing a new category of algorithms and a new theory to formalize the perception and categorization of objects. On the algorithmic side, we developed a series of procedures to perform Interactive Continuous Open World learning from the point of view of a single user. As for humans, the inputs of the algorithms are continuous streams of visual information (sequences of frames), which enable the extraction of richer representations by exploiting the persistence of the same object in the input data. Our approaches are able to incrementally learn and recognize collections of objects, starting from zero knowledge, and to organize them in a hierarchy that follows the will of the user. We then present a novel Knowledge Representation theory that formalizes the properties of our setting and enables learning over it. The theory is based on the notion of separating the visual representation of objects from the semantic meaning associated with them. This distinction makes it possible to treat both instances and classes of objects as elements of the same kind, and to dynamically rearrange objects according to the needs of the user. The whole framework is gradually introduced throughout the thesis and is coupled with an extensive series of experiments to demonstrate its working principles. The experiments also demonstrate the role of a developmental learning policy, in which new objects are regularly introduced, enabling both an increase in recognition performance and a reduction in the amount of supervision provided by the user.
106	Color Perception and Object Recognition in a Lake Malawian Cichlid Melanochromis Auratus Didion, Jeremy E. 10 October 2012 No description available. Biology object recognition operant conditioning visual perception optomotor response
107	Three dimensional primitive CAD-based object recognition from range images Villalobos, Leda January 1994 No description available. Three dimensional primitive CAD-based object recognition range images
108	Visually guided tactile and force-torque sensing for object recognition and localization Rafla, Nader Iskander January 1991 No description available. Visual/Tactile sensing Object recognition Force-torque sensors
109	A COMPARISON OF DEFORMABLE CONTOUR METHODS AND MODEL BASED APPROACH USING SKELETON FOR SHAPE RECOVERY FROM IMAGES HE, LEI 04 September 2003 No description available. image segmentation deformable contour method object recognition skeleton matching
110	Perceptual Salience of Non-accidental Properties Weismantel, Eric January 2013 No description available. Cognitive Psychology Non-accidental properties object invariance object recognition
