  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
61

LabelMe: a database and web-based tool for image annotation

Russell, Bryan C., Torralba, Antonio, Murphy, Kevin P., Freeman, William T. 08 September 2005
Research in object detection and recognition in cluttered scenes requires large image collections with ground truth labels. The labels should provide information about the object classes present in each image, as well as their shape and locations, and possibly other attributes such as pose. Such data is useful for testing, as well as for supervised learning. This project provides a web-based annotation tool that makes it easy to annotate images, and to instantly share such annotations with the community. This tool, plus an initial set of 10,000 images (3000 of which have been labeled), can be found at http://www.csail.mit.edu/~brussell/research/LabelMe/intro.html
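As a rough illustration of what such ground-truth labels contain (a hypothetical record layout, not LabelMe's actual storage format), an annotation can pair an object class with the polygon outline a user drew, from which a bounding box is derived:

```python
import numpy as np

def polygon_to_bbox(polygon):
    """Axis-aligned bounding box (xmin, ymin, xmax, ymax) of a polygon outline."""
    pts = np.asarray(polygon, dtype=float)
    return (pts[:, 0].min(), pts[:, 1].min(), pts[:, 0].max(), pts[:, 1].max())

# A hypothetical annotation record: class name plus the outline a user drew.
annotation = {"name": "car",
              "polygon": [(10, 40), (80, 40), (80, 90), (10, 90)]}
bbox = polygon_to_bbox(annotation["polygon"])  # (10.0, 40.0, 80.0, 90.0)
```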
62

Dual active contour models for image feature extraction

Gunn, Steve R. January 1996
Active contours are now a very popular technique for shape extraction, achieved by minimising a suitably formulated energy functional. Conventional active contour formulations suffer difficulty in the appropriate choice of an initial contour and parameter values. Recent approaches have aimed to resolve these problems, but can compromise other performance aspects. To relieve the initialisation problem, an evolutionary dual active contour has been developed, which is combined with a local shape model to improve the parameterisation. One contour expands from inside the target feature, the other contracts from the outside. The two contours are inter-linked to provide a balanced technique with an ability to reject weak local energy minima. Additionally, a dual active contour configuration using dynamic programming has been developed to locate a global energy minimum, complementing recent approaches via simulated annealing and genetic algorithms. These differ from conventional evolutionary approaches, where energy minimisation may not converge to extract the target shape, in contrast with the guaranteed convergence of a global approach. The new techniques are demonstrated to successfully extract target shapes in synthetic and real images, with superior performance to previous approaches. The new technique employing dynamic programming is deployed to extract the inner face boundary, along with a conventional normal-driven contour to extract the outer face boundary. Application to a database of 75 subjects showed that the outer contour was extracted successfully for 96% of the subjects and the inner contour for 82%. This application highlights the advantages that the new dual active contour approaches confer for automatic shape extraction.
63

Biologically-inspired machine vision

Tsitiridis, A. 25 September 2013
This thesis summarises research on the improved design, integration and expansion of past cortex-like computer vision models, following biologically-inspired methodologies. By adopting early theories and algorithms as building blocks, particular interest has been shown in algorithmic parameterisation, feature extraction, invariance properties and classification. Overall, the major original contributions of this thesis have been:
1. The incorporation of a salient feature-based method for semantic feature extraction and refinement in object recognition.
2. The design and integration of colour features coupled with the existing morphological-based features for efficient and improved biologically-inspired object recognition.
3. The introduction of the illumination invariance property with colour constancy methods under a biologically-inspired framework.
4. The development and investigation of rotation invariance methods to improve robustness and compensate for the lack of such a mechanism in the original models.
5. Adaptive Gabor filter design that captures texture information, enhancing the morphological description of objects in a visual scene and improving the overall classification performance.
6. Instigation of pioneering research on Spiking Neural Network classification for biologically-inspired vision.
Most of the above contributions have also been presented in two journal publications and five conference papers. The system has been fully developed and tested in MATLAB on a variety of image datasets either created for the purposes of this work or obtained from the public domain. / © Cranfield University
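Contribution 5 concerns adaptive Gabor filters for texture. A minimal real-valued Gabor kernel (a generic construction with illustrative parameter names, not the thesis's adaptive design) can be written as:

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real (cosine-phase) Gabor kernel: an oriented sinusoid under a Gaussian.

    size:       odd kernel width/height in pixels.
    wavelength: period of the sinusoidal carrier, in pixels.
    theta:      orientation of the filter, in radians.
    sigma:      Gaussian envelope width; gamma squashes it across-stripe.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    # Rotate image coordinates into the filter's own frame.
    x_t = x * np.cos(theta) + y * np.sin(theta)
    y_t = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_t ** 2 + (gamma * y_t) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_t / wavelength)
    kernel = envelope * carrier
    return kernel - kernel.mean()  # zero mean: flat image regions respond with 0
```

Convolving an image with a bank of such kernels at several orientations and wavelengths yields the texture responses that cortex-like models pool over.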
64

Multisensory object recognition and tracking for robotic applications

Olsson, Lars Jonas January 1995
No description available.
65

Understanding Representations and Reducing their Redundancy in Deep Networks

Cogswell, Michael Andrew 15 March 2016
Neural networks in their modern deep learning incarnation have achieved state-of-the-art performance on a wide variety of tasks and domains. A core intuition behind these methods is that they learn layers of features which interpolate between two domains in a series of related parts. The first part of this thesis introduces the building blocks of neural networks for computer vision. It starts with linear models, then proceeds to deep multilayer perceptrons and convolutional neural networks, presenting the core details of each. However, the introduction also focuses on intuition by visualizing concrete examples of the parts of a modern network. The second part of this thesis investigates regularization of neural networks. Methods like dropout and others have been proposed to favor certain (empirically better) solutions over others. However, big deep neural networks still overfit very easily. This section proposes a new regularizer called DeCov, which leads to significantly reduced overfitting (the gap between training and validation performance) and greater generalization, sometimes better than dropout and other times not. The regularizer is based on the cross-covariance of hidden representations and takes advantage of the intuition that different features should try to represent different things, an intuition others have explored with similar losses. Experiments across a range of datasets and network architectures demonstrate reduced overfitting due to DeCov while almost always maintaining or increasing generalization performance and often improving performance over dropout. / Master of Science
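The cross-covariance idea can be sketched in a few lines of NumPy: penalise the off-diagonal entries of the batch covariance of a hidden layer's activations, so that different units are pushed to represent different things (a simplified reading of the abstract, not the thesis's exact loss or implementation):

```python
import numpy as np

def decov_loss(activations):
    """Cross-covariance penalty on a batch of hidden activations.

    activations: (batch, features) array of one layer's outputs.
    Penalises off-diagonal entries of the batch covariance matrix,
    discouraging pairs of hidden units from carrying redundant signals.
    """
    centered = activations - activations.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / activations.shape[0]
    off_diag = cov - np.diag(np.diag(cov))   # keep only cross-covariances
    return 0.5 * np.sum(off_diag ** 2)
```

In training, a small multiple of this term would be added to the task loss; perfectly correlated units incur a positive penalty while decorrelated units incur none.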
66

3D Object Representation and Recognition Based on Biologically Inspired Combined Use of Visual and Tactile Data

Rouhafzay, Ghazal 13 May 2021
Recent research makes use of biologically inspired computation and artificial intelligence as efficient means to solve real-world problems. Humans show a significant performance in extracting and interpreting visual information. In the cases where visual data is not available, or, for example, if it fails to provide comprehensive information due to occlusions, tactile exploration assists in the interpretation and better understanding of the environment. This cooperation between human senses can serve as an inspiration to embed a higher level of intelligence in computational models. In the context of this research, in the first step, computational models of visual attention are explored to determine salient regions on the surface of objects. Two different approaches are proposed. The first approach takes advantage of a series of contributing features in guiding human visual attention, namely color, contrast, curvature, edge, entropy, intensity, orientation, and symmetry are efficiently integrated to identify salient features on the surface of 3D objects. This model of visual attention also learns to adaptively weight each feature based on ground-truth data to ensure a better compatibility with human visual exploration capabilities. The second approach uses a deep Convolutional Neural Network (CNN) for feature extraction from images collected from 3D objects and formulates saliency as a fusion map of regions where the CNN looks at, while classifying the object based on their geometrical and semantic characteristics. The main difference between the outcomes of the two algorithms is that the first approach results in saliencies spread over the surface of the objects while the second approach highlights one or two regions with concentrated saliency. Therefore, the first approach is an appropriate simulation of visual exploration of objects, while the second approach successfully simulates the eye fixation locations on objects. 
In the second step, the first computational model of visual attention is used to determine scattered salient points on the surface of objects based on which simplified versions of 3D object models preserving the important visual characteristics of objects are constructed. Subsequently, the thesis focuses on the topic of tactile object recognition, leveraging the proposed model of visual attention. Beyond the sensor technologies which are instrumental in ensuring data quality, biological models can also assist in guiding the placement of sensors and support various selective data sampling strategies that allow exploring an object’s surface faster. Therefore, the possibility to guide the acquisition of tactile data based on the identified visually salient features is tested and validated in this research. Different object exploration and data processing approaches were used to identify the most promising solution. Our experiments confirm the effectiveness of computational models of visual attention as a guide for data selection for both simplifying 3D representation of objects as well as enhancing tactile object recognition. In particular, the current research demonstrates that: (1) the simplified representation of objects by preserving visually salient characteristics shows a better compatibility with human visual capabilities compared to uniformly simplified models, and (2) tactile data acquired based on salient visual features are more informative about the objects’ characteristics and can be employed in tactile object manipulation and recognition scenarios. In the last section, the thesis addresses the issue of transfer of learning from vision to touch. 
Inspired by biological studies that attest to similarities between the processing of visual and tactile stimuli in the human brain, the thesis studies the possibility of transfer of learning from vision to touch using deep learning architectures and proposes a hybrid CNN that handles both visual and tactile object recognition.
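One common way to combine contributing feature channels like those listed above (color, contrast, curvature, and so on) is to normalise each conspicuity map and take a weighted sum; the adaptive weighting the thesis describes would learn those weights from ground-truth data. A minimal fixed-weight sketch, purely illustrative rather than the proposed model:

```python
import numpy as np

def fuse_saliency(feature_maps, weights=None):
    """Fuse per-feature conspicuity maps into a single saliency map.

    Each map is min-max normalised so no channel dominates by raw scale,
    then the maps are combined as a weighted sum (uniform if no weights).
    """
    maps = []
    for fm in feature_maps:
        fm = np.asarray(fm, dtype=float)
        rng = fm.max() - fm.min()
        maps.append((fm - fm.min()) / rng if rng > 0 else np.zeros_like(fm))
    if weights is None:
        weights = np.full(len(maps), 1.0 / len(maps))
    return sum(w * m for w, m in zip(weights, maps))
```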
67

OBJECT RECOGNITION BY GROUND-PENETRATING RADAR IMAGING SYSTEMS WITH TEMPORAL SPECTRAL STATISTICS

Ono, Sashi, Lee, Hua October 2004
International Telemetering Conference Proceedings / October 18-21, 2004 / Town & Country Resort, San Diego, California / This paper describes a new approach to object recognition using ground-penetrating radar (GPR) imaging systems. The recognition procedure utilizes the spectral content of an object instead of the object shape used in traditional methods. To produce the identification feature of an object, the most common spectral component is obtained by singular value decomposition (SVD) of the training sets. The identification process is then integrated into the backward-propagation image reconstruction algorithm, which is implemented on FMCW GPR imaging systems.
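The SVD step can be illustrated directly: stack the training spectra as rows, take the first right singular vector as the common spectral component, and score new spectra by normalised correlation. This is a sketch of the general idea; the variable names and the scoring rule are assumptions, not the paper's exact procedure:

```python
import numpy as np

def spectral_signature(training_spectra):
    """Dominant spectral component of a training set via SVD.

    training_spectra: (n_examples, n_bins) matrix, one spectrum per row.
    The first right singular vector is the single unit-norm spectral shape
    that best explains the whole training set in a least-squares sense.
    """
    _, _, vt = np.linalg.svd(np.asarray(training_spectra, dtype=float),
                             full_matrices=False)
    sig = vt[0]
    return sig if sig.sum() >= 0 else -sig  # resolve SVD sign ambiguity

def match_score(spectrum, signature):
    """Normalised correlation between an observed spectrum and a signature."""
    s = np.asarray(spectrum, dtype=float)
    return float(s @ signature / np.linalg.norm(s))
```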
68

ScatterNet hybrid frameworks for deep learning

Singh, Amarjot January 2019
Image understanding is the task of interpreting images by effectively solving the individual tasks of object recognition and semantic image segmentation. An image understanding system must have the capacity to distinguish between similar looking image regions while being invariant in its response to regions that have been altered by appearance-altering transformations. The fundamental challenge for any such system lies within this simultaneous requirement for both invariance and specificity. Many image understanding systems have been proposed that capture geometric properties such as shapes, textures, motion and 3D perspective projections using filtering, non-linear modulus, and pooling operations. Deep learning networks ignore these geometric considerations and compute descriptors having suitable invariance and stability to geometric transformations using (end-to-end) learned multi-layered network filters. These deep learning networks in recent years have come to dominate the previously separate fields of research in machine learning, computer vision, natural language understanding and speech recognition. Despite the success of these deep networks, there remains a fundamental lack of understanding in the design and optimization of these networks which makes it difficult to develop them. Also, training of these networks requires large labeled datasets which in numerous applications may not be available. In this dissertation, we propose the ScatterNet Hybrid Framework for Deep Learning that is inspired by the circuitry of the visual cortex. The framework uses a hand-crafted front-end, an unsupervised learning based middle-section, and a supervised back-end to rapidly learn hierarchical features from unlabelled data. Each layer in the proposed framework is automatically optimized to produce the desired computationally efficient architecture. The term 'Hybrid' is coined because the framework uses both unsupervised as well as supervised learning.
We propose two hand-crafted front-ends that can extract locally invariant features from the input signals. Next, two ScatterNet Hybrid Deep Learning (SHDL) networks (a generative and a deterministic) were introduced by combining the proposed front-ends with two unsupervised learning modules which learn hierarchical features. These hierarchical features were finally used by a supervised learning module to solve the task of either object recognition or semantic image segmentation. The proposed front-ends have also been shown to improve the performance and learning of current Deep Supervised Learning Networks (VGG, NIN, ResNet) with reduced computing overhead.
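To make the "hand-crafted front-end" idea concrete, here is a toy 1D two-layer scattering-style feature extractor: band-pass filter, complex modulus, average, then repeat on each envelope. This is a much-simplified sketch under assumed filter choices, not the SHDL networks proposed in the thesis:

```python
import numpy as np

def morlet(length, wavelength):
    """Complex oscillation under a Gaussian window: a simple band-pass filter."""
    t = np.arange(length) - length // 2
    return np.exp(2j * np.pi * t / wavelength) * np.exp(-t ** 2 / (2.0 * wavelength ** 2))

def scatter_features(x, wavelengths, flen=33):
    """Two layers of (filter -> complex modulus), averaged into invariant features."""
    feats = [x.mean()]                      # zeroth order: plain low-pass
    envelopes = []
    for w in wavelengths:                   # first order
        env = np.abs(np.convolve(x, morlet(flen, w), mode="same"))
        envelopes.append(env)
        feats.append(env.mean())
    for env in envelopes:                   # second order re-filters each envelope
        for w in wavelengths:
            env2 = np.abs(np.convolve(env, morlet(flen, w), mode="same"))
            feats.append(env2.mean())
    return np.array(feats)
```

The modulus-and-average structure is what gives such front-ends their local invariance; in the framework described above, these fixed features would feed the unsupervised middle section.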
69

Object Recognition with Cluster Matching

Lennartsson, Mattias January 2009
Within this thesis an algorithm for object recognition called Cluster Matching has been developed, implemented and evaluated. The image information is sampled at arbitrary sample points, instead of interest points, and local image features are extracted. These sample points are used as a compact representation of the image data and can quickly be searched for prior known objects. The algorithm is evaluated on a test set of images and the result is surprisingly reliable and time efficient.
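A search of that kind can be sketched as nearest-neighbour matching of local descriptors (a generic brute-force version with assumed names; the thesis's cluster-based search is more involved):

```python
import numpy as np

def match_descriptors(sample_desc, model_desc, max_dist=0.5):
    """Match each sampled local descriptor to its nearest model descriptor.

    sample_desc: (n, d) descriptors extracted at arbitrary sample points.
    model_desc:  (m, d) descriptors of a prior known object.
    Returns (sample_index, model_index) pairs closer than max_dist; an
    object would be declared recognised once it collects enough matches.
    """
    model = np.asarray(model_desc, dtype=float)
    matches = []
    for i, d in enumerate(np.asarray(sample_desc, dtype=float)):
        dists = np.linalg.norm(model - d, axis=1)
        j = int(np.argmin(dists))
        if dists[j] < max_dist:
            matches.append((i, j))
    return matches
```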
70

Limitations of Geometric Hashing in the Presence of Gaussian Noise

Sarachik, Karen B. 01 October 1992
This paper presents a detailed error analysis of geometric hashing for 2D object recognition. We analytically derive the probability of false positives and negatives as a function of the number of model and image features and of occlusion, using a 2D Gaussian noise model. The results are presented in the form of ROC (receiver operating characteristic) curves, which demonstrate that the 2D Gaussian error model always performs better than the bounded uniform model. They also directly indicate the optimal performance that can be achieved for a given clutter and occlusion rate, and how to choose the thresholds to achieve these rates.
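For readers unfamiliar with the technique under analysis, a minimal 2D geometric hashing sketch follows: model points are rehashed in every basis pair, and recognition votes for the basis whose quantised coordinates recur in the image. Quantisation and names are illustrative, and the noise handling that is the paper's actual subject is omitted:

```python
import numpy as np
from collections import defaultdict

def to_basis(p, b0, b1):
    """Coordinates of point p in the frame defined by the basis pair (b0, b1)."""
    e1 = b1 - b0
    e2 = np.array([-e1[1], e1[0]])          # perpendicular axis
    d = p - b0
    return (d @ e1) / (e1 @ e1), (d @ e2) / (e1 @ e1)

def build_table(model_points, q=0.25):
    """Hash every model point in every ordered basis pair of model points."""
    table = defaultdict(list)
    n = len(model_points)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            for k in range(n):
                if k in (i, j):
                    continue
                u, v = to_basis(model_points[k], model_points[i], model_points[j])
                table[(round(u / q), round(v / q))].append((i, j))
    return table

def vote(table, image_points, basis, q=0.25):
    """Count votes for each model basis pair under one chosen image basis."""
    votes = defaultdict(int)
    b0, b1 = image_points[basis[0]], image_points[basis[1]]
    for k, p in enumerate(image_points):
        if k in basis:
            continue
        u, v = to_basis(p, b0, b1)
        for entry in table.get((round(u / q), round(v / q)), []):
            votes[entry] += 1
    return votes
```

Gaussian noise perturbs the (u, v) coordinates before quantisation, which is exactly what produces the false-positive and false-negative rates the paper derives.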
