21. Real-time optical intensity correlation using photorefractive BSO. Wang, Zhao Qi, January 1995.
Real-time optical intensity correlation using a photorefractive BSO crystal and a liquid crystal television is implemented. The underlying physical basis is examined, specific techniques to improve operation are proposed, and several optical pattern recognition tasks are demonstrated. Photorefractive BSO serves as the holographic recording medium in the real-time intensity correlator. To improve dynamic holographic recording, a moving-grating technique is adopted; the nonlinear effects of moving gratings at large fringe modulation are investigated experimentally and compared with numerical predictions. Optical bias is introduced to counteract the large drop in optimum fringe velocity that accompanies moving gratings, and its effects on the optimum fringe velocity and on the diffraction efficiency are studied. To overcome the inherently low discrimination of intensity correlation in optical pattern recognition, real-time edge-enhanced intensity correlation is achieved by means of nonlinear holographic recording in BSO. Real-time colour object recognition is achieved using a commercially available, inexpensive colour liquid crystal television in the intensity correlator. Multi-class object recognition is achieved with a synthetic discriminant function filter displayed on the Epson liquid crystal display in the real-time correlator, and the phase and intensity modulation properties of that display are studied. Finally, a further research topic is proposed in which the Epson liquid crystal display realizes a newly designed spatial filter, the quantized amplitude-compensated matched filter; its performance merits are investigated by computer simulation.
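The core operation of the correlator above, in digital form, is simply the cross-correlation of two intensity images, which can be computed with Fourier transforms. The NumPy sketch below is an illustration of that principle, not the optical implementation described in the thesis; the Laplacian edge map is a crude stand-in for the nonlinear holographic edge enhancement, and the function names and padding scheme are assumptions.

```python
import numpy as np

def edge_enhance(image):
    """Crude Laplacian edge map; correlating edge maps rather than raw
    intensities gives sharper, more discriminating correlation peaks."""
    img = image.astype(float)
    lap = (np.roll(img, 1, 0) + np.roll(img, -1, 0) +
           np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4.0 * img)
    return np.abs(lap)

def intensity_correlation(scene, reference):
    """Cross-correlate an intensity image with a (smaller) reference pattern
    via FFTs; a sharp peak in the returned plane marks a match location."""
    scene = scene.astype(float)
    ref = np.zeros_like(scene)
    ref[:reference.shape[0], :reference.shape[1]] = reference  # zero-pad
    corr = np.fft.ifft2(np.fft.fft2(scene) * np.conj(np.fft.fft2(ref)))
    return np.abs(corr)

# Usage sketch: locate the best match of an edge-enhanced template.
# corr = intensity_correlation(edge_enhance(scene), edge_enhance(template))
# y, x = np.unravel_index(np.argmax(corr), corr.shape)
```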
22. Construction of a 3D Object Recognition and Manipulation Database from Grasp Demonstrations. Kent, David E., 09 April 2014.
Object recognition and manipulation are critical for enabling robots to operate within a household environment. There are many grasp planners that can estimate grasps based on object shape, but these approaches often perform poorly because they miss key information about non-visual object characteristics, such as weight distribution, fragility of materials, and usability characteristics. Object model databases can account for this information, but existing methods for constructing 3D object recognition databases are time and resource intensive, often requiring specialized equipment, and are therefore difficult to apply to robots in the field. We present an easy-to-use system for constructing object models for 3D object recognition and manipulation made possible by advances in web robotics. The database consists of point clouds generated using a novel iterative point cloud registration algorithm, which includes the encoding of manipulation data and usability characteristics. The system requires no additional equipment other than the robot itself, and non-expert users can demonstrate grasps through an intuitive web interface with virtually no training required. We validate the system with data collected from both a crowdsourcing user study and a set of grasps demonstrated by an expert user. We show that the crowdsourced grasps can produce successful autonomous grasps, and furthermore the demonstration approach outperforms purely vision-based grasp planning approaches for a wide variety of object classes.
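The thesis's registration algorithm is novel and not reproduced here; the sketch below is a generic iterative closest point (ICP) loop, shown only to illustrate what iteratively registering demonstration point clouds into a single object model involves. The brute-force nearest-neighbour search and fixed iteration count are simplifying assumptions.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping paired points
    src onto dst (both (N, 3) arrays), via the SVD/Kabsch method."""
    src_centroid, dst_centroid = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_centroid).T @ (dst - dst_centroid)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = dst_centroid - R @ src_centroid
    return R, t

def icp(source, target, iterations=30):
    """Generic iterative closest point: repeatedly pair each source point
    with its nearest target point, solve for the rigid transform, and apply
    it, gradually aligning the source cloud with the target cloud."""
    src = np.asarray(source, dtype=float).copy()
    target = np.asarray(target, dtype=float)
    for _ in range(iterations):
        # Brute-force nearest-neighbour correspondences (for clarity only).
        dists = np.linalg.norm(src[:, None, :] - target[None, :, :], axis=2)
        matched = target[dists.argmin(axis=1)]
        R, t = best_rigid_transform(src, matched)
        src = src @ R.T + t
    return src
```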
23. Integrating visual and tactile robotic perception. Corradi, Tadeo, January 2018.
The aim of this project is to enable robots to recognise objects and object categories by combining vision and touch. In this thesis, a novel inexpensive tactile sensor design is presented, together with a complete, probabilistic sensor-fusion model. The potential of the model is demonstrated in four areas: (i) shape recognition, where the sensor outperforms its most similar rival, (ii) single-touch object recognition, where state-of-the-art results are produced, (iii) visuo-tactile object recognition, demonstrating the benefits of multi-sensory object representations, and (iv) object classification, which has not been reported in the literature to date. Both the sensor design and the novel database have been made available. Tactile data collection is performed by a robot, and an extensive analysis of data encodings, data processing, and classification methods is presented. The conclusions reached are: (i) the inexpensive tactile sensor can be used for basic shape and object recognition, (ii) object recognition combining vision and touch in a probabilistic manner improves accuracy over either modality alone, (iii) when both vision and touch perform poorly independently, the proposed sensor-fusion model provides faster learning, i.e. fewer training samples are required to achieve similar accuracy, (iv) such a sensor-fusion model is more accurate than either modality alone when classifying unseen objects, as well as when recognising individual objects from amongst similar objects of the same class, (v) preliminary potential is identified for a real-life application, underwater object classification, and (vi) the sensor-fusion model provides improvements in classification even over award-winning deep-learning-based computer vision models.
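The exact probabilistic sensor-fusion model is specific to the thesis; as a minimal sketch, per-class posteriors from the visual and tactile classifiers can be combined under a conditional-independence assumption. The function below and its handling of the class prior are illustrative, not the author's formulation.

```python
import numpy as np

def fuse_posteriors(p_vision, p_touch, prior=None):
    """Combine per-class posteriors from a visual classifier and a tactile
    classifier, assuming the two observations are conditionally independent
    given the class: p(c | v, t) is proportional to p(c | v) p(c | t) / p(c)."""
    p_vision = np.asarray(p_vision, dtype=float)
    p_touch = np.asarray(p_touch, dtype=float)
    if prior is None:
        prior = np.full_like(p_vision, 1.0 / p_vision.size)  # uniform class prior
    fused = p_vision * p_touch / prior
    return fused / fused.sum()

# Example: vision mildly favours class 1, touch is nearly uninformative;
# the fused posterior still sharpens towards class 1.
print(fuse_posteriors([0.2, 0.5, 0.3], [0.3, 0.4, 0.3]))
```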
24. A theory of scene understanding and object recognition. Dillon, Craig, January 1996.
This dissertation presents a new approach to image interpretation which can produce hierarchical descriptions of visually sensed scenes based on an incrementally learnt hierarchical knowledge base. Multiple segmentation and labelling hypotheses are generated with local constraint satisfaction being achieved through a hierarchical form of relaxation labelling. The traditionally unidirectional segmentation-matching process is recast into a dynamic closed-loop system where the current interpretation state is used to drive the lower level image processing functions. The theory presented in this dissertation is applied to a new object recognition and scene understanding system called Cite which is described in detail.
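The hierarchical relaxation labelling used in Cite is specific to that system, but the flat form of the update it builds on is standard. The sketch below shows a classic relaxation-labelling iteration over label probabilities and pairwise compatibility coefficients; the array shapes and iteration count are illustrative assumptions.

```python
import numpy as np

def relaxation_labelling(p, r, iterations=20):
    """Classic (flat) relaxation-labelling iteration.
    p: (n_objects, n_labels) initial label probabilities.
    r: (n_objects, n_objects, n_labels, n_labels) compatibility coefficients
       in [-1, 1] saying how well a label on one object supports a label on
       another. Probabilities are repeatedly re-weighted by the support they
       receive from neighbouring hypotheses, driving the system towards a
       locally consistent labelling."""
    p = np.asarray(p, dtype=float).copy()
    r = np.asarray(r, dtype=float)
    for _ in range(iterations):
        # q[i, l]: total support for label l at object i.
        q = np.einsum('ijlm,jm->il', r, p)
        p = np.clip(p * (1.0 + q), 1e-12, None)
        p /= p.sum(axis=1, keepdims=True)
    return p
```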
25. View-Based Strategies for 3D Object Recognition. Sinha, Pawan; Poggio, Tomaso, 21 April 1995.
A persistent issue of debate in the area of 3D object recognition concerns the nature of the experientially acquired object models in the primate visual system. One prominent proposal has advocated the use of object-centered models, such as representations of the objects' 3D structures in a coordinate frame independent of the viewing parameters [Marr and Nishihara, 1978]. In contrast, another proposal suggests that the viewing parameters encountered during the learning phase might be inextricably linked to subsequent performance on a recognition task [Tarr and Pinker, 1989; Poggio and Edelman, 1990]. The 'object model', according to this idea, is simply a collection of the sample views encountered during training. Given that object-centered recognition strategies have the attractive feature of leading to viewpoint independence, they have garnered much of the research effort in the field of computational vision. Furthermore, since human recognition performance seems remarkably robust in the face of imaging variations [Ellis et al., 1989], it has often been implicitly assumed that the visual system employs an object-centered strategy. In the present study we examine this assumption more closely. Our experimental results with a class of novel 3D structures strongly suggest the use of a view-based strategy by the human visual system even when it has the opportunity to construct and use object-centered models. In fact, for our chosen class of objects, the results seem to support a stronger claim: 3D object recognition is 2D view-based.
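A minimal sketch of what a view-based strategy amounts to computationally, assuming each view is encoded as a fixed-length feature vector (for example, image-plane coordinates of landmark points): a query view is compared against the stored training views of each object, in the spirit of the Gaussian radial-basis scheme of Poggio and Edelman (1990). The encoding and kernel width are assumptions, not the paper's experimental procedure.

```python
import numpy as np

def view_based_recognize(query_view, stored_views, labels, sigma=1.0):
    """View-based recognition sketch: each object is represented only by the
    2-D views stored during training (here, fixed-length feature vectors).
    A query view is scored against each object by summing Gaussian (RBF)
    similarities to that object's stored views; the best-scoring object wins."""
    query = np.asarray(query_view, dtype=float)
    scores = {}
    for view, label in zip(stored_views, labels):
        diff = query - np.asarray(view, dtype=float)
        similarity = np.exp(-diff.dot(diff) / (2.0 * sigma ** 2))
        scores[label] = scores.get(label, 0.0) + similarity
    return max(scores, key=scores.get)
```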
26. Learning Language-vision Correspondences. Jamieson, Michael, 15 February 2011.
Given an unstructured collection of captioned images of cluttered scenes featuring a variety of objects, our goal is to simultaneously learn the names and appearances of the objects. Only a small fraction of local features within any given image are associated with a particular caption word, and captions may contain irrelevant words not associated with any image object. We propose a novel algorithm that uses the repetition of feature neighborhoods across training images and a measure of correspondence with caption words to learn meaningful feature configurations (representing named objects). We also introduce a graph-based appearance model that captures some of the structure of an object by encoding the spatial relationships among the local visual features. In an iterative procedure we use language (the words) to drive a perceptual grouping process that assembles an appearance model for a named object. We also exploit co-occurrences among appearance models to learn hierarchical appearance models. Results of applying our method to three data sets in a variety of conditions demonstrate that from complex, cluttered, real-world scenes with noisy captions, we can learn both the names and appearances of objects, resulting in a set of models invariant to translation, scale, orientation, occlusion, and minor changes in viewpoint or articulation. These named models, in turn, are used to automatically annotate new, uncaptioned images, thereby facilitating keyword-based image retrieval.
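The thesis's correspondence measure and perceptual grouping procedure are not reproduced here; as a simple illustration of the underlying idea, a caption word and a visual feature cluster can be scored by how much more often they co-occur across the collection than chance would predict. The pointwise mutual information score below is a stand-in assumption, not the algorithm in the thesis.

```python
import math
from collections import defaultdict

def word_feature_pmi(captioned_images):
    """Score (caption word, visual feature cluster) pairs by pointwise mutual
    information over a collection. captioned_images is a list of
    (set_of_words, set_of_feature_ids) pairs; high-scoring pairs are
    candidate name/appearance correspondences."""
    n = float(len(captioned_images))
    word_count = defaultdict(int)
    feat_count = defaultdict(int)
    pair_count = defaultdict(int)
    for words, feats in captioned_images:
        for w in words:
            word_count[w] += 1
        for f in feats:
            feat_count[f] += 1
        for w in words:
            for f in feats:
                pair_count[(w, f)] += 1
    return {
        (w, f): math.log((c / n) / ((word_count[w] / n) * (feat_count[f] / n)))
        for (w, f), c in pair_count.items()
    }

# Example: 'cup' should pair strongly with the feature cluster that appears
# whenever (and only when) cups are present in the image.
```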
28. Using Local Invariant in Occluded Object Recognition by Hopfield Neural Network. Tzeng, Chih-Hung, 11 July 2003.
In this research, we propose a novel local invariant for 2-D image contour recognition based on a Hopfield-Tank neural network. First, feature points are extracted at positions of high curvature and at corners on the contour, and the contour is described by a polygonal approximation. Two patterns are defined: a model pattern and a test pattern. The Hopfield-Tank network is then employed to perform feature matching between them. Our results show that the method handles test patterns subjected to translation, rotation, and scaling, whether the pattern appears alone or is partially occluded.
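A minimal sketch of Hopfield-Tank feature matching, assuming each feature point carries a scalar local invariant (for example, a curvature measure unchanged by translation, rotation, and scale): match neurons are excited by invariant compatibility and inhibited by one-to-one conflicts, and the network relaxes towards a consistent assignment. The energy terms and parameters are illustrative, not the formulation used in this work.

```python
import numpy as np

def hopfield_tank_match(model_inv, test_inv, steps=200, dt=0.1,
                        excite=5.0, inhibit=3.0, temperature=0.5):
    """Match model contour features to test features with a continuous
    Hopfield-Tank network. model_inv and test_inv hold one scalar local
    invariant per feature point. Neuron v[i, k] represents the hypothesis
    'model feature i corresponds to test feature k'; compatible pairs are
    excited, while multiple matches in the same row or column are inhibited,
    so the network relaxes towards a one-to-one assignment."""
    model_inv = np.asarray(model_inv, dtype=float)
    test_inv = np.asarray(test_inv, dtype=float)
    compat = np.exp(-np.abs(np.subtract.outer(model_inv, test_inv)))  # in (0, 1]
    u = 0.01 * np.random.randn(*compat.shape)    # internal neuron states
    v = 1.0 / (1.0 + np.exp(-u / temperature))   # neuron outputs in (0, 1)
    for _ in range(steps):
        row_conflict = v.sum(axis=1, keepdims=True) - v
        col_conflict = v.sum(axis=0, keepdims=True) - v
        du = -u + excite * (compat - 0.5) - inhibit * (row_conflict + col_conflict)
        u = u + dt * du
        v = 1.0 / (1.0 + np.exp(-u / temperature))
    return v > 0.5   # surviving hypotheses define the feature correspondences
```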
29. Towards 3D vision from range images: an optimisation framework and parallel distributed networks. Ziqing Li, S., January 1991.
No description available.
30. Learning Patch-based Structural Element Models with Hierarchical Palettes. Chua, Jeroen, 21 November 2012.
Image patches can be factorized into ‘shapelets’ that describe segmentation patterns, and palettes that describe how to paint the segments. This allows a flexible factorization of local shape (segmentation patterns) and appearance (palettes), which we argue is useful for tasks such as object and scene recognition. Here, we introduce the ‘shapelet’ model, a framework that is able to learn a library of ‘shapelet’ segmentation patterns to capture local shape, and hierarchical palettes of colors to capture appearance. Using a learned shapelet library, image patches can be analyzed using a variational technique to produce descriptors that separately describe local shape and local appearance. These descriptors can be used for high-level vision tasks, such as object and scene recognition. We show that the shapelet model is competitive with SIFT-based methods and structure element (stel) model variants on the object recognition datasets Caltech28 and Caltech101, and the scene recognition dataset All-I-Have-Seen.
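The thesis analyzes patches with a variational technique; as a simpler stand-in that conveys the shape/appearance factorization, the sketch below picks, for a given patch, the shapelet (segmentation pattern) from a library whose segments are most uniform in colour, and reports the per-segment mean colours as the palette. The library format and the residual criterion are assumptions for illustration.

```python
import numpy as np

def describe_patch(patch, shapelet_library):
    """Factorize an image patch into a shapelet id (local shape) and a
    palette (local appearance). Each entry of shapelet_library assigns every
    patch pixel a segment id; the chosen shapelet is the one whose segments
    are most uniform in colour, and the palette holds per-segment mean colours."""
    pixels = patch.reshape(-1, patch.shape[-1]).astype(float)     # (P, channels)
    best = None
    for idx, seg in enumerate(shapelet_library):
        seg = np.asarray(seg).reshape(-1)                         # (P,) segment ids
        palette = np.stack([
            pixels[seg == s].mean(axis=0) if np.any(seg == s)
            else np.zeros(pixels.shape[1])
            for s in range(int(seg.max()) + 1)
        ])
        residual = np.sum((pixels - palette[seg]) ** 2)           # within-segment spread
        if best is None or residual < best[0]:
            best = (residual, idx, palette)
    _, shapelet_id, palette = best
    return shapelet_id, palette   # local shape and local appearance, described separately
```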