41 |
Embodied Visual Object Recognition / Förkroppsligad objektigenkänning
Wallenberg, Marcus, January 2017
Object recognition is a skill we as humans often take for granted. Due to our formidable object learning, recognition and generalisation skills, it is sometimes hard to see the multitude of obstacles that need to be overcome in order to replicate this skill in an artificial system. Object recognition is also one of the classical areas of computer vision, and many ways of approaching the problem have been proposed. Recently, visually capable robots and autonomous vehicles have increased the focus on embodied recognition systems and active visual search. These applications demand that systems can learn and adapt to their surroundings, and arrive at decisions in a reasonable amount of time, while maintaining high object recognition performance. This is especially challenging due to the high dimensionality of image data. In cases where end-to-end learning from pixels to output is needed, mechanisms designed to make inputs tractable are often necessary for less computationally capable embodied systems. Active visual search also means that mechanisms for attention and gaze control are integral to the object recognition procedure. Therefore, the way in which attention mechanisms should be introduced into feature extraction and estimation algorithms must be carefully considered when constructing a recognition system. This thesis describes work done on the components necessary for creating an embodied recognition system, specifically in the areas of decision uncertainty estimation, object segmentation from multiple cues, adaptation of stereo vision to a specific platform and setting, problem-specific feature selection, efficient estimator training and attentional modulation in convolutional neural networks. Contributions include the evaluation of methods and measures for predicting the potential uncertainty reduction that can be obtained from additional views of an object, allowing for adaptive target observations.
Also, in order to separate a specific object from other parts of a scene, it is often necessary to combine multiple cues such as colour and depth in order to obtain satisfactory results. Therefore, a method for combining these using channel coding has been evaluated. In order to make use of three-dimensional spatial structure in recognition, a novel stereo vision algorithm extension along with a framework for automatic stereo tuning have also been investigated. Feature selection and efficient discriminant sampling for decision tree-based estimators have also been implemented. Finally, attentional multi-layer modulation of convolutional neural networks for recognition in cluttered scenes has been evaluated. Several of these components have been tested and evaluated on a purpose-built embodied recognition platform known as Eddie the Embodied. / Embodied Visual Object Recognition / FaceTrack
|
42 |
How to Play Twenty Questions with Nature and Win
Richards, Whitman, 01 December 1982
The 20 Questions Game played by children has an impressive record of rapidly guessing an arbitrarily selected object with rather few, well-chosen questions. This same strategy can be used to drive the perceptual process, likewise beginning the search with the intent of deciding whether the object is Animal-Vegetable-or-Mineral. For a perceptual system, however, several simple questions are required even to make this first judgment as to the Kingdom to which the object belongs. Nevertheless, the answers to these first simple questions, or their modular outputs, provide a rich database which can serve to classify objects or events in much more detail than one might expect, thanks to constraints and laws imposed upon natural processes and things. The questions, then, suggest a useful set of primitive modules for initializing perception.
|
43 |
Recognition and Structure from One 2D Model View: Observations on Prototypes, Object Classes and Symmetries
Poggio, Tomaso, Vetter, Thomas, 01 February 1992
In this note we discuss how recognition can be achieved from a single 2D model view exploiting prior knowledge of an object's structure (e.g. symmetry). We prove that for any bilaterally symmetric 3D object one non-accidental 2D model view is sufficient for recognition. Symmetries of higher order allow the recovery of structure from one 2D view. Linear transformations can be learned exactly from a small set of examples in the case of "linear object classes" and used to produce new views of an object from a single view.
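The "linear object classes" claim can be illustrated with a minimal sketch. It assumes orthographic projection, a rotation about the y-axis, and an object whose 3D shape is an exact linear combination of two prototypes; the prototype shapes and all function names are hypothetical and are not taken from the paper. The coefficients are fitted from one view and then reused on the prototypes' views in the new pose.

```python
import math

def project(points, theta):
    """Rotate 3D points about the y-axis by theta, then drop z (orthographic view).
    Returns the 2D view flattened to [x0, y0, x1, y1, ...]."""
    c, s = math.cos(theta), math.sin(theta)
    flat = []
    for x, y, z in points:
        flat.extend([c * x + s * z, y])
    return flat

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_two_coefficients(view, proto1_view, proto2_view):
    """Least-squares fit of view ~ a1 * proto1_view + a2 * proto2_view
    via the 2x2 normal equations (Cramer's rule)."""
    g11, g12, g22 = dot(proto1_view, proto1_view), dot(proto1_view, proto2_view), dot(proto2_view, proto2_view)
    b1, b2 = dot(proto1_view, view), dot(proto2_view, view)
    det = g11 * g22 - g12 * g12
    return (b1 * g22 - b2 * g12) / det, (g11 * b2 - g12 * b1) / det

def synthesize(a1, a2, proto1_view, proto2_view):
    """Apply the fitted coefficients to the prototypes' views in another pose."""
    return [a1 * p + a2 * q for p, q in zip(proto1_view, proto2_view)]

# Hypothetical prototype shapes (assumption for illustration only).
PROTO1 = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
PROTO2 = [(0, 0, 0), (2, 0, 1), (0, 1, 1), (1, 1, 2)]
# A novel object that lies in the linear class spanned by the prototypes.
novel = [tuple(0.3 * a + 0.7 * b for a, b in zip(p, q)) for p, q in zip(PROTO1, PROTO2)]

theta = math.radians(30)
# Fit coefficients from the single frontal view of the novel object.
a1, a2 = fit_two_coefficients(project(novel, 0.0), project(PROTO1, 0.0), project(PROTO2, 0.0))
# Predict the rotated view without ever using the novel object's 3D shape.
predicted = synthesize(a1, a2, project(PROTO1, theta), project(PROTO2, theta))
actual = project(novel, theta)
max_error = max(abs(p - q) for p, q in zip(predicted, actual))
```

The prediction is exact here only because the novel object was constructed inside the span of the prototypes and the prototypes' frontal views are linearly independent; for real objects the fit would be approximate.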
|
44 |
Bringing the Grandmother Back into the Picture: A Memory-Based View of Object Recognition
Edelman, Shimon, Poggio, Tomaso, 01 April 1990
We describe experiments with a versatile pictorial prototype-based learning scheme for 3D object recognition. The Generalized Radial Basis Function (GRBF) scheme seems to be amenable to realization in biophysical hardware because the only kind of computation it involves can be effectively carried out by combining receptive fields. Furthermore, the scheme is computationally attractive because it brings together the old notion of a "grandmother" cell and the rigorous approximation methods of regularization and splines.
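A minimal sketch of a Gaussian radial basis function network of the kind the abstract refers to follows. It assumes one basis unit centred on each stored view, a fixed width `sigma`, and toy 2D feature vectors; these choices and all names are illustrative assumptions, not the authors' configuration. Each unit behaves like a receptive field tuned to one stored view, and the output is a weighted sum of unit activities.

```python
import math

def gaussian(u, v, sigma):
    """Response of a unit centred on stored view v to input view u."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def train_grbf(views, targets, sigma):
    """Find weights so that the network reproduces the targets at the stored views."""
    G = [[gaussian(u, v, sigma) for v in views] for u in views]
    return solve(G, targets)

def grbf_output(x, views, weights, sigma):
    """Weighted sum of the Gaussian units' responses to view x."""
    return sum(w * gaussian(x, v, sigma) for w, v in zip(weights, views))

# Toy data (assumption): three stored views of the target object (label 1.0)
# and three stored views of a distractor (label 0.0).
VIEWS = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 3.0), (4.0, 3.0), (3.0, 4.0)]
TARGETS = [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
SIGMA = 1.0
WEIGHTS = train_grbf(VIEWS, TARGETS, SIGMA)
```

Calling `grbf_output((0.5, 0.5), VIEWS, WEIGHTS, SIGMA)` evaluates a novel view lying between the stored views of the target object; thresholding the output at 0.5 would be one simple way to turn it into a recognition decision.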
|
45 |
The Combinatorics of Heuristic Search Termination for Object Recognition in Cluttered Environments
Grimson, W. Eric L., 01 May 1989
Many recognition systems use constrained search to locate objects in cluttered environments. Earlier analysis showed that the expected search is quadratic in the number of model and data features, if all the data comes from one object, but is exponential when spurious data is included. To overcome this, many methods terminate search once an interpretation that is "good enough" is found. We formally examine the combinatorics of this, showing that correct termination procedures dramatically reduce search. We provide conditions on the object model and the scene clutter such that the expected search is quartic. These results are shown to agree with empirical data for cluttered object recognition.
|
46 |
On the Verification of Hypothesized Matches in Model-Based Recognition
Grimson, W. Eric L., Huttenlocher, Daniel P., 01 May 1989
In model-based recognition, ad hoc techniques are used to decide if a match of data to model is correct. Generally an empirically determined threshold is placed on the fraction of model features that must be matched. We rigorously derive conditions under which to accept a match, relating the probability of a random match to the fraction of model features accounted for, as a function of the number of model features, number of image features and the sensor noise. We analyze some existing recognition systems and show that our method yields results comparable with experimental data.
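The idea of tying the acceptance threshold to the probability of a random match can be sketched with a deliberately simplified model. This is not the derivation from the paper: it assumes that each model feature is matched by chance independently, that the per-feature chance probability grows with the number of image features and with a hypothetical `noise_area_fraction` (the fraction of the image covered by one feature's error region), and that the number of chance matches is therefore binomial.

```python
from math import comb

def chance_match_probability(num_image_features, noise_area_fraction):
    """Probability that at least one image feature falls, by chance,
    inside the error region of a given model feature (simplifying assumption)."""
    return 1.0 - (1.0 - noise_area_fraction) ** num_image_features

def tail_probability(m, k, p):
    """P(at least k of m model features are matched by chance), binomial model."""
    return sum(comb(m, j) * p ** j * (1.0 - p) ** (m - j) for j in range(k, m + 1))

def minimum_fraction(m, num_image_features, noise_area_fraction, delta):
    """Smallest fraction of model features that must be matched so that a
    purely random match reaches it with probability at most delta.
    Returns None if even a full match is too likely to occur by chance."""
    p = chance_match_probability(num_image_features, noise_area_fraction)
    for k in range(m + 1):
        if tail_probability(m, k, p) <= delta:
            return k / m
    return None
```

Under these assumptions, `minimum_fraction(20, 500, 0.001, 0.001)` demands a larger matched fraction than `minimum_fraction(20, 50, 0.001, 0.001)`, which reflects the qualitative point that a fixed, empirically chosen threshold cannot be right for both sparse and cluttered images.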
|
47 |
Observations on Cortical Mechanisms for Object Recognition and Learning
Poggio, Tomaso, Hurlbert, Anya, 01 December 1993
This paper sketches a hypothetical cortical architecture for visual 3D object recognition based on a recent computational model. The view-centered scheme relies on modules for learning from examples, such as Hyperbf-like networks. Such models capture a class of explanations we call Memory-Based Models (MBM) that contains sparse population coding, memory-based recognition, and codebooks of prototypes. Unlike the sigmoidal units of some artificial neural networks, the units of MBMs are consistent with the description of cortical neurons. We describe how an example of MBM may be realized in terms of cortical circuitry and biophysical mechanisms, consistent with psychophysical and physiological data.
|
48 |
Stimulus Simplification and Object Representation: A Modeling Study
Knoblich, Ulf, Riesenhuber, Maximilian, 15 March 2002
Tsunoda et al. (2001) recently studied the nature of object representation in monkey inferotemporal cortex using a combination of optical imaging and extracellular recordings. In particular, they examined IT neuron responses to complex natural objects and "simplified" versions thereof. In that study, in 42% of the cases, optical imaging revealed a decrease in the number of activation patches in IT as stimuli were "simplified". However, in 58% of the cases, "simplification" of the stimuli actually led to the appearance of additional activation patches in IT. Based on these results, the authors propose a scheme in which an object is represented by combinations of active and inactive columns coding for individual features. We examine the patterns of activation caused by the same stimuli as used by Tsunoda et al. in our model of object recognition in cortex (Riesenhuber 99). We find that object-tuned units can show a pattern of appearance and disappearance of features identical to the experiment. Thus, the data of Tsunoda et al. appear to be in quantitative agreement with a simple object-based representation in which an object's identity is coded by its similarities to reference objects. Moreover, the agreement of simulations and experiment suggests that the simplification procedure used by Tsunoda et al. (2001) is not necessarily an accurate method to determine neuronal tuning.
|
49 |
On the difficulty of feature-based attentional modulations in visual object recognition: A modeling study
Schneider, Robert, Riesenhuber, Maximilian, 14 January 2004
Numerous psychophysical experiments have shown an important role for attentional modulations in vision. Behaviorally, allocation of attention can improve performance in object detection and recognition tasks. At the neural level, attention increases firing rates of neurons in visual cortex whose preferred stimulus is currently attended to. However, it is not yet known how these two phenomena are linked, i.e., how the visual system could be "tuned" in a task-dependent fashion to improve task performance. To answer this question, we performed simulations with the HMAX model of object recognition in cortex [45]. We modulated firing rates of model neurons in accordance with experimental results about effects of feature-based attention on single neurons and measured changes in the model's performance in a variety of object recognition tasks. It turned out that recognition performance could only be improved under very limited circumstances and that attentional influences on the process of object recognition per se tend to display a lack of specificity or raise false alarm rates. These observations lead us to postulate a new role for the observed attention-related neural response modulations.
|
50 |
Rotation Invariant Object Recognition from One Training Example
Yokono, Jerry Jun, Poggio, Tomaso, 27 April 2004
Local descriptors are increasingly used for the task of object recognition because of their perceived robustness with respect to occlusions and to global geometrical deformations. Such a descriptor, based on a set of oriented Gaussian derivative filters, is used in our recognition system. We report here an evaluation of several techniques for orientation estimation to achieve rotation invariance of the descriptor. We also describe feature selection based on a single training image. Virtual images are generated by rotating and rescaling the image and robust features are selected. The results confirm robust performance in cluttered scenes, in the presence of partial occlusions, and when the object is embedded in different backgrounds.
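One common way to obtain rotation invariance from oriented derivative filters is to estimate a dominant orientation from the local gradient and then steer the filter to it. The sketch below illustrates only that principle: it uses central differences in place of Gaussian derivative filters and a synthetic ramp image, and it is not claimed to be one of the techniques evaluated in the paper. All function names are hypothetical.

```python
import math

def ramp_image(size, phi):
    """Synthetic image whose intensity increases linearly along direction phi."""
    return [[x * math.cos(phi) + y * math.sin(phi) for x in range(size)]
            for y in range(size)]

def gradient_at(image, x, y):
    """Central-difference gradient (a stand-in for first-order Gaussian derivatives)."""
    gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
    gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
    return gx, gy

def dominant_orientation(image, x, y):
    """Orientation estimate from the gradient direction."""
    gx, gy = gradient_at(image, x, y)
    return math.atan2(gy, gx)

def steered_derivative(image, x, y, theta):
    """First derivative steered to angle theta: cos(theta) * Gx + sin(theta) * Gy."""
    gx, gy = gradient_at(image, x, y)
    return math.cos(theta) * gx + math.sin(theta) * gy
```

When the filter is steered to the estimated orientation, the response on `ramp_image(9, phi)` is the same for every `phi`, which is the invariance property a rotation-normalised descriptor relies on.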
|