391. Indexing for Visual Recognition from a Large Model Base. Breuel, Thomas M. (01 August 1990)
This paper describes a new approach to the model base indexing stage of visual object recognition. Fast model base indexing of 3D objects is achieved by accessing a database of encoded 2D views of the objects using a fast 2D matching algorithm. The algorithm is specifically intended as a plausible solution for the problem of indexing into very large model bases that general purpose vision systems and robots will have to deal with in the future. Other properties that make the indexing algorithm attractive are that it can take advantage of most geometric and non-geometric properties of features without modification, and that it addresses the incremental model acquisition problem for 3D objects.
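
As a hedged illustration of the indexing idea described above (not code from the thesis), the sketch below stores encoded 2D views in a table keyed by quantized feature vectors and retrieves candidate 3D models by voting; the feature encoding, cell size, and model names are assumptions made for the example.

```python
# A minimal sketch of view-based model indexing: each 3D model is represented
# by a set of encoded 2D views, and each view is reduced to a coarse, quantized
# key so that a new image can vote for candidate models in roughly constant time.
from collections import defaultdict


def quantize(features, cell=0.1):
    """Map a 2D view's feature vector to a coarse grid cell; similar views share a key."""
    return tuple(round(f / cell) for f in features)


def build_index(model_views):
    """model_views: dict of model_id -> list of encoded 2D view feature vectors."""
    index = defaultdict(set)
    for model_id, views in model_views.items():
        for view in views:
            index[quantize(view)].add(model_id)
    return index


def lookup(index, image_features):
    """Vote for models whose stored views match any feature vector observed in the image."""
    votes = defaultdict(int)
    for feat in image_features:
        for model_id in index.get(quantize(feat), ()):
            votes[model_id] += 1
    return sorted(votes, key=votes.get, reverse=True)


# Toy usage: two models, each described by one or two encoded views.
index = build_index({"mug": [(0.12, 0.48), (0.31, 0.77)], "stapler": [(0.92, 0.18)]})
print(lookup(index, [(0.13, 0.51), (0.88, 0.21)]))  # ['mug', 'stapler']
```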

392. Context-Based Vision System for Place and Object Recognition. Torralba, Antonio; Murphy, Kevin P.; Freeman, William T.; Rubin, Mark A. (19 March 2003)
While navigating in an environment, a vision system has to be able to recognize where it is and what the main objects in the scene are. In this paper we present a context-based vision system for place and object recognition. The goal is to identify familiar locations (e.g., office 610, conference room 941, Main Street), to categorize new environments (office, corridor, street) and to use that information to provide contextual priors for object recognition (e.g., table, chair, car, computer). We present a low-dimensional global image representation that provides relevant information for place recognition and categorization, and how such contextual information introduces strong priors that simplify object recognition. We have trained the system to recognize over 60 locations (indoors and outdoors) and to suggest the presence and locations of more than 20 different object types. The algorithm has been integrated into a mobile system that provides real-time feedback to the user.
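
As a rough sketch of how a low-dimensional global representation can drive place recognition and contextual priors, the example below uses a crude average-pooled descriptor in place of the paper's filter-based features; the place models, prior table, and all numbers are invented for illustration.

```python
# Hedged illustration: a coarse global descriptor (average-pooled grayscale)
# stands in for the paper's low-dimensional image representation. The place is
# recognized by nearest-neighbor matching, and the recognized place indexes a
# table of contextual object priors. All names and values here are made up.
import numpy as np


def global_descriptor(image, grid=4):
    """Average-pool a grayscale image onto a grid x grid layout and L2-normalize."""
    h, w = image.shape
    img = image[: h - h % grid, : w - w % grid]
    pooled = img.reshape(grid, img.shape[0] // grid, grid, img.shape[1] // grid).mean(axis=(1, 3))
    vec = pooled.ravel()
    return vec / (np.linalg.norm(vec) + 1e-8)


def recognize_place(descriptor, place_models):
    """Return the place whose stored descriptor is closest to the query descriptor."""
    return min(place_models, key=lambda name: np.linalg.norm(place_models[name] - descriptor))


# Contextual priors: which objects are plausible given the recognized place.
object_priors = {"office": {"computer": 0.8, "chair": 0.7, "car": 0.01},
                 "street": {"computer": 0.05, "chair": 0.1, "car": 0.7}}

rng = np.random.default_rng(0)
place_models = {"office": global_descriptor(rng.random((128, 128))),
                "street": global_descriptor(rng.random((128, 128)))}
query = global_descriptor(rng.random((128, 128)))
place = recognize_place(query, place_models)
print(place, object_priors[place])
```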

393. Component-based recognition of objects in an office environment. Morgenstern, Christian; Heisele, Bernd (28 November 2003)
We present a component-based approach for recognizing objects under large pose changes. From a set of training images of a given object we extract a large number of components, which are clustered based on the similarity of their image features and their locations within the object image. The cluster centers form an initial set of component templates from which we select a subset for the final recognizer. In experiments we evaluate different sizes and types of components and three standard techniques for component selection. The component classifiers are finally compared to global classifiers on a database of four objects.
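
A hedged sketch of the component-template step described above: patches are cut from a training image and clustered by appearance, and the cluster centers serve as initial component templates. The patch size, plain k-means clustering, and all parameters are illustrative assumptions, not the authors' pipeline.

```python
# Minimal illustration: extract fixed-size patches, cluster them with k-means,
# and treat the cluster centers as an initial set of component templates.
import numpy as np


def extract_patches(image, size=8, stride=8):
    """Cut non-overlapping size x size patches and return them as flat vectors."""
    h, w = image.shape
    return np.array([image[r:r + size, c:c + size].ravel()
                     for r in range(0, h - size + 1, stride)
                     for c in range(0, w - size + 1, stride)])


def kmeans(patches, k=10, iters=20, seed=0):
    """Plain k-means; the cluster centers act as the initial component templates."""
    rng = np.random.default_rng(seed)
    centers = patches[rng.choice(len(patches), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(patches[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = patches[labels == j].mean(axis=0)
    return centers


templates = kmeans(extract_patches(np.random.rand(64, 64)))
print(templates.shape)  # (10, 64): ten flattened 8x8 component templates
```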

394. Nanotechnology for Molecular Recognition of Biological Analytes. Triulzi, Robert C. (23 January 2009)
Nanotechnology is a term used to describe nanometer-scale systems. This thesis presents various nanomaterials and systems for the investigation of biologically relevant analytes in general, and in particular for their detection, decontamination, or destruction. The validation of short peptide fragments as models for protein aggregation is initially discussed through applying spectroscopic and microscopic techniques to Langmuir monolayer surface chemistry. Following this validation, the use of nanogold as a photoablative material for the destruction of aggregated protein is investigated. Subsequently, the versatility of nanotechnology is shown by investigating a different form of nanogold, namely gold quantum dots, and the interesting phenomena that arise when dealing with materials on a nanoscale. Experiments involving a complex between these gold quantum dots and an antibody are performed for the detection of an immunoglobulin in solution. The power of this analytical technique is highlighted by the capability of detecting the analyte at nanomolar concentrations. Finally, a limitation of quantum dots, the multiple synthetic steps necessary for imparting biological activity, is addressed: a single-step reaction is studied that allows for direct stabilization and conjugation of quantum dots with proteins and enzymes. As a representative application of the above-mentioned procedure, the detection and decontamination of an organophosphorus compound is explored. In general, methods for overcoming limitations of nanoparticles and nanocrystals are discussed.

395. Human Identification Based on Three-Dimensional Ear and Face Models. Cadavid, Steven (05 May 2011)
We propose three biometric systems for performing 1) multi-modal Three-Dimensional (3D) ear + Two-Dimensional (2D) face recognition, 2) 3D face recognition, and 3) hybrid 3D ear recognition combining local and holistic features. For the 3D ear component of the multi-modal system, uncalibrated video sequences are utilized to recover the 3D ear structure of each subject within a database. For a given subject, a series of frames is extracted from a video sequence and the Region-of-Interest (ROI) in each frame is independently reconstructed in 3D using Shape from Shading (SFS). A fidelity measure is then employed to determine the model that most accurately represents the 3D structure of the subject's ear. Shape matching between a probe and gallery ear model is performed using the Iterative Closest Point (ICP) algorithm. For the 2D face component, a set of facial landmarks is extracted from frontal facial images using the Active Shape Model (ASM) technique. Then, the responses of the facial images to a series of Gabor filters at the locations of the facial landmarks are calculated. The Gabor features are stored in the database as the face model for recognition. Match-score level fusion is employed to combine the match scores obtained from both the ear and face modalities. The aim of the proposed system is to demonstrate the superior performance that can be achieved by combining the 3D ear and 2D face modalities over either modality employed independently.
For the 3D face recognition system, we employ an AdaBoost algorithm to build a classifier based on geodesic distance features. Firstly, a generic face model is finely conformed to each face model contained within a 3D face dataset. Secondly, the geodesic distances between anatomical point pairs are computed across each conformed generic model using the Fast Marching Method. The AdaBoost algorithm then generates a strong classifier based on a collection of geodesic distances that are most discriminative for face recognition. The identification and verification performances of three AdaBoost algorithms, namely the original AdaBoost algorithm proposed by Freund and Schapire and two variants, the Gentle and Modest AdaBoost algorithms, are compared.
For the hybrid 3D ear recognition system, we propose a method to combine local and holistic ear surface features in a computationally efficient manner. The system comprises four primary components, namely, 1) ear image segmentation, 2) local feature extraction and matching, 3) holistic feature extraction and matching, and 4) a fusion framework combining local and holistic features at the match-score level. For the segmentation component, we employ our method proposed in [111] to localize a rectangular region containing the ear. For the local feature extraction and representation component, we extend the Histogram of Categorized Shapes (HCS) feature descriptor, proposed in [111], to an object-centered 3D shape descriptor, termed Surface Patch Histogram of Indexed Shapes (SPHIS), for surface patch representation and matching. For the holistic matching component, we introduce a voxelization scheme for holistic ear representation from which an efficient, element-wise comparison of gallery-probe model pairs can be made. The match scores obtained from both the local and holistic matching components are fused to generate the final match scores.
Experiments conducted on the University of Notre Dame (UND) Collection J2 dataset demonstrate that the proposed approach outperforms state-of-the-art 3D ear biometric systems in both accuracy and efficiency.
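
Because both the multi-modal and the hybrid systems above rely on match-score level fusion, a minimal sketch of that step is given below. The min-max normalization and the equal weights are illustrative assumptions rather than the values used in the thesis, and the scores are treated as similarities (higher is better), whereas an ICP matcher would typically produce distances.

```python
# Hedged sketch of match-score level fusion: normalize each modality's scores,
# then combine them with a weighted sum and pick the best gallery entry.
def min_max_normalize(scores):
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]


def fuse(ear_scores, face_scores, w_ear=0.5, w_face=0.5):
    """Fuse per-gallery match scores from the ear and face matchers."""
    ear_n = min_max_normalize(ear_scores)
    face_n = min_max_normalize(face_scores)
    return [w_ear * e + w_face * f for e, f in zip(ear_n, face_n)]


# Toy gallery of three subjects: a higher fused score means a better match.
fused = fuse(ear_scores=[0.2, 0.9, 0.4], face_scores=[0.1, 0.7, 0.8])
print(max(range(len(fused)), key=fused.__getitem__))  # index of the best match
```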

396. Team behavior recognition using dynamic Bayesian networks. Gaitanis, Konstantinos (31 October 2008)
This doctoral thesis analyzes the concepts involved in the decision-making of groups of agents and applies them to build a theoretical and practical framework for recognizing team behaviors.
We present an overview of the theory of intention, studied in the past by major theorists such as Searle, Bratmann and Cohen, and show its connection to more recent research in the field of behavior recognition.
We examine the advantages and drawbacks of the state-of-the-art techniques in this field and build a new model for representing and detecting team behaviors. This new model, called the Multiagent-Abstract Hidden Markov mEmory Model (M-AHMEM), results from the fusion of existing models, the aim being a unified approach to the problem. The largest part of this thesis is devoted to a detailed presentation of the M-AHMEM and of the algorithm responsible for behavior recognition.
Our model is tested on two different applications: human gesture analysis and multimodal fusion of audio and video data. Through these two applications, we argue that a dataset consisting of several correlated variables can be analyzed effectively within a unified behavior-recognition framework. We show that the correlation between the different variables can be modeled as cooperation taking place within a team, and that behavior recognition constitutes a modern approach to classification and pattern recognition.
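
To make the probabilistic machinery concrete, here is a deliberately simplified sketch: forward filtering in a single hidden Markov model whose hidden state is the team behavior. The actual M-AHMEM is a hierarchical, multi-agent dynamic Bayesian network; the behaviors, transition, emission, and observation values below are invented for the example.

```python
# Toy forward filtering over a team-behavior HMM: at each step, predict the
# behavior distribution with the transition model, then reweight by the
# likelihood of the observed evidence and renormalize.
import numpy as np

behaviors = ["attack", "defend"]
T = np.array([[0.9, 0.1],       # P(next behavior | current behavior)
              [0.2, 0.8]])
E = np.array([[0.7, 0.2, 0.1],  # P(observation | behavior), 3 discrete observations
              [0.1, 0.3, 0.6]])
prior = np.array([0.5, 0.5])


def filter_behavior(observations):
    """Return P(behavior_t | observations up to t) for every time step."""
    belief = prior.copy()
    history = []
    for obs in observations:
        belief = E[:, obs] * (T.T @ belief)  # predict one step, then weight by evidence
        belief /= belief.sum()
        history.append(belief.copy())
    return history


for step, b in enumerate(filter_behavior([0, 0, 2, 2])):
    print(step, dict(zip(behaviors, b.round(2))))
```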

397. Sketch Recognition on Mobile Devices. Lucchese, George (14 March 2013)
Sketch recognition allows computers to understand and model hand-drawn sketches and diagrams. Traditionally, sketch recognition systems required a pen-based PC interface, but powerful mobile devices such as tablets and smartphones can provide a new platform for sketch recognition systems. We describe a new sketch recognition library, Strontium (SrL), which combines several existing sketch recognition libraries modified to run on both personal computers and on the Android platform. We analyzed the recognition speed and accuracy implications of performing low-level shape recognition on smartphones with touch screens. We found that there is a large gap in recognition speed on mobile devices between recognizing simple shapes and more complex ones, suggesting that mobile sketch interface designers should limit the complexity of their sketch domains. We also found that a low sampling rate on mobile devices can affect the recognition accuracy of complex and curved shapes. Despite this, we found no evidence to suggest that using a finger as an input implement leads to a decrease in simple shape recognition accuracy. These results show that the same geometric shape recognizers developed for pen applications can be used in mobile applications, provided that developers keep shape domains simple and ensure that the input sampling rate is kept as high as possible.
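
A hedged sketch of the kind of preprocessing the sampling-rate finding suggests (generic code, not part of Strontium): resampling a finger- or pen-drawn stroke to a fixed number of evenly spaced points before geometric shape recognition, so that sparsely sampled curved strokes are handled more uniformly.

```python
# Resample a stroke (list of (x, y) points) to n points spaced evenly along
# its arc length, a standard preprocessing step for geometric recognizers.
import math


def resample(points, n=64):
    lengths = [math.dist(points[i - 1], points[i]) for i in range(1, len(points))]
    step = sum(lengths) / (n - 1)
    out, acc = [points[0]], 0.0
    for (p, q), seg in zip(zip(points, points[1:]), lengths):
        acc += seg
        while acc >= step and len(out) < n:
            # Interpolate the next output point along the current segment.
            t = 1 - (acc - step) / seg if seg else 0.0
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
            acc -= step
    while len(out) < n:  # guard against floating-point shortfall at the end
        out.append(points[-1])
    return out


print(len(resample([(0, 0), (3, 0), (3, 4)], n=8)))  # 8
```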

398. Nonlinear compensation and heterogeneous data modeling for robust speech recognition. Zhao, Yong (21 February 2013)
The goal of robust speech recognition is to maintain satisfactory recognition accuracy under mismatched operating conditions. This dissertation addresses the robustness issue from two directions.
In the first part of the dissertation, we propose the Gauss-Newton method as a unified approach to estimating noise parameters for use in prevalent nonlinear compensation models, such as vector Taylor series (VTS), data-driven parallel model combination (DPMC), and the unscented transform (UT), for noise-robust speech recognition. While iterative estimation of noise means in a generalized EM framework is widely known, we demonstrate that such approaches are variants of the Gauss-Newton method. Furthermore, we propose a novel noise variance estimation algorithm that is consistent with the Gauss-Newton principle. The formulation of the Gauss-Newton method reduces the noise estimation problem to determining the Jacobians of the corrupted speech parameters. For sampling-based compensations, we present two methods, sample Jacobian average (SJA) and cross-covariance (XCOV), to evaluate these Jacobians. The Gauss-Newton method is closely related to another noise estimation approach, which views the model compensation from a generative perspective, giving rise to an EM-based algorithm analogous to the ML estimation for factor analysis (EM-FA). We demonstrate a close connection between these two approaches: both belong to the family of gradient-based methods, but with different convergence rates. Note that the convergence property can be crucial to noise estimation in many applications where model compensation may have to be carried out frequently in changing noisy environments to retain the desired performance. Furthermore, several techniques are explored to further improve the nonlinear compensation approaches. To overcome the demand for clean speech data for training acoustic models, we integrate nonlinear compensation with adaptive training. We also investigate fast VTS compensation to improve the noise estimation efficiency, and combine the VTS compensation with acoustic echo cancellation (AEC) to mitigate issues due to interfering background speech. The proposed noise estimation algorithm is evaluated for various compensation models on three tasks: the first is to fit a GMM to artificially corrupted samples, the second is to perform speech recognition on the Aurora 2 database, and the third is speech recognition on a corpus simulating a meeting of multiple competing speakers. The significant performance improvements confirm the efficacy of the Gauss-Newton method in estimating the noise parameters of the nonlinear compensation models.
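
As a hedged illustration of the kind of compensation model and update involved (generic notation, not necessarily the dissertation's, with the convolutional channel term and per-Gaussian posterior weights omitted for brevity), the standard VTS mismatch function and a schematic Gauss-Newton step for the noise mean are:

```latex
% Standard VTS mismatch function in the cepstral domain: x = clean speech,
% n = additive noise, C = DCT matrix, y = noise-corrupted speech.
\[
  y \;=\; f(x, n) \;=\; x + C \log\!\bigl(1 + \exp\bigl(C^{-1}(n - x)\bigr)\bigr)
\]
% Linearizing f around the current noise mean \mu_n gives the Jacobian
% J = \partial f / \partial n, and a Gauss-Newton step refines the noise mean:
\[
  \mu_n \;\leftarrow\; \mu_n
  + \Bigl(\sum_t J_t^{\top} \Sigma_y^{-1} J_t\Bigr)^{-1}
    \sum_t J_t^{\top} \Sigma_y^{-1} \bigl(y_t - f(\mu_x, \mu_n)\bigr).
\]
```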
The second research work is devoted to developing more effective models to take full advantage of heterogeneous speech data, which are typically collected from thousands of speakers in various environments via different transducers. The proposed synchronous HMM, in contrast to the conventional HMMs, introduces an additional layer of substates between the HMM state and the Gaussian component variables. The substates have the capability to register long-span non-phonetic attributes, such as gender, speaker identity, and environmental condition, which are integrally called speech scenes in this study. The hierarchical modeling scheme allows an accurate description of probability distribution of speech units in different speech scenes. To address the data sparsity problem in estimating parameters of multiple speech scene sub-models, a decision-based clustering algorithm is presented to determine the set of speech scenes and to tie the substate parameters, allowing us to achieve an excellent balance between modeling accuracy and robustness. In addition, by exploiting the synchronous relationship among the speech scene sub-models, we propose the multiplex Viterbi algorithm to efficiently decode the synchronous HMM within a search space of the same size as for the standard HMM. The multiplex Viterbi can also be generalized to decode an ensemble of isomorphic HMM sets, a problem often arising in the multi-model systems. The experiments on the Aurora 2 task show that the synchronous HMMs produce a significant improvement in recognition performance over the HMM baseline at the expense of a moderate increase in the memory requirement and computational complexity.

399. Writer Identification by a Combination of Graphical Features in the Framework of Old Handwritten Music Scores. Fornés Bisquerra, Alicia (03 July 2009)
No description available.

400. Techniques for creating ground-truthed sketch corpora. MacLean, Scott (January 2009)
The problem of recognizing handwritten mathematics notation has been studied for over forty years with little practical success. The poor performance of math recognition systems is due, at least in part, to a lack of realistic data for use in training recognition systems and evaluating their accuracy. In fields for which such data is available, such as face and voice recognition, the data, along with objectively-evaluated recognition contests, has contributed to the rapid advancement of the state of the art.
This thesis proposes a method for constructing data corpora not only for handwritten math recognition, but for sketch recognition in general. The method consists of automatically generating template expressions, transcribing these expressions by hand, and automatically labelling them with ground-truth. This approach is motivated by practical considerations and is shown to be more extensible and objective than other potential methods.
We introduce a grammar-based approach for the template generation task. In this approach, random derivations in a context-free grammar are controlled so as to generate math expressions for transcription. The generation process may be controlled in terms of expression size and distribution over mathematical semantics.
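
A minimal sketch of the grammar-based generation idea, assuming a toy context-free grammar for math expressions and a recursion-depth limit as a crude stand-in for the thesis's control over expression size and semantic distribution; the grammar, symbols, and parameters are illustrative only.

```python
# Generate template expressions by random derivation in a small CFG.
import random

GRAMMAR = {
    "EXPR": [["EXPR", "+", "TERM"], ["TERM"]],
    "TERM": [["TERM", "*", "FACTOR"], ["FACTOR"]],
    "FACTOR": [["(", "EXPR", ")"], ["x"], ["y"], ["2"]],
}


def derive(symbol="EXPR", depth=0, max_depth=4, rng=random):
    """Expand a nonterminal by choosing productions at random, bounding recursion."""
    if symbol not in GRAMMAR:
        return [symbol]                    # terminal symbol
    rules = GRAMMAR[symbol]
    if depth >= max_depth:                 # near the limit, force the shortest rule
        rules = [min(rules, key=len)]
    tokens = []
    for s in rng.choice(rules):
        tokens.extend(derive(s, depth + 1, max_depth, rng))
    return tokens


print(" ".join(derive()))  # one random template expression, e.g. "x + 2 * y"
```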
Finally, we present a novel ground-truthing method based on matching terminal symbols in grammar derivations to recognized symbols. The matching is produced by a best-first search through symbol recognition results. Experiments show that this method is highly accurate but rejects many of its inputs.
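
A hedged sketch of the ground-truthing step, assuming each terminal symbol of the generating derivation corresponds to one recognized ink group with a ranked list of symbol hypotheses; the cost function, scores, and rejection threshold below are placeholders, not the method's actual parameters.

```python
# Best-first matching of ground-truth terminal symbols to symbol recognition
# results; if even the cheapest full assignment is too costly, the input is rejected.
import heapq


def best_first_match(terminals, candidates, reject_cost=5.0):
    """terminals: ground-truth symbols; candidates: per-ink-group lists of
    (symbol, score) pairs, where a higher score means a more confident hypothesis."""
    heap = [(0.0, 0, ())]                     # (cost so far, next terminal, assignments)
    while heap:
        cost, i, assigned = heapq.heappop(heap)
        if cost > reject_cost:
            return None                       # reject this transcription
        if i == len(terminals):
            return list(assigned)             # every terminal matched
        for symbol, score in candidates[i]:
            penalty = 0.0 if symbol == terminals[i] else 1.0
            heapq.heappush(heap, (cost + penalty + (1.0 - score), i + 1,
                                  assigned + (symbol,)))
    return None


truth = ["x", "+", "2"]
recognized = [[("x", 0.9), ("y", 0.4)], [("+", 0.8)], [("2", 0.7), ("z", 0.6)]]
print(best_first_match(truth, recognized))   # ['x', '+', '2']
```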