  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

The Link Between Image Segmentation and Image Recognition

Sharma, Karan 01 January 2012
A long-standing debate in the computer vision community concerns the link between segmentation and recognition. The question I try to answer here is: does image segmentation as a preprocessing step help image recognition? Despite a plethora of literature to the contrary, some authors have suggested that recognition driven by high-quality segmentation is the most promising approach in image recognition, because the recognition system will see only the relevant features on the object and not the redundant features outside it (Malisiewicz and Efros 2007; Rabinovich, Vedaldi, and Belongie 2007). This thesis explores the following question: if segmentation precedes recognition, and segments are fed directly to the recognition engine, will it help the recognition machinery? Another question I address in this thesis is the scalability of recognition systems. Any computer vision system, concept or algorithm, without exception, will have to address the issue of scalability if it is to stand the test of time.
2

Algorithmes d'extraction de modèles géométriques discrets pour la représentation robuste des formes / Recognition algorithms of digital geometric patterns for robust shape representation

Roussillon, Tristan 19 November 2009
The work presented in this thesis lies at the interface between image analysis and discrete geometry. Image analysis aims at automatically describing the visual content of a digital image, and discrete geometry provides tools devoted to digital image processing. A two-dimensional analog signal is regularly sampled in order to be stored and handled on computers. This acquisition process results in a digital image, which is made up of a finite set of discrete elements. Discrete geometry studies the geometric properties of such discrete spaces, which lack continuity. In this work, we consider the homogeneous regions of an image that are meaningful to a user. The objective is to represent their digital contour by means of geometric patterns or to describe it with measures. The scope of applications in image analysis is wide, for instance during segmentation or for object recognition. We focus on three discrete geometric patterns defined by Gauss digitization: the convex or concave part, the digital straight segment and the digital circular arc. We present several algorithms that detect these patterns on a digital contour. These algorithms are on-line (the decision and the parameters are updated on the fly), exact (integer-only computations without any approximation error) and fast (simplified computations thanks to arithmetic properties and linear-time complexity). Running them along a digital contour yields decompositions or reversible polygonalizations. Moreover, we define a measure of convexity, a measure of straightness and a measure of circularity.
These measures fulfil the following important properties: they are robust to rigid transformations, they may be applied to any part of a digital contour, and they reach their maximal value for the template against which the data are compared, and only for that template. From these measures, we introduce new patterns having a parameter that ranges from 0 to 1. The parameter is set to 1 when the localisation of the digital contour is reliable, but is set to a lower value when the digital contour is expected to have been shifted by acquisition noise. This measure-based approach provides a way of robustly decomposing a digital contour into convex, concave or straight parts.
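As a small illustration of the Gauss digitization mentioned in this abstract, the sketch below digitizes a disk by keeping the integer grid points that lie inside it. This is not code from the thesis: the function name `gauss_digitize_disk` is hypothetical, and treating the disk as closed (boundary points included) is an assumption.

```python
import math


def gauss_digitize_disk(cx, cy, radius):
    """Return the integer points (x, y) lying inside or on the given disk.

    Assumption: the disk is closed, so points exactly on the boundary count.
    """
    points = []
    # Scan the integer bounding box of the disk, row by row.
    for y in range(math.floor(cy - radius), math.ceil(cy + radius) + 1):
        for x in range(math.floor(cx - radius), math.ceil(cx + radius) + 1):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                points.append((x, y))
    return points


if __name__ == "__main__":
    # Digitized disk of radius 1.5 centred at the origin.
    print(gauss_digitize_disk(0, 0, 1.5))
```

The boundary of such a point set is the kind of digital contour on which the convex parts, straight segments and circular arcs described above would be detected.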
3

A Study Of Utility Of Smile Profile For Face Recognition

Bhat, Srikrishna K K 08 1900
Face recognition is one of the most natural activities performed by human beings. It has a wide range of applications in areas such as human-computer interaction, surveillance and security. Face information can be obtained in a non-intrusive manner, without violating privacy. However, robust face recognition that is invariant to pose, illumination and similar variations is still a challenging problem. The main aim of this thesis is to explore the usefulness of the smile profile of human beings as an extra aid in recognizing people by their faces. The smile profile of a person is the sequence of images captured by a camera when the person voluntarily smiles. Using a sequence of images instead of a single image significantly increases the required computational resources. The challenge is to design a technique for extracting features from a smile sample that are useful for authentication and also efficient in terms of storage and computation. There is some experimental evidence supporting the claim that facial expressions carry person-specific information. But, to the best of our knowledge, no systematic study of a particular facial expression for biometric purposes has been done so far. The smile profile, captured under a reasonably controlled setup, is used here for the first time for face recognition. As a first step, we applied two recent subspace-based face classifiers to the smile samples. We were not able to obtain any conclusive results from this experiment. Next, we extracted features using only the difference vectors obtained from smile samples. The difference vectors depend only on the variations that occur in the corresponding smile profile. Hence any characterization obtained from such features can be fully attributed to the smiling action. The feature extraction technique we employed is very similar to PCA. The smile signature we obtained is named the Principal Direction of Change (PDC).
PDC is a unit vector (in a high-dimensional space) that represents the direction in which the major changes occurred during the smile. We obtained a reasonable recognition rate by applying a Nearest Neighbor Classifier (NNC) to these features. In addition, these features turn out to be less sensitive to the speed of the smiling action and to minor variations in face detection and head orientation, while still capturing the pattern of variations that smiling causes in various regions of the face. Using a set of experiments on PDC-based features, we establish that the smile has some person-specific characteristics. However, the recognition rates of PDC-based features are lower than those of recent conventional techniques. We therefore used PDC-based features to aid a conventional face classifier, using smile signatures to reject some candidate faces. Our experiments show that smile signatures let us reject some of the false candidate faces that the conventional face classifier would have accepted. With this smile-signature-based rejection, the performance of the conventional classifier improves significantly. This improvement suggests that the biometric information available in smile profiles does not exist in still images. Hence the usefulness of smile profiles for biometric applications is established through this experimental investigation.
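The PDC idea described above can be sketched roughly as follows. This is an illustrative reading of the abstract, not the author's implementation: taking differences against the first frame, using the uncentered scatter matrix, finding its dominant eigenvector by power iteration, and comparing signatures by absolute cosine similarity are all assumptions, and every function name here is hypothetical.

```python
import math


def difference_vectors(frames):
    """Subtract the first frame from every later frame (assumed convention)."""
    base = frames[0]
    return [[x - b for x, b in zip(frame, base)] for frame in frames[1:]]


def principal_direction(diffs, iterations=200):
    """Unit vector along the dominant direction of the difference vectors.

    Power iteration on the scatter matrix S = sum(d d^T), applied without
    forming S explicitly: S v = sum((d . v) d).
    """
    dim = len(diffs[0])
    v = [float(i + 1) for i in range(dim)]  # arbitrary start vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [0.0] * dim
        for d in diffs:
            proj = sum(di * vi for di, vi in zip(d, v))
            for i, di in enumerate(d):
                w[i] += proj * di
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # start vector orthogonal to all differences
            break
        v = [x / norm for x in w]
    return v


def smile_signature(frames):
    """PDC-style signature of one smile sample (a list of flattened frames)."""
    return principal_direction(difference_vectors(frames))


def nearest_neighbor(query, gallery):
    """Label of the gallery signature most aligned with the query signature."""
    def similarity(a, b):
        # Absolute value makes the comparison independent of the vector's sign.
        return abs(sum(x * y for x, y in zip(a, b)))
    return max(gallery, key=lambda label: similarity(query, gallery[label]))


if __name__ == "__main__":
    # Toy 3-pixel "images": subject A changes mostly in pixel 0, B in pixel 1.
    smile_a = [[0.0, 0.0, 0.0], [1.0, 0.1, 0.0], [2.0, 0.1, 0.0], [3.0, 0.2, 0.0]]
    smile_b = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.1], [0.0, 2.0, 0.2]]
    gallery = {"A": smile_signature(smile_a), "B": smile_signature(smile_b)}
    probe = [[5.0, 5.0, 5.0], [5.9, 5.1, 5.0], [7.1, 5.0, 5.0]]
    print(nearest_neighbor(smile_signature(probe), gallery))
```

Because only differences between frames enter the computation, a constant offset in the probe (here every pixel starts at 5.0) does not affect the signature, which matches the abstract's point that the features depend only on the variations within the smile.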
4

Joint Evaluation Of Multiple Speech Patterns For Speech Recognition And Training

Nair, Nishanth Ulhas 01 1900
Improving speech recognition performance in the presence of noise and interference continues to be a challenging problem. Automatic Speech Recognition (ASR) systems work well when the test and training conditions match. In real-world environments there is often a mismatch between testing and training conditions. Various factors, such as additive noise, acoustic echo and speaker accent, affect speech recognition performance. Since ASR is a statistical pattern recognition problem, if the test patterns are unlike anything used to train the models, errors are bound to occur because of feature vector mismatch. Various approaches to robustness have been proposed in the ASR literature, contributing mainly to two topics: (i) reducing the variability in the feature vectors or (ii) modifying the statistical model parameters to suit the noisy condition. While some of those techniques are quite effective, we would like to examine robustness from a different perspective. Consider the analogy of human communication over telephones: it is quite common to ask the person speaking to us to repeat certain portions of their speech because we did not understand them. This happens more often in the presence of background noise, where the intelligibility of speech is affected significantly. Although the exact way humans decode multiple repetitions of speech is not known, it is quite possible that we use the combined knowledge of the multiple utterances to decode the unclear part of the speech. The majority of ASR algorithms do not address this issue, except in very specific cases such as pronunciation modeling. We recognize that under very high noise conditions or bursty error channels, such as packet communication where packets get dropped, it would be beneficial to use repeated utterances for robust ASR.
In this thesis, we formulate a set of algorithms for joint evaluation/decoding to recognize noisy test utterances, and use the same formulation for selective training of Hidden Markov Models (HMMs), again for robust performance. We first address joint recognition of multiple speech patterns given that they belong to the same class. We formulate this problem considering the patterns as isolated words. If there are K test patterns (K ≥ 2) of a word by a speaker, we show that it is possible to improve speech recognition accuracy over independent single-pattern evaluation of test speech, for both clean and noisy speech. We also find the state sequence that best represents the K patterns. This formulation can also be extended to connected word recognition or continuous speech recognition. Next, we consider the benefits of joint multi-pattern likelihood for HMM training. In the usual HMM training, all the training data is used to arrive at the best possible parametric model. However, the training data may not all be genuine and may therefore contain labeling errors, noise corruption or plain outlier exemplars. Such outliers result in poorer models and affect speech recognition performance. It is therefore important to train selectively, so that the outliers get a lower weight. Giving a lower weight to an entire outlier pattern has been addressed before in the speech recognition literature. However, it is possible that only some portions of a training pattern are corrupted, so only the corrupted portions of speech, and not the entire pattern, should be given a lower weight during HMM training. Since HMM training uses multiple patterns of speech from each class, we show that joint evaluation methods can be used to train HMMs selectively, such that only the corrupted portions of speech are given a lower weight and not the entire speech pattern.
Thus, we have addressed all three main tasks of an HMM so as to jointly exploit the availability of multiple patterns belonging to the same class. We evaluated the new algorithms on Isolated Word Recognition for both clean and noisy speech. A significant improvement in speech recognition performance is obtained, especially for speech affected by transient/burst noise.
5

Learning from biometric distances: Performance and security related issues in face recognition systems

Mohanty, Pranab 01 June 2007
We present a theory for constructing linear, black-box approximations to face recognition algorithms and empirically demonstrate that a surprisingly diverse set of face recognition approaches can be approximated well using a linear model. Constructing the linear model for a face recognition algorithm involves embedding a training set of face images, constrained by the distances between them as computed by the face recognition algorithm being approximated. We accomplish this embedding by iterative majorization, initialized by classical multi-dimensional scaling (MDS). We empirically demonstrate the adequacy of the linear model using six face recognition algorithms, spanning both template-based and feature-based approaches, on standard face recognition benchmarks such as the Facial Recognition Technology (FERET) and Face Recognition Grand Challenge (FRGC) data sets. The experimental results show that the average Error in Modeling for the six algorithms is 6.3% at 0.001 False Acceptance Rate (FAR) for the FERET fafb probe set, which contains the largest number of subjects among all the probe sets. We demonstrate the usefulness of the linear model for algorithm-dependent indexing of face databases and find that it results in a more than 20-fold reduction in face comparisons for the Bayesian Intra/Extra-class person classifier (BAY), the Elastic Bunch Graph Matching algorithm (EBGM), and the commercial face recognition algorithms. We also propose a novel paradigm to reconstruct face templates from match scores using the linear model, and use the reconstructed templates to explore the security breach in a face recognition system. We evaluate the proposed template reconstruction scheme using three fundamentally different face recognition algorithms: Principal Component Analysis (PCA), the Bayesian Intra/Extra-class person classifier (BAY), and a feature-based commercial algorithm.
With an operational point set at 1% False Acceptance Rate (FAR) and 99% True Acceptance Rate (TAR) for 1196 enrollments (FERET gallery), we show that at most 600 attempts (score computations) are required to achieve a 73%, 72% and 100% chance of breaking in as a randomly chosen target subject for the commercial, BAY and PCA-based face recognition systems, respectively. We also show that the proposed reconstruction scheme has a 47% higher probability of breaking in as a randomly chosen target subject for the commercial system than a hill-climbing approach with the same number of attempts.
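The embedding step mentioned in this abstract starts from classical MDS, which recovers coordinates from a matrix of pairwise distances. The sketch below shows only that textbook initialization in pure Python; it does not include the iterative majorization refinement or anything specific to the thesis. The function names and the use of power iteration with deflation to obtain the leading eigenvectors are choices made for this example.

```python
import math


def _matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def _top_eigenpair(m, iterations=500):
    """Dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    n = len(m)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = _matvec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, v
        v = [x / norm for x in w]
    value = sum(a * b for a, b in zip(v, _matvec(m, v)))  # Rayleigh quotient
    return value, v


def classical_mds(dist, dims=2):
    """Embed n items in `dims` dimensions from a symmetric distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Double centering: B = -1/2 * J D^2 J, the Gram matrix of centred points.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        value, vec = _top_eigenpair(b)
        scale = math.sqrt(max(value, 0.0))  # clamp: non-Euclidean input
        for i in range(n):
            coords[i][k] = scale * vec[i]
        # Deflate so the next pass finds the next eigenpair.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords


if __name__ == "__main__":
    # Distances of a 3-4-5 right triangle; the embedding should reproduce them.
    dist = [[0.0, 3.0, 4.0], [3.0, 0.0, 5.0], [4.0, 5.0, 0.0]]
    for point in classical_mds(dist, dims=2):
        print(point)
```

In the setting of the abstract, `dist` would hold the distances that the face recognition algorithm assigns to pairs of training images. The recovered coordinates are unique only up to rotation, reflection and translation.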
