81

Investigation of colour constancy using blind signal separation and physics-based image modelling

Badawi, Waleed Kamal Mohammed January 2011 (has links)
Colour is an important property in image and video processing; it is used for the segmentation, classification, and recognition of objects. The observed colour of a surface, as captured by an imaging sensor, can be affected by factors such as specular reflection, illumination variation and shadows, which can lead to erroneous colour identification. This creates a need for techniques that can extract an illumination-invariant descriptor of the surface reflectance of an object. Such techniques would enable the development of image and video processing systems that identify the actual colour of an object independently of illumination variations, thus achieving what is referred to as colour constancy. This research aims to investigate the effectiveness of a framework for colour constancy that integrates blind signal separation with a physical model of image formation. The particular model considered in this study is the dichromatic reflection model, which has been used in colour constancy approaches developed by other researchers. However, most of these approaches use mixed image components (i.e. composed of specular and diffuse components) to estimate the illumination and thereby achieve colour constancy. In addition, most of these approaches require segmenting the image into regions corresponding to the different colours of multi-coloured surfaces in areas of high specular reflection (highlights), and correct segmentation of multi-coloured surfaces is difficult to achieve. This thesis proposes an alternative approach embodied in a framework that integrates blind signal separation with the dichromatic model of image formation. Unlike conventional approaches, by using blind signal separation the illumination can be estimated more accurately from the explicitly separated specular image component, and colour constancy is achieved by utilising only the explicitly separated diffuse image component.
In addition, by using blind signal separation, the multi-coloured surface segmentation problem can be avoided. The research questions addressed by this research are "how should blind signal separation be integrated with the dichromatic model?" and "how does the proposed framework perform in the context of achieving colour constancy?" A novel colour constancy framework is developed in this thesis, and experimental findings about its performance are reported. Unlike existing work, the proposed framework includes a new method to estimate the illumination spectral power distribution (ISPD) using an explicitly extracted specular component of images. Furthermore, the proposed framework includes a new method for estimating the surface spectral reflectance using an explicitly extracted diffuse component, instead of the mixed image components used by other researchers. The framework consists of three stages: the separation of image components, the estimation of the ISPD, and the estimation of surface spectral reflectance. The methodology used to evaluate the performance of the framework involves the development of algorithms, their implementation in software, and their assessment using well-designed experiments anchored on quantitative performance measurement methods. The goodness-of-fit coefficient (GFC) is used to evaluate the performance of the framework by measuring the degree of similarity between the estimated spectral distribution and a known reference; values of the GFC range between 0 and 1, with a higher value representing a higher degree of similarity. Using an image data set generated by the author, the estimated ISPD has average GFC values of 0.9830 and 0.9215, relative to the manufacturer's specifications, for two light sources with colour temperatures of 5500 K and 2900 K, respectively.
The average GFC of the estimated ISPD improves significantly, by 2.9%, when the explicit specular image component is used instead of mixed image components. Furthermore, using Foster et al.'s image data set (a set of hyperspectral images of natural scenes collected by Foster, Nascimento, and Amano), the ISPD is estimated from the mixed image components for other light sources with different colour temperatures; the estimated ISPD has an average GFC value of 0.9986 compared to the measured illumination. Using the data set collected by the author of this thesis, the surface spectral reflectance is estimated at individual pixels of an object illuminated by two alternative light sources with colour temperatures of 5500 K and 2900 K. A comparative assessment shows that the spectral reflectance estimated for each given surface has almost the same spectral signature under the two light sources. The comparison between the surface spectral reflectance estimates corresponding to the two light sources gives an average GFC value ranging from 0.9611 to 0.9887, depending on the blind separation technique used (i.e. the spatially constrained FastICA technique or the technique developed by Umeyama and Godin). Given that the surface spectral reflectance is the output of the last stage of the framework, which depends on the output of the previous two stages, the GFC measured for the surface spectral reflectance reflects the performance of the whole framework. The high GFC values mean that the estimates of surface reflectance under the two light sources are very similar, despite the differences between the two illuminants. This similarity implies that the extracted surface reflectance is largely independent of illumination characteristics, showing that the proposed framework achieves a significant degree of colour constancy.
Moreover, the observed results show a statistically significant improvement of 2.6% in the average GFC of the estimated surface spectral reflectance when the explicitly extracted diffuse image component is used instead of the mixed image components. Compared to the surface spectral reflectance measurements included in Foster et al.'s image data set, the surface spectral reflectance estimated using the mixed image components has an average GFC value of 0.9608.
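The goodness-of-fit coefficient used throughout the evaluation above has a standard definition: the absolute cosine similarity between the estimated and reference spectra sampled at the same wavelengths. A minimal sketch (the spectra below are illustrative values, not data from the thesis):

```python
def gfc(reference, estimate):
    """Goodness-of-fit coefficient between two spectral distributions.

    GFC = |sum_i r_i * e_i| / (sqrt(sum_i r_i^2) * sqrt(sum_i e_i^2)),
    i.e. the absolute cosine similarity of the two spectra, assumed to be
    sampled at the same wavelengths. Ranges from 0 to 1; a value of 1
    means the spectra have identical shape (up to scale).
    """
    num = abs(sum(r * e for r, e in zip(reference, estimate)))
    den = (sum(r * r for r in reference) ** 0.5) * \
          (sum(e * e for e in estimate) ** 0.5)
    return num / den

# A scaled copy of the reference spectrum scores a perfect GFC of 1,
# since GFC is insensitive to overall intensity.
measured = [0.2, 0.5, 0.9, 0.7, 0.3]
scaled   = [0.4, 1.0, 1.8, 1.4, 0.6]
print(round(gfc(measured, scaled), 4))  # -> 1.0
```

Because the GFC ignores overall scale, it measures agreement in spectral shape, which is the relevant quantity when comparing an estimated ISPD or reflectance to a reference.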
82

Analyse des expressions faciales dans un flux vidéo / Analysis of facial expression in a video stream

Allaert, Benjamin 08 June 2018 (has links)
Facial expression recognition has attracted great interest over the past decade in wide application areas such as human behaviour analysis, e-health and marketing. In this thesis we explore new approaches, aimed at weakly constrained acquisition systems, as a step towards in-the-wild expression recognition. Special attention is paid to encoding both small and large facial expression amplitudes, and to analysing facial expressions in the presence of varying head pose. The first challenge addressed concerns varying facial expression amplitudes. We propose an innovative motion descriptor called LMP, which relies on the mechanical deformation properties of facial skin to retain only the principal directions of the facial motion induced by an expression, dealing with the inconsistencies and noise caused by facial characteristics. The main originality of our approach is that both micro- and macro-expressions are characterised with the same analysis framework. The second challenge addressed concerns large head pose variations. In facial expression analysis, a registration step is often employed to obtain invariance to geometric transformations, but such methods are commonly used without knowing their impact on the facial expressions themselves, as they transform the facial texture. It is therefore valuable to estimate the impact of alignment-induced noise on overall recognition performance. For this, we propose an innovative acquisition system called SNaP-2DFe, which captures a face simultaneously in a fixed plane and in a plane attached to the head. This provides knowledge of the face to be reconstructed despite the occlusions induced by head rotations, and allows the impact of head pose and registration on expression recognition to be studied. We show that recent face registration methods are not perfectly suited to preserving the features that encode facial expression deformations.
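The abstract does not give LMP's formulation, but the idea of retaining only the principal directions of facial motion can be sketched as an orientation-histogram filter over local motion vectors. Everything below (bin count, magnitude threshold, function name) is an illustrative assumption, not the author's descriptor:

```python
import math
from collections import Counter

def dominant_directions(flow_vectors, bins=8, keep=2, min_mag=0.5):
    """Keep only motion vectors whose orientation falls in the `keep`
    most populated orientation bins -- a crude stand-in for retaining
    the principal motion directions induced by an expression, while
    discarding noise and inconsistent local motion.
    """
    binned = []
    for dx, dy in flow_vectors:
        if math.hypot(dx, dy) < min_mag:      # drop noise-level motion
            continue
        angle = math.atan2(dy, dx) % (2 * math.pi)
        binned.append((int(angle / (2 * math.pi) * bins) % bins, (dx, dy)))
    top = {b for b, _ in Counter(b for b, _ in binned).most_common(keep)}
    return [v for b, v in binned if b in top]

# Mostly-upward motion plus one inconsistent vector and one noise-level
# vector: only the coherent upward motion survives the filter.
vectors = [(0.0, 1.0)] * 5 + [(1.0, 0.0), (0.1, 0.1)]
print(dominant_directions(vectors, keep=1))  # -> [(0.0, 1.0)] * 5
```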
83

Development and implementation of real time image analysis algorithms

Johnstone, Adrian Ivor Clive January 1989 (has links)
This work concerns the development and implementation of real-time image processing algorithms. Such systems may be applied to industrial inspection problems, which typically require basic operations to be performed on 256 x 256 pixel images in 20 to 100 ms using systems costing less than about £20,000. Building such systems is difficult because conventional processors executing at around 1 MIPS with conventional algorithms are some two orders of magnitude too slow. One solution is to use a closely coupled array processor, such as the DAP or CLIP4, designed especially for image processing. However, such a space-parallel architecture imposes its own structure on the problem, and this restricts the class of algorithms which may be efficiently executed to those exhibiting similar space parallelism, i.e. so-called 'parallel algorithms'. This thesis examines an alternative approach which uses a mix of conventional processors and high-speed hardware processors. A special frame store has been built for the acquisition and display of images stored in memory on a multiprocessor backplane. Also described are an interface to a host mini-computer, a bus interface to the system, and its use with some hardwired and microcoded processors. This system is compared to a single computer operating with a frame store optimised for image processing. The basic software and hardware system described in this thesis has been used in a factory environment for food-product inspection.
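The two-orders-of-magnitude claim can be checked with back-of-the-envelope arithmetic. The instruction count per pixel below is an illustrative assumption (a basic point operation on a 1980s processor), not a figure from the thesis:

```python
pixels = 256 * 256                 # image size from the inspection spec
frame_time = 0.020                 # 20 ms budget (the tighter end of 20-100 ms)
instructions_per_pixel = 10        # assumed cost of one basic operation

required_mips = pixels * instructions_per_pixel / frame_time / 1e6
conventional_mips = 1.0            # conventional processor of the era

print(f"required: {required_mips:.1f} MIPS")   # -> required: 32.8 MIPS
print(f"shortfall: {required_mips / conventional_mips:.0f}x")
```

With 10 instructions per pixel the shortfall is already over 30x; at a 100 ms budget with more complex per-pixel operations it reaches the two orders of magnitude the thesis cites.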
84

Investigations into trainable picture processing systems

Mayer, Martin January 1982 (has links)
This work concerns the development of a new type of picture processing system for images represented as digital arrays of pixels. This is a synthesis of two established ideas, already under independent investigation. The first of these is picture processing by look-up tables. This is a fast method of generating pixel outputs as a result of input pixels accessing a particular region of a look-up table, pre-loaded with the required data. The second idea is the use of RAMs as learning machines. Here, RAM elements are connected together so as to be alterable in data content by training stimuli in a coherent manner. This results in a system able to exhibit definite responses to later test stimuli, and thus identify these stimuli unambiguously. The methods used for bringing these two concepts together are described here. A practicable picture processor results, which can be trained by examples. That is, it can perform a picture transformation simply by presenting to the machine (in a prior training phase) examples of the process. From this, the machine deduces the information necessary to be able to perform the same transformation on unseen patterns. Experiments have been performed on a wide range of variations on this theme. Different types of machines acting on different data and tasks have been tried, under various conditions. A description is given of these machine variations, together with a generalised system for describing such variations more formally. The machines were simulated in practice on a microcomputer system; the simulation software used in these investigations is also described. Finally, the implications and limitations of such machines are discussed with reference to their ultimate performance and possible applications in fields other than picture processing.
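The train-by-example look-up-table idea can be sketched in a few lines: each 3 x 3 pixel neighbourhood addresses a table entry that is written during training and read back when processing unseen images. This is a toy software sketch, not the original RAM-based hardware:

```python
def neighbourhoods(img):
    """Yield ((y, x), 3x3 neighbourhood tuple) for every interior pixel."""
    h, w = len(img), len(img[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            yield (y, x), tuple(img[y + dy][x + dx]
                                for dy in (-1, 0, 1) for dx in (-1, 0, 1))

def train_lut(table, src, dst):
    """Record the desired output pixel for every neighbourhood seen
    in an example (input image, transformed image) pair."""
    for (y, x), nb in neighbourhoods(src):
        table[nb] = dst[y][x]

def apply_lut(table, src):
    """Transform an unseen image; unknown neighbourhoods pass through."""
    out = [row[:] for row in src]
    for (y, x), nb in neighbourhoods(src):
        out[y][x] = table.get(nb, src[y][x])
    return out

# Teach 'remove an isolated bright pixel' from one example, then reuse it.
src = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
dst = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
table = {}
train_lut(table, src, dst)
print(apply_lut(table, src))  # -> [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

A dictionary stands in for the pre-loaded RAM: with binary pixels a 3 x 3 neighbourhood has only 512 possible addresses, which is what makes direct table look-up fast.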
85

Strategies for intelligent interaction management and usability of biometric systems

Wu, Qianqian January 2016 (has links)
Fingerprint biometric systems are among the most popular biometric systems in current use. They take a standard measure of a person's fingerprint and compare it against the measure from an original stored template, which has been pre-acquired and associated with the known personal identity claimed by the user. Generally, a fingerprint biometric system consists of three stages: a data acquisition stage, a feature extraction stage and a matching stage. This study explores some essential limitations of an automatic fingerprint biometric system relating to the effects of capturing poor-quality fingerprint images, and investigates the interrelationship between the quality of a fingerprint image and the other primary components of the system, such as the feature extraction operation and the matching process. In order to improve the overall performance of an automatic fingerprint biometric system, the study investigates possible ways to overcome these limitations. With the purpose of acquiring fingerprint images of acceptable quality, three components are added to the traditional fingerprint recognition system in our proposed system: a fingerprint image enhancement algorithm, a fingerprint image quality evaluation algorithm and a feedback unit, whose purpose is to provide analytical information collected at the image capture stage to the system user. In this thesis, all relevant information is introduced, experimental results obtained with the proposed algorithms are shown, and comparative studies with other existing algorithms are presented.
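The quality-evaluation-plus-feedback loop can be sketched as follows. The quality measure and threshold here are placeholders (a simple contrast statistic), not the algorithms developed in the thesis:

```python
def quality_score(img):
    """Placeholder quality measure: mean absolute horizontal gradient.
    A flat, featureless capture scores 0; a high-contrast, ridge-like
    capture scores higher."""
    total = count = 0
    for row in img:
        for a, b in zip(row, row[1:]):
            total += abs(a - b)
            count += 1
    return total / count

def capture_until_ok(captures, threshold=0.5):
    """Re-capture, giving the user feedback, until quality is acceptable.
    `captures` stands in for successive frames from the sensor."""
    for attempt, img in enumerate(captures, start=1):
        score = quality_score(img)
        if score >= threshold:
            return attempt, score
        print(f"attempt {attempt}: quality {score:.2f} too low, "
              "please re-place finger")
    raise RuntimeError("no acceptable capture obtained")

flat   = [[5, 5, 5, 5]] * 3   # blank capture: rejected, user prompted
ridged = [[0, 9, 0, 9]] * 3   # contrasty capture: accepted
print(capture_until_ok([flat, ridged]))  # -> (2, 9.0)
```

The point of the feedback unit is exactly this loop: rather than passing a poor image downstream to feature extraction and matching, the system tells the user why the capture failed and asks for another.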
86

Statistical computing on manifolds for 3D face analysis and recognition / Calcul statistique sur des variétés de formes 3D pour la reconnaissance de visages 3D

Drira, Hassen 04 July 2011 (has links)
Automatic face recognition has many benefits over other biometric technologies due to the natural, non-intrusive, and high-throughput nature of face data acquisition. Thus, techniques for face recognition have received growing attention within the computer vision community over the past three decades. In terms of a modality for face imaging, a major advantage of 3D scans over 2D colour imaging is that variations in illumination and scaling have less influence on 3D scans. However, scan data often suffer from missing parts due to self-occlusions or imperfections in scanning technologies. Additionally, variations in face data due to facial expressions are challenging for 3D face recognition. In order to be useful in real-world applications, 3D face recognition approaches should be able to recognise face scans successfully even in the presence of large expression-based deformations and of missing data due to occlusions and pose variation. Most recent research has been directed towards expression-invariant techniques, with less effort spent on handling the missing-parts problem. The few approaches that handle missing parts simulate them; none has been evaluated on a full database containing real missing data. We present a common framework handling both large expressions and missing parts due to large pose variation. In addition, within the same framework, we are able to compute average surfaces and hierarchically organise 3D face databases to allow efficient searches. In the presence of occlusion, we propose to delete and then restore the occluded parts. The facial surface is first represented by radial curves emanating from the nose tip of the 3D face. A basis is then built using PCA for each curve, so that the missing part of a curve can be restored by projecting its observed part onto this basis; PCA is applied in the tangent space of the mean curve, as it is a linear space. Once the occlusion has been detected and removed, the occlusion challenge can be handled as a missing-data problem: we apply the restoration framework and then our radial-curve-based 3D face recognition algorithm. Throughout, we focus on the fundamental task of 3D face recognition, provide a comparative analysis of several approaches, and offer original solutions for each of the problems analysed.
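The projection-based restoration of a partially occluded curve can be sketched in the simplest case of a single basis component: fit the coefficient on the observed samples by least squares, then read the missing samples off the reconstruction. A one-component sketch of the idea (real PCA bases have several components and live in the tangent space of the mean curve):

```python
def restore_missing(curve, basis, observed):
    """Restore missing samples of a curve from one PCA-like basis vector.

    The coefficient c minimising the squared error on the observed
    samples is c = <curve, basis>_obs / <basis, basis>_obs; missing
    samples are then reconstructed as c * basis[i].
    """
    num = sum(curve[i] * basis[i] for i in observed)
    den = sum(basis[i] ** 2 for i in observed)
    c = num / den
    return [curve[i] if i in observed else c * basis[i]
            for i in range(len(curve))]

# A radial curve proportional to the basis, with its last sample occluded.
basis    = [1.0, 2.0, 3.0, 4.0]
observed = {0, 1, 2}
curve    = [0.5, 1.0, 1.5, None]      # true missing value: 2.0
print(restore_missing(curve, basis, observed))  # -> [0.5, 1.0, 1.5, 2.0]
```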
87

3D Facial Expressions Recognition Using Shape Analysis and Machine Learning / Reconnaissance d'expressions faciales 3D basée sur l'analyse de forme et l'apprentissage automatique

Maalej, Ahmed 23 May 2012 (has links)
Facial expression recognition is a challenging task which has received growing interest within the research community, impacting important applications in fields related to human-machine interaction (HMI). Towards building human-like, emotionally intelligent HMI devices, scientists are trying to include the essence of human emotional state in such systems. The recent development of 3D acquisition sensors has made 3D data more available, and this kind of data alleviates problems inherent in 2D data such as illumination, pose and scale variations as well as low resolution. Several 3D facial databases are publicly available to researchers in the field of face and facial expression recognition for validating and evaluating their approaches. This thesis deals with the facial expression recognition (FER) problem and proposes an approach based on shape analysis to handle both static (single image) and dynamic (video sequence) FER tasks. Our approach includes the following steps: first, a curve-based representation of the 3D face model is proposed to describe facial features. Then, once these curves are extracted, their shape information is quantified using a Riemannian framework. We end up with similarity scores between the different local facial shapes, constituting a feature vector associated with each facial surface. These features are then used as input parameters to machine learning and classification algorithms to recognise expressions. Exhaustive experiments are conducted to validate our approach, and results are presented and compared to the achievements of related work.
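The pipeline described — per-curve similarity scores assembled into a feature vector, then fed to a classifier — can be sketched with a plain Euclidean curve distance standing in for the Riemannian shape metric, and 1-nearest-neighbour standing in for the learning algorithms (both are illustrative simplifications):

```python
def curve_distance(c1, c2):
    """Stand-in for the Riemannian shape distance between two curves."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5

def feature_vector(face_curves, reference_curves):
    """One similarity score per facial curve against a reference face."""
    return [curve_distance(c, r) for c, r in zip(face_curves, reference_curves)]

def classify(features, training_set):
    """1-nearest-neighbour over labelled feature vectors."""
    def dist(f, g):
        return sum((a - b) ** 2 for a, b in zip(f, g))
    return min(training_set, key=lambda item: dist(features, item[0]))[1]

# Two curves per face; the 'smile' face deforms both curves away from
# the neutral reference, producing the feature vector [2.0, 1.0].
reference = [[0, 0, 0], [0, 0, 0]]
smile     = [[0, 2, 0], [0, 1, 0]]
training  = [([2.0, 1.0], "happy"), ([0.1, 0.1], "neutral")]
print(classify(feature_vector(smile, reference), training))  # -> happy
```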
88

Pedestrians counting and event detection in crowded environment

Shbib, Reda January 2015 (has links)
Crowd density estimation and pedestrian counting are becoming areas of interest, for example in assessing the social effect of, and interaction between, small groups of people within a crowd. Existing crowd analyses performed by human operators are time consuming: controllers are generally engaged to achieve this task, but with the huge number of cameras being installed it is a hard task to watch and study all recorded video, so automatic visual surveillance is increasingly an essential need. Consequently, the image-processing field has attracted academic and industrial research towards developing automatic counting and monitoring algorithms. In this thesis, novel contributions in several areas are presented: pedestrian counting, event detection, and queue monitoring. Firstly, this thesis presents an original contribution in the pedestrian counting domain. In recent years, many proposed counting techniques have used global features to estimate crowd density. In this thesis, a new approach is introduced in which global image features are replaced by low-level features that are specific to individuals and clusters within the crowd; the total number of pedestrians is then the sum over all clusters that constitute the crowd. Experimental results on different datasets show that low-level features perform better than global features. In addition to pedestrian counting, this thesis presents a contribution in the area of pedestrian flow monitoring through the development of a virtual door algorithm, in which pedestrians are counted as they pass through a proposed virtual count line. Discriminant feature points are detected in the region of interest, and the optical flow of these points is assembled; the proposed system assembles the optical flow of a discrete group of extracted feature points along the trajectory direction.
Finally, this thesis presents a novel technique for estimating queue parameters, such as the numbers of people entering and leaving and the frequency of arrivals, in order to obtain a clear picture of queue traffic and flow. To obtain these parameters, the proposed pedestrian counting and virtual door approaches have been integrated. Experimental results demonstrate that the proposed system is robust in real-life environments.
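The virtual-door counting idea can be sketched as crossing detection on feature-point trajectories. This is a simplified stand-in for the optical-flow assembly described above; trajectories here are just per-frame positions of tracked points:

```python
def count_crossings(trajectories, line_y):
    """Count trajectories crossing a horizontal virtual count line,
    separately per direction. Each trajectory is the sequence of y
    positions of one tracked feature point over time.
    """
    entering = leaving = 0
    for traj in trajectories:
        for y0, y1 in zip(traj, traj[1:]):
            if y0 < line_y <= y1:        # moved downward across the line
                entering += 1
            elif y1 < line_y <= y0:      # moved upward across the line
                leaving += 1
    return entering, leaving

tracks = [
    [1, 3, 6, 9],    # walks downward across the line at y = 5
    [9, 7, 4, 2],    # walks upward across it
    [1, 2, 3, 4],    # never crosses
]
print(count_crossings(tracks, line_y=5))  # -> (1, 1)
```

In a full system each "trajectory" would be one assembled group of feature points per pedestrian, so a person crossing the line is counted once rather than once per tracked point.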
89

Biometric person identification using near-infrared hand-dorsa vein images

Li, Kefeng January 2013 (has links)
Biometric recognition is becoming more and more important with the increasing demand for security, and more usable with improvements in computer vision and pattern recognition technologies. Hand vein patterns have been recognised as a good biometric measure for personal identification due to many excellent characteristics, such as uniqueness and stability, as well as the difficulty of copying or forging them. This thesis covers all the research and development aspects of a biometric person identification system based on near-infrared hand-dorsa vein images. Firstly, the design and realisation of an optimised vein image capture device is presented; in order to maximise the quality of the captured images at relatively low cost, infrared illumination and imaging theory are discussed. A database containing 2040 images from 102 individuals, captured by this device, is then introduced. Secondly, image analysis and the customised image pre-processing methods are discussed. The consistency of the database images is evaluated using mean squared error (MSE) and peak signal-to-noise ratio (PSNR). Geometrical pre-processing, including shearing correction and region of interest (ROI) extraction, is introduced to improve image consistency. Image noise is evaluated using total variance (TV) values. Grey-level pre-processing, including grey-level normalisation, filtering and adaptive histogram equalisation, is applied to enhance vein patterns. Thirdly, a gradient-based image segmentation algorithm is compared with popular algorithms such as the Niblack and Threshold Image algorithms to demonstrate its effectiveness in vein pattern extraction. Post-processing methods including morphological filtering and thinning are also presented. Fourthly, feature extraction and recognition methods are investigated, with several new approaches proposed based on keypoints and local binary patterns (LBP).
Through comprehensive comparison with other approaches based on structure and texture features, as well as performance evaluation using the database of 2040 images, the proposed approach based on multi-scale partition LBP is shown to provide the best recognition performance, with an identification rate of nearly 99%. Finally, the whole hand-dorsa vein identification system is presented, with a user interface for administration of user information and for person identification.
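The basic LBP operator underlying the proposed descriptor works as follows (standard 8-neighbour LBP; the multi-scale partitioning itself is the thesis's contribution and is not reproduced here):

```python
def lbp_code(img, y, x):
    """8-bit local binary pattern of the pixel at (y, x): each of the 8
    neighbours contributes a 1 if it is >= the centre pixel, taken in a
    fixed order around the centre."""
    centre = img[y][x]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if img[y + dy][x + dx] >= centre:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over interior pixels -- the kind
    of texture feature compared between vein images (here over the whole
    image; the thesis partitions the image at multiple scales first)."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    return hist

img = [[10, 20, 10],
       [20, 15, 20],
       [10, 20, 10]]
print(lbp_code(img, 1, 1))  # neighbours >= 15 set bits 1, 3, 5, 7 -> 170
```

LBP codes are insensitive to monotonic grey-level changes, which suits near-infrared vein images whose absolute brightness varies between captures.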
90

New method for mathematical modelling of human visual speech

Sadaghiani, Mohammad Hossein January 2015 (has links)
Audio-visual speech recognition and visual speech synthesisers are used as interfaces between humans and machines. Such interactions rely specifically on the analysis and synthesis of both audio and visual information, which humans use for face-to-face communication. Currently, there is no global standard to describe these interactions, nor is there a standard mathematical tool to describe lip movements. Furthermore, the visual lip movement for each phoneme is considered in isolation rather than as a continuation from one to another. Consequently, there is no globally accepted standard method for representing lip movement during articulation. This thesis addresses these issues by transcribing a group of words using mathematical formulas, thereby introducing the concept of a visual word, allocating signatures to visual words and finally building a visual speech vocabulary database. In addition, visual speech information has been analysed in a novel way by considering both lip movements and the phonemic structure of the English language. In order to extract the visual data, three visual features on the lip have been chosen: on the outer upper lip, the outer lower lip and the corner of the lip. The visual data extracted during articulation are called the visual speech sample sets. The final visual data are obtained after processing the visual speech sample sets to correct experimental artefacts, such as head tilting, which occurred during articulation and visual data extraction. Barycentric Lagrange Interpolation (BLI) formulates the visual speech sample sets into visual speech signals. The visual word defined in this work consists of the variation of the three visual features. Further processing relating the visual speech signals to the uttered word leads to the allocation of signatures that represent the visual word. This work suggests the visual word signature can be used as a 'visual word barcode', a 'digital visual word' or a '2D/3D representation'.
The 2D version of the visual word provides a unique signature that allows identification of the word being uttered. Identification of visual words has also been performed using a technique called 'volumetric representations of the visual words'. Furthermore, the effects of altering the amplitudes and the sampling rate for BLI have been evaluated, and the performance of BLI in reconstructing the visual speech sample sets has been considered. Finally, BLI has been compared to a signal reconstruction approach using RMSE and correlation coefficients; the results show that BLI is the more reliable method for the purpose of this work, as discussed in Section 7.7.
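Barycentric Lagrange interpolation, used above to turn the visual speech sample sets into continuous signals, can be sketched with the standard second-form barycentric formula (the nodes and values below are illustrative, not lip-feature data from the thesis):

```python
def bli(nodes, values, t):
    """Evaluate the barycentric Lagrange interpolant at t.

    Second (true) barycentric form:
        p(t) = sum_j (w_j / (t - x_j)) f_j  /  sum_j (w_j / (t - x_j)),
    with weights w_j = 1 / prod_{k != j} (x_j - x_k).
    """
    # Precompute the barycentric weights from the nodes.
    w = []
    for j, xj in enumerate(nodes):
        prod = 1.0
        for k, xk in enumerate(nodes):
            if k != j:
                prod *= (xj - xk)
        w.append(1.0 / prod)

    num = den = 0.0
    for xj, wj, fj in zip(nodes, w, values):
        if t == xj:                  # exactly on a node: return its sample
            return fj
        num += wj / (t - xj) * fj
        den += wj / (t - xj)
    return num / den

# Three samples of a quadratic feature trajectory are reproduced exactly,
# since the interpolant through n+1 points is exact for degree-n data.
nodes  = [0.0, 1.0, 2.0]
values = [x * x for x in nodes]      # f(x) = x^2
print(bli(nodes, values, 1.5))       # -> 2.25
```

The barycentric form is preferred over the naive Lagrange formula because the weights are computed once per node set and each evaluation is then O(n) and numerically stable.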
