  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Intelligent emotion recognition from facial and whole-body expressions using adaptive ensemble models

Zhang, Yang January 2015
Automatic emotion recognition has been widely studied and applied to various computer vision tasks (e.g. health monitoring, driver state surveillance, personalized learning, and security monitoring). With the great potential provided by current advanced 3D scanner technology (e.g. the Kinect), we shed light on robust emotion recognition based on users’ facial and whole-body expressions. As revealed by recent psychological and behavioral research, facial expressions are good at communicating categorical emotions (e.g. happy, sad, surprise, etc.), while bodily expressions could contribute more to the perception of dimensional emotional states (e.g. the arousal and valence dimensions). Thus, we propose two novel emotion recognition systems that apply adaptive ensemble classification and regression models to the facial and bodily modalities, respectively. The proposed real-time 3D facial Action Unit (AU) intensity estimation and emotion recognition system automatically selects 16 motion-based facial feature sets to estimate the intensities of 16 diagnostic AUs. Then a set of six novel adaptive ensemble classifiers is proposed for robust classification of the six basic emotions and the detection of newly arrived novel emotion classes (emotions that are not included in the training set). In both off-line and on-line real-time evaluation, the system shows the highest recognition accuracy in comparison with other related work, as well as flexibility and good adaptation for newly arrived novel emotion detection (e.g. ‘contempt’, which is not included in the six basic emotions). The second system focuses on continuous and dimensional affect prediction from users’ bodily expressions using adaptive regression. Both static posture and dynamic motion bodily features are extracted and subsequently selected by a Genetic Algorithm to identify their most discriminative combinations for both valence and arousal dimensions.
Then an adaptive ensemble regression model is proposed to robustly map subjects’ emotional states onto a continuous arousal-valence affective space using the identified feature subsets. Experimental results show that the proposed system outperforms other benchmark models and achieves promising performance compared to other state-of-the-art research reported in the literature. Furthermore, we also propose a novel semi-feature level bimodal fusion framework that integrates both facial and bodily information together to draw a more comprehensive and robust dimensional interpretation of subjects’ emotional states. By combining the optimal discriminative bodily features and the derived AU intensities as inputs, the proposed adaptive ensemble regression model achieves remarkable improvements in comparison to solely applying the bodily features.
12

Robust and secure perceptual image hashing in the transform domain

Prungsinchai, Supakorn January 2014
The rapid development of multimedia devices such as computers, network technologies, and cell phones has made it easier for users to create, broadcast, convey, share, store, and distribute multimedia data including images, videos and audio files on a daily basis. However, the availability of image processing software in the public domain has facilitated illegal copying and distribution of digital images with unnoticeable quality changes. Thus, security and identification of media content has become an important and demanding area for research. Perceptual hashing is one of the recent technologies used for multimedia content security. A perceptual image hash function is a hash function that is robust against content-preserving operations (CPOs), such as noise, JPEG lossy compression and rotation. The aim of this research is to study and investigate existing techniques and then contribute to the development of new perceptual image hashing techniques in the transform domain for image identification and copy detection applications. The design requirements for any perceptual image hashing system are robustness, discriminative capability (uniqueness), and unpredictability (security). The feature extraction stage plays a key role in ensuring the system output is robust and discriminative. This thesis mainly focuses on the robust feature extraction stage and the analysis of the proposed system's security. The following contributions have been made: A new perceptual hashing technique using pseudo-random sub-images in the discrete wavelet transform (DWT) domain for extracting features has been developed. The idea employs a recent dimension reduction technique, referred to as non-negative matrix factorization (NMF) in the literature, for enhancing the robustness and security of the hash. This approach is proposed to select the most stable coefficients under various content-preserving operations in a compact form.
The robust image hashes are generated by applying the DWT and NMF to the image. The proposed sub-images-DWT technique has been shown to yield good performance under image processing operations, but it still suffers from geometric attacks. A new rotation-invariant FMT-based hashing technique incorporating the Fourier-Mellin transform and using overlapping blocks to improve the robustness against rotation attacks has also been proposed. The robust FMT-based image hashing is proposed to improve performance under rotation and translation attacks and to achieve better overall robustness. The invariance of the FMT to rotation, scaling and translation makes it more suitable for image hashing. Based on our experimental results, it has been shown that the proposed FMT-based image hashing technique is robust to a large class of image processing operations and geometric attacks. A new robust and secure DCT overlapping block-based hashing technique incorporating the discrete cosine transform (DCT) to combat image processing attacks has been investigated. An improved DCT sign-based hashing technique robust against image processing attacks as well as small geometric manipulations has been developed. From the experimental results, it was observed that the low frequency coefficients for DCT sign-based image hashing were robust to a large class of content-preserving operations (CPOs). The main idea was to exploit the energy compaction property of the DCT and its ability to carry information of edges and texture in DCT sign values. Finally, the security of the proposed image hashing systems is discussed and analysed in the light of the corresponding design requirement.
The DCT sign-based image hashing scheme has also been shown to be the most secure technique compared to other techniques proposed in this research as it offers the highest rate of bit independence in a hash.
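The sign-of-DCT idea in this abstract can be illustrated with a minimal sketch. It is not the thesis's scheme: the 4x4 low-frequency window, the toy image, the absence of overlapping blocks or secret keys, and all function names are assumptions made for illustration. It only shows why the signs of low-frequency coefficients survive a content-preserving change such as a uniform gain.

```python
import math

def dct2(block):
    """Naive 2D DCT-II (unnormalised) of a square block given as a list of rows."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for x in range(n):
                for y in range(n):
                    s += (block[x][y]
                          * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                          * math.cos(math.pi * (2 * y + 1) * v / (2 * n)))
            out[u][v] = s
    return out

def dct_sign_hash(block, k=4):
    """One bit per low-frequency coefficient in the k x k corner (DC excluded)."""
    coeffs = dct2(block)
    return [1 if coeffs[u][v] >= 0 else 0
            for u in range(k) for v in range(k) if (u, v) != (0, 0)]

def hamming(a, b):
    """Number of differing bits between two equal-length hashes."""
    return sum(x != y for x, y in zip(a, b))

# Hypothetical 8x8 grayscale patch and a copy with doubled contrast.
image = [[(x * 7 + y * 13) % 32 + 8 * (x // 4) for y in range(8)] for x in range(8)]
doubled = [[p * 2.0 for p in row] for row in image]

h_original = dct_sign_hash(image)
h_doubled = dct_sign_hash(doubled)
distance = hamming(h_original, h_doubled)
```

Because the DCT is linear, a positive gain scales every coefficient without changing its sign, so the two hashes agree. Real attacks such as JPEG compression or rotation perturb the coefficients less predictably, which is where the choice of coefficients and block layout studied in the thesis matters.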
13

Signature-based videos' visual similarity detection and measurement

Bekhet, Saddam January 2016
The quantity of digital videos is huge, due to technological advances in video capture, storage and compression. However, the usefulness of these enormous volumes is limited by the effectiveness of content-based video retrieval (CBVR) systems, which still require time-consuming annotating/tagging to feed the text-based search. Visual similarity is the core of these CBVR systems, where videos are matched based on their respective visual features and their evolution across video frames. It also acts as an essential foundational layer to infer semantic similarity at an advanced stage, in collaboration with metadata. Furthermore, handling such amounts of video data, especially in the compressed domain, poses certain challenges for CBVR systems: speed, scalability and genericness. The situation is even more challenging with the availability of non-pixelated features due to compression, e.g. DC/AC coefficients and motion vectors, that require sophisticated processing. Thus, careful feature selection is important to realize visual similarity based matching within the boundaries of the aforementioned challenges. Matching speed is crucial, because most of the current research is biased towards accuracy and leaves speed lagging behind, which in many cases affects practical use. Scalability is the key to benefiting from the enormous amounts of available video. Genericness is essential to develop systems that are applicable to both compressed and uncompressed videos. This thesis presents a signature-based framework for efficient visual similarity based video matching. The proposed framework represents a vital component for search and retrieval systems, where it could be used in three different ways: (1) Directly for CBVR systems where a user submits a query video and the system retrieves a ranked list of visually similar ones. (2) For text-based video retrieval systems, e.g.
YouTube, when a user submits a textual description and the system retrieves a ranked list of relevant videos. The retrieval in this case works by finding videos that were manually assigned similar textual descriptions (annotations). For this scenario, the framework could be used to enhance the annotation process. This is achievable by suggesting an annotation set for newly uploaded videos. These annotations are derived from other visually similar videos that can be retrieved by the proposed framework. In this way, the framework could make annotations more relevant to video contents (compared to the manual way), which improves the overall CBVR systems’ performance as well. (3) The top-N matched list obtained by the framework could be used as an input to higher layers, e.g. semantic analysis, where it is easier to perform complex processing on this limited set of videos. The proposed framework addresses the aforementioned problems, i.e. speed, scalability and genericness, by encoding a given video shot into a single compact fixed-length signature. This signature is able to robustly encode the shot contents for later speedy matching and retrieval tasks. This is in contrast with the current research trend of using exhaustive, complex features/descriptors, e.g. dense trajectories. Moreover, towards a higher matching speed, the framework operates over a sequence of tiny images (DC-images) rather than full size frames. This limits the need to fully decompress compressed videos, as the DC-images are extracted directly from the compressed stream. The DC-image is highly useful for complex processing, due to its small size compared to the full size frame. In addition, it could be generated from uncompressed videos as well, while the proposed framework is still applicable in the same manner (genericness aspect).
Furthermore, to capture visual similarity robustly, scene and motion information are extracted independently, to better address their different characteristics. Scene information is captured using a statistical representation of scene key colours’ profiles, while motion information is captured using a graph-based structure. Then, scene and motion information are fused together to generate an overall video signature. The signature’s compact fixed-length aspect contributes to scalability. This is because compact fixed-length signatures are highly indexable entities, which facilitate the retrieval process over large-scale video data. The proposed framework is adaptive and provides two different fixed-length video signatures. Both work in a speedy and accurate manner, but with different degrees of matching speed and retrieval accuracy. Such granularity of the signatures is useful to accommodate different applications’ trade-offs between speed and accuracy. The proposed framework was extensively evaluated using black-box tests for the overall fused signatures and white-box tests for its individual components. The evaluation was done on multiple challenging large-size datasets against a diverse set of state-of-the-art baselines. The results, supported by the quantitative evaluation, demonstrated the promise of the proposed framework to support real-time applications.
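The abstract notes that a DC-image can also be generated from uncompressed video. A minimal sketch of that case follows, assuming 8x8 blocks (the block size used by JPEG/MPEG-style codecs, where the DC coefficient of a block is proportional to its mean intensity) and a grayscale frame stored as a list of rows. The function name and the toy frame are hypothetical and are not taken from the thesis.

```python
def dc_image(frame, block=8):
    """Downsample a grayscale frame by replacing each block x block tile with its mean.

    Rows and columns that do not fill a whole tile are dropped.
    """
    height, width = len(frame), len(frame[0])
    thumb = []
    for top in range(0, height - height % block, block):
        row = []
        for left in range(0, width - width % block, block):
            total = sum(frame[y][x]
                        for y in range(top, top + block)
                        for x in range(left, left + block))
            row.append(total / (block * block))
        thumb.append(row)
    return thumb

# Hypothetical 16x32 frame whose 8x8 tiles are each a constant intensity.
frame = [[(x // 8) * 10 + (y // 8) for x in range(32)] for y in range(16)]
thumb = dc_image(frame)
# thumb is a 2x4 image: [[0.0, 10.0, 20.0, 30.0], [1.0, 11.0, 21.0, 31.0]]
```

The result has 1/64 of the pixels of the original frame, which is the size reduction the abstract relies on for fast matching.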
14

Investigation of colour constancy using blind signal separation and physics-based image modelling

Badawi, Waleed Kamal Mohammed January 2011
Colour is an important property in image and video processing; it is used for the segmentation, classification, and recognition of objects. The observed colour of a surface, as captured by an imaging sensor, can be affected by factors such as specular reflection, illumination variation and shadows, which can lead to erroneous colour identification. This creates a need for techniques that are able to extract an illumination-invariant descriptor of the surface reflectance of an object; such techniques would enable the development of image and video processing systems which are able to identify the actual colour of an object independent of illumination variations, thus achieving what is referred to as colour constancy. This research aims to investigate the effectiveness of applying blind signal separation integrated with a physical model of image formation into a framework for achieving colour constancy. The particular model considered in this study is the dichromatic reflection model. This model has been used in approaches to colour constancy developed by other researchers. However, most of these approaches use mixed image components (i.e. composed of specular and diffuse components) in order to estimate illumination and consequently achieve colour constancy. In addition, most of these approaches require the segmentation of the image into regions which correspond to different colours on the multi-coloured surfaces, in high specular reflection (highlight) areas of the image. Correct segmentation of multi-coloured surfaces is difficult to achieve. This thesis proposes an alternative approach embodied in a framework which integrates blind signal separation and the dichromatic model of image formation. Unlike the conventional approaches, by using blind signal separation, the illumination can be estimated more accurately using the explicitly separated specular image component and colour constancy is achieved by utilising the explicitly separated diffuse image component only.
In addition, by using the blind signal separation the multi-coloured surfaces segmentation problem can be avoided. The research questions addressed by this research are “how should blind signal separation be integrated with the dichromatic model?” and “how does the proposed framework perform in the context of achieving colour constancy?” A novel colour constancy framework is developed in this thesis, and experimental findings about the performance of the framework are reported. Unlike the existing work, the proposed framework includes a new method to estimate the illumination spectral power distribution (ISPD) by using an explicitly extracted specular component of images. Furthermore, the proposed framework includes a new method for estimating the surface spectral reflectance using an explicitly extracted diffuse component, instead of mixed image components which are used by other researchers. The framework consists of three stages which are: the separation of image components, the ISPD estimation and the estimation of surface spectral reflectance. The methodology exploited to evaluate the performance of the framework involves the development of algorithms, their implementation in software, and their assessment using well-designed experiments anchored on quantitative performance measurement methods. The goodness-of-fit coefficient (GFC) is used to evaluate the performance of the framework, by measuring the degree of similarity between the estimated spectral distribution and a known reference. Values of GFC range between 0 and 1; a higher value representing a higher degree of similarity. Using an image data set generated by the author, compared to the manufacturer’s specifications, the estimated ISPD has an average GFC value equal to 0.9830 and 0.9215 for two light sources with colour temperature of 5500 K and 2900 K, respectively. 
The average GFC of the estimated ISPD improves significantly by 2.9% when the explicit specular image component is used instead of mixed image components. Furthermore, using Foster et al.’s image data set (a set of hyperspectral images of natural scenes which was collected by Foster, Nascimento, and Amano), the ISPD is estimated using the mixed image components for other light sources with different colour temperatures. The results show that the estimated ISPD has an average value of the GFC equal to 0.9986 compared to the measured illumination. Using the data set collected by the author of this thesis, the surface spectral reflectance is estimated at individual pixels of an object illuminated by two alternative light sources with colour temperatures of 5500 K and 2900 K. A comparative assessment shows that the spectral reflectance, estimated for each given surface, has almost the same spectral signature for the two light sources. The comparison between the surface spectral reflectance estimates corresponding to the two light sources gives an average GFC value which ranges from 0.9611 to 0.9887, depending on the type of the blind separation technique that is used (i.e. the spatially constrained FastICA technique and the technique developed by Umeyama and Godin). Given that the surface spectral reflectance is the output of the last stage of the framework, which depends on the output of the previous two stages, the GFC measured for surface spectral reflectance reflects the performance of the whole framework. The high GFC values mean that the estimates of surface reflectance under the two light sources are very similar, despite the differences between the two illuminants. This similarity implies that the extracted surface reflectance is significantly independent of illumination characteristics, hence showing that the proposed framework achieved a significant degree of colour constancy.
Moreover, the observed results show a statistically significant improvement in the accuracy of the estimated surface spectral reflectance by 2.6% in terms of average GFC value when the explicitly extracted diffuse image component is used instead of the mixed image components. Compared to the surface spectral reflectance measurements included in Foster et al.’s image data set, the surface spectral reflectance estimated using the mixed image components has an average GFC value equal to 0.9608.
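The abstract describes the goodness-of-fit coefficient (GFC) only as a similarity score between 0 and 1. A minimal sketch follows, assuming the definition commonly used in spectral estimation work: the absolute value of the normalised inner product of the two sampled spectra. The function name and the sample spectra are hypothetical, and the thesis may sample wavelengths or normalise differently.

```python
import math

def gfc(estimated, reference):
    """Goodness-of-fit coefficient between two spectra sampled at the same wavelengths."""
    dot = abs(sum(e * r for e, r in zip(estimated, reference)))
    norm_e = math.sqrt(sum(e * e for e in estimated))
    norm_r = math.sqrt(sum(r * r for r in reference))
    return dot / (norm_e * norm_r)

# Hypothetical reflectance samples; the second estimate is a scaled copy of the reference.
reference = [0.10, 0.25, 0.40, 0.35, 0.20]
scaled_estimate = [2.0 * r for r in reference]
noisy_estimate = [0.12, 0.22, 0.43, 0.33, 0.21]

same_shape = gfc(scaled_estimate, reference)   # shape matches exactly, so the score is 1 up to rounding
close_shape = gfc(noisy_estimate, reference)   # slightly below 1
orthogonal = gfc([1.0, 0.0], [0.0, 1.0])       # no overlap, so the score is 0
```

Under this definition the score depends only on spectral shape and not on overall scale, which is consistent with the abstract's use of GFC to compare reflectance estimates obtained under two different illuminants.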
15

Development of a secure biometric recognition system

Muthu, Rajesh January 2016
Biometric-based security systems are becoming an integral part of many security agencies and organisations. These systems have a number of applications ranging from national security, law enforcement and the identification of people, particularly for building access control, to the identification of suspects by the police, driver’s licences and many other spheres. However, the main challenge is to ensure the integrity of digital content under different intentional and non-intentional distortions, along with the robustness and security of the digital content. This thesis focuses on improving the security of fingerprint templates to allow accurate comparison of the fingerprint content. The current methods to generate fingerprint templates for comparison purposes mostly rely on using a single feature extraction technique such as Scale Invariant Feature Transform (SIFT) or Fingerprint Minutiae. However, the combination of two feature extraction techniques (e.g., SIFT-Minutiae) has not been studied in the literature. This research therefore combines existing feature extraction techniques in two ways. SIFT-Harris: feature point detection is critical to robust feature extraction in image hashing, so SIFT incorporates the Harris criterion to select the most robust feature points. SIFT-Wavelet: wavelet-based techniques are used to improve the security and reliability of the image, so SIFT features are combined with efficient wavelet-based salient points to generate a robust SIFT-Wavelet feature that provides sufficient invariance to common image manipulations. These feature detectors are known to work well on natural images (e.g., faces, buildings or shapes); this research tests them in the new context of fingerprint images. The results in this thesis demonstrate that the new approach contributes towards the improvement of fingerprint template security and accurate fingerprint comparisons.
The fingerprint minutiae extraction method is combined individually with the SIFT-Harris method, the SIFT-Wavelet method and the SIFT method, to generate the most prominent fingerprint features. These features are post-processed into perceptual hashes using Radial Shape Context Hashing (RSCH) and Angular Shape Context Hashing (ASCH) methods. The accuracy of fingerprint comparison in each case is evaluated using Receiver Operating Characteristic (ROC) curves. The experimental results demonstrate that for JPEG lossy compression and geometric attacks, including rotation and translation, the fingerprint template and the accuracy of fingerprint matching improve when combinations of two different feature extraction techniques are used, in contrast to using only a single feature extraction technique. The ROC plots illustrate that SIFT-Harris-Minutiae, SIFT-Wavelet-Minutiae and SIFT-Minutiae perform better than the SIFT method. The ROC plots further demonstrate that SIFT-Harris-Minutiae outperforms all the other techniques. Therefore, the SIFT-Harris-Minutiae technique is more suitable for generating a template to compare the fingerprint content. Furthermore, this research focuses on perceptual hashing to improve the minutiae extraction of fingerprint images, even if the fingerprint image has been distorted. The extraction of the hash is performed after wavelet transform and singular value decomposition (SVD). The performance evaluation of this approach includes important metrics, such as the Structural Similarity Index Measure (SSIM) and the Peak Signal-to-Noise Ratio (PSNR). Experiments confirm its robustness against image processing operations and geometric attacks.
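Of the two quality metrics named above, PSNR is simple enough to sketch directly. The example below uses the standard definition, 10 log10(peak^2 / MSE), and assumes 8-bit grayscale images stored as lists of rows. The function name and the toy images are hypothetical and say nothing about the data used in the thesis.

```python
import math

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    squared_error = 0.0
    count = 0
    for ref_row, dist_row in zip(reference, distorted):
        for a, b in zip(ref_row, dist_row):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)

# Hypothetical 4x4 image and a copy in which every pixel is off by one grey level.
original = [[100 + 4 * r + c for c in range(4)] for r in range(4)]
shifted = [[p + 1 for p in row] for row in original]

value = psnr(original, shifted)      # MSE is 1, so this is 20*log10(255), about 48.13 dB
identical = psnr(original, original)
```

Higher values indicate that the distorted image is closer to the reference. SSIM is more involved because it compares local means, variances and covariances, and is not sketched here.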
16

Intelligent facial expression recognition with unsupervised facial point detection and evolutionary feature optimization

Mistry, Kamlesh January 2016
Facial expression is one of the effective channels to convey emotions and feelings. Many shape-based, appearance-based or hybrid methods for automatic facial expression recognition have been proposed. However, it is still a challenging task to identify emotions from facial images with scaling differences, pose variations, and occlusions. In addition, it is also difficult to identify significant discriminating facial features that could represent the characteristic of each expression because of the subtlety and variability of facial expressions. In order to deal with the above challenges, this research proposes two novel approaches: unsupervised facial point detection and texture-based facial expression recognition with feature optimisation. First of all, unsupervised automatic facial point detection integrated with regression-based intensity estimation for facial Action Units (AUs) and emotion clustering is proposed to deal with challenges such as scaling differences, pose variations, and occlusions. The proposed facial point detector can detect 54 facial points in images of faces with occlusions, pose variations and scaling differences. We conduct AU intensity estimation respectively using support vector regression and neural networks for 18 selected AUs. FCM is also subsequently employed to recognise seven basic emotions as well as neutral expressions. It also shows great potential to deal with compound and newly arrived novel emotion class detection. The second proposed system focuses on a texture-based approach for facial expression recognition by proposing a novel variant of the local binary pattern for discriminative feature extraction and Particle Swarm Optimization (PSO)-based feature optimisation. Multiple classifiers are applied for recognising seven facial expressions. Finally, evaluations are conducted to show the efficiency of the above two proposed systems. 
Evaluated using well-known facial databases (Helen, Labeled Faces in the Wild, PUT, and CK+), the proposed unsupervised facial point detector outperforms other supervised landmark detection models dramatically and shows excellent robustness and capability in dealing with rotations, occlusions and illumination changes. Moreover, a comprehensive evaluation is also conducted for the proposed texture-based facial expression recognition with mGA-embedded PSO feature optimisation. Evaluated using the CK+ and MMI benchmark databases, the experimental results indicate that it outperforms other state-of-the-art metaheuristic search methods and facial emotion recognition research reported in the literature by a significant margin.
17

Detection of online phishing email using dynamic evolving neural network based on reinforcement learning

Smadi, Sami January 2017
Phishing has been the most frequent cybercrime activity over the last 15 years and has caused billions of dollars to be stolen. This happens because phishing attackers always use new (zero-day) and sophisticated techniques to deceive online customers. The most common way to initiate a phishing attack is by using email. In this thesis, a novel framework is proposed that combines a neural network with reinforcement learning for detecting online phishing attacks. This thesis addresses the effectiveness of phishing email detection, and it makes the following contributions. Firstly, a novel pre-processing system has been designed to gather and extract the features and patterns of phishing email. To cover all behaviour that phishers use to deceive online customers, fifty features were selected. Pre-processing is divided into three layers, based on the main types of email content. Secondly, a novel algorithm has been proposed for the exploration of new phishing behaviour. The proposed algorithm has the ability to select the effective list of features from the list of features extracted in the pre-processing phase. Thirdly, this thesis proposes a novel Dynamic Evolving Neural Network using Reinforcement Learning (DENNuRL) algorithm, which can be used to generate the best neural network for a classification problem based on the idea of reinforcement learning. Finally, a novel framework for a Phishing Email Detection System (PEDS) has been proposed. The PEDS has the ability to adapt itself to generate a new PEDS that reflects changes in behaviour. Therefore, reinforcement learning is adopted in the proposed framework, combined with a neural network, to enhance the system dynamically over time in the online mode. The proposed technique can effectively handle zero-day phishing attacks. The proposed phishing email detection model was trained, tested and validated in the online mode using an approved dataset.
The promising results showed that the DENNuRL can provide an effective means of phishing detection. The proposed model successfully classified and identified approximately 98.6% of phishing emails selected from the test dataset, with a low false positive rate of 1.8%. A comparison with other similar techniques using the same dataset shows that the proposed technique outperforms the existing methods.
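To make the two reported figures concrete, the sketch below shows how a detection rate and a false positive rate are derived from confusion-matrix counts. The counts are hypothetical, chosen only so that the rates match the percentages quoted in the abstract; the actual size and class balance of the test set are not given there.

```python
def detection_rates(tp, fn, fp, tn):
    """Return (true positive rate, false positive rate) from confusion-matrix counts.

    tp: phishing emails flagged as phishing    fn: phishing emails missed
    fp: legitimate emails flagged as phishing  tn: legitimate emails passed
    """
    true_positive_rate = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    return true_positive_rate, false_positive_rate

# Hypothetical test set: 1000 phishing and 1000 legitimate emails.
tpr, fpr = detection_rates(tp=986, fn=14, fp=18, tn=982)
# tpr is 0.986 (98.6% of phishing emails caught)
# fpr is 0.018 (1.8% of legitimate emails wrongly flagged)
```

The two rates are computed over different populations (phishing emails versus legitimate emails), so they do not need to sum to any fixed value.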
18

Finger knuckle print and palmprint for efficient person recognition

El-Tarhouni, Wafa January 2017
Biometric person recognition systems are increasingly being used to enhance the security of physical and logical security systems. Palmprint and finger knuckle print recognition have gained attention in research and practical domains, providing a means of identification for security system access and personal recognition and presenting an interesting and challenging research problem. The overall aim of this work is to investigate biometric systems able to recognise people using their palmprints and finger knuckle prints. The work investigates the theoretical concepts behind palmprint and finger knuckle print recognition and proposes new algorithms to extract features for recognition systems able to identify a person from a test sample with a strong degree of confidence. The research has led to five contributions. The first contribution is concerned with the development of an ensemble learning framework using a variant of local binary patterns constructed from Pascal's coefficients of order n, termed Pascal's coefficient multiscale local binary pattern. In addition, a feature extraction technique which combines pyramid histograms of oriented gradients and Pascal's coefficient local binary patterns by concatenating the features for use in classification is also proposed. Secondly, a fusion approach is proposed by combining local binary pattern histograms of Fourier features with a Gabor filter technique to generate a single feature representation to improve palmprint recognition. The third contribution is related to a novel feature extraction method for use in palmprint and finger knuckle print recognition. The multi-shift local binary pattern approach extends the original shift local binary pattern concept to a multi-scale dimension to obtain more robust and discriminating feature representations by extracting histograms and concatenating them into a single feature vector.
The fourth contribution proposes a novel Fibonacci sequence local binary pattern descriptor and a multi-scale Fibonacci sequence local binary pattern descriptor by carefully modifying the operator thresholding scheme at the pixel values. To achieve this, Fibonacci numbers have been used to generate a distribution of binary codes at every pixel position in order to create descriptors that are more robust against lighting variations of images. Finally, a new feature set is developed for finger knuckle print recognition. This is inspired by the completed local binary pattern (CLBP) and is termed the dynamic threshold CLBP; it employs only the sign and magnitude components. The novelty lies in encoding the magnitude features using a dynamic thresholding technique and concatenating the sign and magnitude features.
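All of the descriptors in this abstract are variants of the local binary pattern (LBP). As background, here is a minimal sketch of the basic 3x3 LBP operator and its histogram. It is the textbook baseline only, not the Pascal's coefficient, multi-shift, Fibonacci or dynamic threshold variants proposed in the thesis; the neighbour ordering, the function names and the toy patches are assumptions.

```python
# Clockwise neighbour offsets (row, column), starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(img, r, c):
    """8-bit code: bit i is set when neighbour i is at least as bright as the centre."""
    centre = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist

flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
peak = [[1, 1, 1], [1, 9, 1], [1, 1, 1]]
corner = [[9, 1, 1], [1, 5, 1], [1, 1, 1]]

flat_code = lbp_code(flat, 1, 1)      # every neighbour >= centre: 255
peak_code = lbp_code(peak, 1, 1)      # no neighbour >= centre: 0
corner_code = lbp_code(corner, 1, 1)  # only the top-left neighbour: 1

texture = [[(3 * r + 5 * c) % 7 for c in range(6)] for r in range(5)]
hist = lbp_histogram(texture)         # 3 x 4 interior pixels contribute one code each
```

Because each bit records only whether a neighbour is brighter than the centre, the code is unchanged by any monotonic change in illumination, which is the property the proposed variants aim to strengthen.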
19

Methods for the efficient deployment and coordination of swarm robotic systems

Eliot, Neil January 2017
Swarming has been observed in many animal species, including fish, birds, insects and mammals. These biological observations have inspired mathematical models of distributed coordination that have been applied to the development of multi-agent robotic systems, such as collections of unmanned autonomous vehicles (UAVs). The advantages of a swarming approach to distributed coordination are clear: each agent acts according to a simple set of rules that can be implemented on resource-constrained devices, and so it becomes feasible to replicate agents in order to build more resilient systems. However, there remain significant challenges in making the approach practicable. This thesis addresses two of the most significant: coordination and scalability. New coordination algorithms are proposed here, all of which manage the problem of scalability by requiring only local proximity sensing between agents, without the need for any other communications infrastructure. A major source of inefficiency in the deployment of a swarm is ‘oscillation’: small movements of agents that arise as a side effect of the application of their rules but which are not strictly necessary in order to satisfy the overall system function. The thesis introduces a new metric for ‘oscillation’ that allows it to be identified and measured in swarm control algorithms. A new perimeter detection mechanism is introduced and applied to the coordination of goal-based swarms. The mechanism is used to improve the internal coordination of agents whilst maintaining a directional focus to the swarm; this is then analysed using the new metric. A mechanism is proposed to allow a swarm to exhibit a ‘healing’ behaviour by identifying internal perimeter edges (doughnuts) and then altering the movement of agents, based upon a simple criterion, to remove the holes; this also has the emergent effect of smoothing the outer edges of a swarm and creating a more uniform swarm structure. 
Area coverage is an important requirement in many swarm applications. Two new, efficient area-filling techniques are introduced here and exit conditions are identified to determine when a swarm has filled an area. In summary, the thesis makes significant contributions to the analysis and design of efficient control algorithms for the coordination of large-scale swarms.
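To make the idea of perimeter detection from local proximity sensing concrete, the following is a minimal sketch of one common heuristic: an agent treats itself as being on a perimeter when the largest angular gap between its sensed neighbours reaches a threshold. This is not the mechanism proposed in the thesis, whose details the abstract does not give; the function names, the 2D point representation, `sensing_range` and `gap_threshold` are all assumptions made for illustration.

```python
import math

def neighbours(agents, i, sensing_range):
    """Indices of agents within sensing_range of agent i (local proximity only)."""
    xi, yi = agents[i]
    return [j for j, (xj, yj) in enumerate(agents)
            if j != i and math.hypot(xj - xi, yj - yi) <= sensing_range]

def largest_angular_gap(agents, i, sensing_range):
    """Largest angle (radians) between consecutive neighbour bearings around agent i."""
    xi, yi = agents[i]
    angles = sorted(math.atan2(agents[j][1] - yi, agents[j][0] - xi)
                    for j in neighbours(agents, i, sensing_range))
    if not angles:
        return 2 * math.pi  # an isolated agent sees an empty full circle
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(angles[0] + 2 * math.pi - angles[-1])  # wrap-around gap
    return max(gaps)

def on_perimeter(agents, i, sensing_range, gap_threshold=math.pi):
    """Hypothetical rule: a wide empty sector suggests agent i is on an edge."""
    return largest_angular_gap(agents, i, sensing_range) >= gap_threshold

# Example: a 3x3 grid of agents with unit spacing (assumed test layout).
grid = [(x, y) for y in range(3) for x in range(3)]
corner_is_edge = on_perimeter(grid, 0, 1.5)   # agent at (0, 0)
centre_is_edge = on_perimeter(grid, 4, 1.5)   # agent at (1, 1)
```

Under these assumptions the same test would also flag agents bordering an internal hole, since they too see a wide empty sector; separating outer from inner edges would need additional logic that the abstract does not describe.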
20

Person recognition using gait energy imaging

Lishani, Ait January 2018 (has links)
Biometric technology has emerged as a viable identification and authentication solution with various systems in operation worldwide. The technology uses various modalities, including fingerprint, face, iris, palmprint, speech, and gait. Biometric recognition often involves images or videos and other image impressions that are fragile and include subtle details that are difficult to see or capture. Thus, there is a need for developing imaging applications that allow for accurate feature extraction from images for identification and recognition purposes. Biometric modalities can be classified into two classes: physiological (e.g. fingerprint, iris, face, palmprint) and behavioural (e.g. speech, gait). This work investigates biometric recognition at a distance, and the gait modality has been chosen for several reasons. Gait data can be captured at a distance and is non-invasive. Additionally, a person’s gait is hard to copy, and an imitator trying to do so is likely to appear more suspicious. However, covariates such as a change in viewing angle, clothes, shoes, shadow or elapsed time can make gait recognition more challenging. Approaches to gait recognition are commonly divided into model-based and model-free. This thesis is based on a model-free approach and proposes a supervised feature extraction approach capable of selecting distinctive features for the recognition of human gait under clothing and carrying conditions. In this work, to allow for the characterisation of human gait properties for individual recognition, a spatiotemporal gait representation technique called Gait Energy Image (GEI) has been used. This approach is aimed at improving the recognition performance based on the principles of feature texture descriptors extracted from GEI. 
Furthermore, as part of this work, the dynamic parts of the gait energy representation have been proposed as a means to extract more discriminative information from a gait sequence using reduction techniques in order to further improve the human identification rate. The four methods proposed were evaluated using the CASIA Gait Database (dataset B) and the USF Database under variations of clothing and carrying conditions for different viewing angles. The first method is based on Haralick texture features and uses the RELIEF selection algorithm. This method showed that a judicious deployment of horizontal GEI features outperforms similar methods by up to 7.00%. In addition, this method achieved an improved classification rate of up to 80.00% from a side view of 90°. The second and third contributions are concerned with an investigation of the Gabor filter bank and Multi-scale Local Binary Pattern (MLBP) as efficient feature extraction techniques for gait recognition under clothing distortions. To achieve this, various dimension reduction techniques including Kernel Principal Component Analysis, Maximum Margin Projection, Spectral Regression Kernel Discriminant Analysis and Locality Preserving Projections were investigated. The results showed that the proposed methods outperform their state-of-the-art counterparts by achieving up to 93.00% Identification Rate (IR) at rank-1 using the Gabor filter method, and up to 92.00% IR using the MLBP method, when using a k-NN classifier for a side view of 90°. The final contribution of this work is concerned with an investigation of the Haar wavelet transform and its use for extracting powerful features for human gait recognition under clothing distortions. The experimental results using a k-NN classifier yielded an IR of up to 93.00% at rank-1, which compares favourably with existing and similar state-of-the-art methods. It should be noted that all the experiments were carried out using the MATLAB programming environment.
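As an illustration of the representation named in this abstract, a Gait Energy Image is conventionally computed as the pixel-wise mean of aligned, size-normalised binary silhouettes over a gait cycle, and rank-1 identification with a nearest-neighbour classifier assigns the label of the closest gallery sample. The sketch below shows only these two generic steps in Python with NumPy (the thesis itself reports using MATLAB); it omits the texture descriptors, feature selection and dimension reduction stages, and the function names and the Euclidean distance are assumptions made for illustration.

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Pixel-wise mean of aligned binary silhouettes, shape (frames, height, width)."""
    frames = np.asarray(silhouettes, dtype=np.float64)
    if frames.ndim != 3:
        raise ValueError("expected an array of shape (frames, height, width)")
    return frames.mean(axis=0)

def rank1_identify(probe, gallery, gallery_labels):
    """Return the label of the gallery feature vector nearest to the probe (1-NN)."""
    gallery = np.asarray(gallery, dtype=np.float64)
    probe = np.asarray(probe, dtype=np.float64)
    # Euclidean distance is an assumed choice; the abstract does not name a metric.
    distances = np.linalg.norm(gallery - probe, axis=1)
    return gallery_labels[int(np.argmin(distances))]

# Toy example: two 2x2 silhouette frames (assumed data, not from the thesis).
frames = [[[1, 0], [0, 1]],
          [[1, 1], [0, 0]]]
gei = gait_energy_image(frames)

# Flattened GEIs would act as feature vectors in this simplified pipeline.
label = rank1_identify(gei.ravel(),
                       [[1.0, 0.5, 0.0, 0.5], [0.0, 0.0, 1.0, 1.0]],
                       ["subject_a", "subject_b"])
```

Pixels that are foreground in every frame keep a value of 1 in the GEI, while pixels that vary across the cycle take intermediate values; these correspond to the dynamic parts of the gait that the abstract refers to.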
