Global ETD Search

1	Computational analysis of facial expressions Shenoy, A. January 2010 This PhD work constitutes a series of interdisciplinary studies that use biologically plausible computational techniques and experiments with human subjects to analyze facial expressions. The performance of the computational models and of the human subjects is analyzed in terms of accuracy and response time. The computational models process images in three stages: preprocessing, dimensionality reduction, and classification. Preprocessing of the facial expression images includes feature extraction and dimensionality reduction. Gabor filters are used for feature extraction, as they are the closest biologically plausible computational method. Several dimensionality reduction methods are used, namely Principal Component Analysis (PCA), Curvilinear Component Analysis (CCA), and Fisher Linear Discriminant (FLD), followed by classification with Support Vector Machines (SVM) and Linear Discriminant Analysis (LDA). Six basic prototypical facial expressions that are universally accepted are used for the analysis: angry, happy, fear, sad, surprise, and disgust. The performance of the computational models in classifying each expression category is compared with that of the human subjects. The Effect size and Encoding face enable the discrimination of the areas of the face specific to a particular expression. The Effect size in particular emphasizes the areas of the face that are involved in producing an expression. Using Effect size on faces has not been reported previously in the literature and has shown very interesting results. The detailed PCA analysis showed the significant PCA components specific to each of the six basic prototypical expressions.
An important observation from this analysis was that, with Gabor filtering followed by nonlinear CCA for dimensionality reduction, the dataset vector size may be reduced to a very small number; in most cases it was just 5 components. The hypothesis that the average response time (RT) of the human subjects in classifying the different expressions is analogous to the distance of the data points from the classification hyperplane was verified. This means that the harder a facial expression is for human subjects to classify, the closer it lies to the classifier's separating hyperplane. A bivariate correlation analysis of the distance measure and the average RT suggested a significant anti-correlation. The d-prime measure from signal detection theory (SDT) determined how well the model or the human subjects discriminated an expressive face from a neutral one. On comparison, human subjects are better at classifying surprise, disgust, fear, and sad expressions. The RAW computational model is better able to distinguish angry and happy expressions. To summarize, there seem to be some similarities between the computational models and human subjects in the classification process. 005.3
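The two quantitative measures named in this abstract, d-prime and the distance of a data point from the classification hyperplane, can be illustrated with a minimal Python sketch. This is not the thesis's code: the function names, the log-linear correction in `d_prime`, and all numbers in the demo are assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    # Assumption: a log-linear correction (add 0.5 to each count) keeps both
    # rates strictly between 0 and 1, where the inverse normal CDF is defined.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


def hyperplane_distance(w, b, x):
    """Unsigned distance of point x from the hyperplane w . x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = sqrt(sum(wi * wi for wi in w))
    return abs(dot + b) / norm


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical linear classifier and three hypothetical face vectors.
    w, b = (3.0, 4.0), 0.0
    points = [(1.0, 1.0), (0.1, 0.1), (2.0, 2.0)]
    distances = [hyperplane_distance(w, b, p) for p in points]

    # Hypothetical mean response times in milliseconds for the same faces:
    # the face nearest the hyperplane is given the longest response time.
    response_times = [600.0, 900.0, 500.0]

    print("distances:", distances)
    print("correlation with RT:", pearson_r(distances, response_times))
    print("d-prime:", d_prime(45, 5, 5, 45))
```

With the made-up data above, the correlation comes out negative, which is the pattern the abstract describes as an anti-correlation between distance and average RT.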
2	Estudo dimensional de características aplicadas à leitura labial automática [Dimensional study of features applied to automatic lip reading] Madureira, Fillipe Levi Guedes 31 August 2018 Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - CAPES / This work studies the relationship between the intrinsic dimension of feature vectors and classifier performance when classifying video signals for lip reading. In pattern recognition tasks, the extraction of relevant features is crucial for good classifier performance. The starting point of this work was the reproduction of the work of J.R. Movellan [1], which classifies lip gestures with HMMs using only the video signal from the Tulips1 database. The database consists of videos of volunteers' mouths while they utter the first four numerals in English. The original work uses feature vectors of high dimensionality relative to the size of the database. Consequently, fitting the HMM classifiers became problematic and the maximum accuracy was only 66.67%. Alternative feature extraction strategies and classification schemes were proposed in order to analyze the influence of the intrinsic dimension on classifier performance. The best solution, in terms of results, achieved an accuracy of approximately 83%. / São Cristóvão, SE Electrical engineering Deaf people Oral communication Pattern recognition systems Intrinsic dimension Feature extraction Lip reading Hidden Markov Model (HMM) ENGENHARIAS::ENGENHARIA ELETRICA
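Classifying an utterance with HMMs, as described in the second record, is commonly done by fitting one model per word and picking the model that assigns the observed sequence the highest likelihood. The Python sketch below shows that decision rule with the forward algorithm on a toy discrete-emission HMM. It is an illustration only: the two-state models, the two-symbol alphabet (0 = mouth closed, 1 = mouth open), and every probability are hypothetical, and the record does not say which features or emission model were actually used.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete-emission HMM via the forward algorithm."""
    n_states = len(start)
    # alpha[s] = probability of the observations so far, ending in state s.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    return sum(alpha)


def classify(obs, models):
    """Return the label of the model with the highest likelihood for obs."""
    return max(models, key=lambda label: forward_likelihood(obs, *models[label]))


# Hypothetical left-to-right models: each is (start, transition, emission).
LEFT_TO_RIGHT = [[0.5, 0.5], [0.0, 1.0]]
MODELS = {
    # "one": mouth mostly closed at first, then mostly open.
    "one": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.9, 0.1], [0.1, 0.9]]),
    # "two": mouth mostly open at first, then mostly closed.
    "two": ([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.9], [0.9, 0.1]]),
}

if __name__ == "__main__":
    print(classify([0, 0, 1, 1], MODELS))
    print(classify([1, 1, 0, 0], MODELS))
```

For sequences much longer than this toy case, the products in `forward_likelihood` underflow, so practical implementations rescale `alpha` at each step or work in log space.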
