Facial-based Analysis Tools: Engagement Measurements and Forensics Applications

Bonomi, Mattia 27 July 2020 (has links)
Recent technological advancements have made it easy to acquire and share multi-dimensional multimedia content, e.g. videos, which in many cases depict human faces. From such videos, valuable information describing the intrinsic characteristics of the recorded user can be retrieved: the features extracted from the facial patch are relevant descriptors that allow for the measurement of the subject's emotional status or the identification of synthetic characters. One of the emerging challenges is the development of contactless approaches based on face analysis that measure the emotional status of the subject without placing sensors that limit or bias the experience. This is of particular interest in the context of Quality of Experience (QoE) measurement, i.e. the measurement of the user's emotional status when exposed to multimedia content, since it allows for retrieving the overall acceptability of the content as perceived by the end user. Measuring the impact of given content on the user has many implications from both the content producer's and the end user's perspectives. For this reason, we pursue the QoE assessment of a user watching multimedia stimuli, i.e. 3D movies, through the analysis of facial features acquired by contactless approaches. More specifically, the user's Heart Rate (HR) is retrieved by applying computer vision techniques to a facial recording of the subject and is then analysed to compute the level of engagement. We show that the proposed framework is effective for long video sequences, being robust to facial movements and illumination changes. We validate it on a dataset of 64 sequences in which users observe 3D movies selected to induce variations in their emotional status.
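The contactless HR measurement described above rests on remote photoplethysmography (rPPG): the cardiac pulse causes subtle periodic colour changes in facial skin, so the dominant frequency of the facial colour signal reveals the heart rate. A minimal sketch of the frequency-domain step, assuming the per-frame mean green-channel intensity of the facial patch has already been extracted (the function name and synthetic signal are illustrative, not the thesis's actual pipeline):

```python
import numpy as np

def estimate_heart_rate(green_means, fps, lo_bpm=40.0, hi_bpm=180.0):
    """Estimate heart rate (BPM) from the mean green-channel intensity of a
    facial region over time, rPPG-style: FFT peak in the plausible pulse band."""
    x = np.asarray(green_means, dtype=float)
    x = x - x.mean()                          # remove the DC component
    spectrum = np.abs(np.fft.rfft(x))         # magnitude spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    # keep only frequencies inside the plausible human pulse band
    band = (freqs >= lo_bpm / 60.0) & (freqs <= hi_bpm / 60.0)
    peak = freqs[band][np.argmax(spectrum[band])]
    return peak * 60.0                        # Hz -> beats per minute

# Synthetic check: a 72 BPM pulse sampled at 30 fps for 10 seconds
fps = 30.0
t = np.arange(0, 10, 1.0 / fps)
rng = np.random.default_rng(0)
signal = 0.5 * np.sin(2 * np.pi * (72.0 / 60.0) * t) + rng.normal(0, 0.05, t.size)
bpm = estimate_heart_rate(signal, fps)
```

With 300 samples at 30 fps the frequency resolution is 0.1 Hz, i.e. 6 BPM, which is why robustness over long sequences matters: longer windows sharpen the spectral peak.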
On the one hand, understanding the interaction between the user's perception of the content and their cognitive-emotional state offers many opportunities to content producers, who may influence people's emotional status according to needs driven by political, social, or business interests. On the other hand, the end user must be aware of the authenticity of the content being watched: advancements in computer rendering have enabled the spread of fake subjects in videos. Because of this, as a second challenge we target the identification of CG characters in videos through two different approaches. We first exploit the idea that fake characters do not present any pulse rate signal, whereas a human's pulse is expressed as a sinusoidal signal. Applying computer vision techniques to a facial video allows for the contactless estimation of the subject's HR, thus leading to the identification of signals that lack strong sinusoidality and therefore represent virtual humans. The proposed pipeline allows for fully automated discrimination and is validated on a dataset of 104 videos. Secondly, we make use of facial spatio-temporal texture dynamics that reveal the artefacts introduced by computer rendering techniques when creating a manipulation, e.g. face swapping, on videos depicting human faces. To do so, we consider multiple temporal video segments on which we estimate multi-dimensional (spatial and temporal) texture features, computed as Local Derivative Patterns on Three Orthogonal Planes (LDP-TOP); a binary decision based on the joint analysis of these features strengthens the classification accuracy. Experimental analyses on state-of-the-art datasets of manipulated videos show the discriminative power of such descriptors in separating real and manipulated sequences and in identifying the creation method used.
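The "three orthogonal planes" idea behind LDP-TOP can be illustrated with the simpler classic LBP-TOP descriptor (plain Local Binary Patterns rather than the Local Derivative Patterns the thesis uses): binary texture codes are computed on the XY (spatial), XT and YT (temporal) planes of a video volume, and their histograms are concatenated. A hedged sketch using one central slice per plane:

```python
import numpy as np

def lbp_hist(img):
    """Normalised 256-bin histogram of basic 8-neighbour LBP codes for a 2D array."""
    c = img[1:-1, 1:-1]
    code = np.zeros_like(c, dtype=np.uint8)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(shifts):
        # neighbour plane shifted by (dy, dx) relative to the centre pixels
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.uint8) << bit)
    hist, _ = np.histogram(code, bins=256, range=(0, 256))
    return hist / hist.sum()

def lbp_top(volume):
    """Concatenate LBP histograms from the three orthogonal planes of a
    (time, height, width) video volume: one 768-dim texture descriptor."""
    t, h, w = volume.shape
    xy = lbp_hist(volume[t // 2])            # spatial plane
    xt = lbp_hist(volume[:, h // 2, :])      # horizontal-temporal plane
    yt = lbp_hist(volume[:, :, w // 2])      # vertical-temporal plane
    return np.concatenate([xy, xt, yt])

vol = np.random.default_rng(0).random((16, 32, 32))   # stand-in facial volume
desc = lbp_top(vol)
```

A full implementation would pool codes over all slices of each plane; the point here is that temporal planes capture rendering artefacts that a purely spatial descriptor misses.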
The main finding of this thesis is the relevance of facial features in describing intrinsic characteristics of humans. These can be used to retrieve significant information such as the physiological response to multimedia stimuli or the authenticity of the human being itself. The application of the proposed approaches to benchmark datasets also returned good results, demonstrating real advancements in this research field. In addition, these methods can be extended to different practical applications, from autonomous driving safety checks to the identification of spoofing attacks, and from medical check-ups during sports to the measurement of users' engagement when watching advertising. We therefore encourage further investigation in this direction, in order to improve the robustness of the methods and allow their application to increasingly challenging scenarios.

Developing a computer system for the generation of unique wrinkle maps for human faces : generating 2D wrinkle maps using various image processing techniques and the design of 3D facial ageing system using 3D modelling tools

Mehdi, Ali January 2011 (has links)
Facial Ageing (FA) is a fundamental issue, as ageing is part of our daily life. FA is used in security, in finding missing children, and in other applications; it is also a form of Facial Recognition (FR) that helps identify suspects. FA affects several parts of the human face under the influence of different biological and environmental factors. One of the major facial changes that occur as a result of ageing is the appearance and development of wrinkles. Facial wrinkles are skin folds; their shapes and numbers differ from one person to another, so these characteristics can be exploited if a system is implemented to extract the facial wrinkles in the form of maps. This thesis presents a new technique for deriving three-dimensional facial wrinkle pattern information that can also be utilised in biometric applications, backing up the system for a further increase in security. The procedural approaches adopted for investigating this new technique are the extraction of two-dimensional wrinkle maps of frontal human faces from digital images and the design of a three-dimensional wrinkle pattern formation system that utilises the generated wrinkle maps. The first approach is carried out using image processing tools so that, for any given individual, two wrinkle maps are produced: the first map is in binary form and shows the positions of the wrinkles on the face, while the other is a coloured version that indicates the different intensities of the wrinkles. The second approach, the development of the 3D system, involves the alignment of the binary wrinkle maps on the corresponding 3D face models, followed by the projection of 3D curves in order to acquire 3D representations of the wrinkles. With the aid of the coloured wrinkle maps as well as some ageing parameters, simulations and predictions of the 3D wrinkles are performed.
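The two wrinkle maps described above can be sketched with generic image processing primitives: gradient magnitude as wrinkle evidence, a threshold for the binary map, and quantised strength levels for the intensity ("coloured") map. This is a minimal illustration of the idea, not the thesis's actual extraction pipeline:

```python
import numpy as np

def wrinkle_maps(gray, threshold=0.2, levels=4):
    """From a grayscale face image, return (binary wrinkle-position map,
    quantised wrinkle-intensity map), using gradient magnitude as evidence."""
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    mag = mag / (mag.max() + 1e-9)                    # normalise to [0, 1]
    binary = (mag > threshold).astype(np.uint8)       # wrinkle positions
    intensity = np.ceil(mag * levels).astype(np.uint8) * binary  # strength buckets
    return binary, intensity

# Synthetic face patch with one horizontal "wrinkle" (a dark furrow at row 10)
img = np.ones((20, 20))
img[10, :] = 0.0
binary, intensity = wrinkle_maps(img)
```

The furrow's edges show up as two responding rows in the binary map, while the intensity map grades how pronounced each wrinkle is; the real system would add denoising and face alignment before this step.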

Corrélats neuronaux sous-jacents aux expressions émotionnelles : une comparaison entre musique, voix et visage / Neural correlates underlying emotional expressions: a comparison between music, voice and faces

Aubé, William 11 1900 (has links)
This thesis aims to compare the emotional expressions evoked by music, voices (non-linguistic vocalisations) and faces at the behavioural and neural levels. Specifically, the goal was to exploit the undeniable emotional power of music to refine our understanding of theories and models of emotional processing. Moreover, it has been suggested that music's surprising ability to evoke emotions might arise from its capacity to invade brain circuits dedicated to voice processing, although evidence in this regard remains scarce. A direct comparison between these stimulus categories could therefore help clarify, in part, the nature of musical emotions. To do so, we conducted a series of experiments, presented here in two separate articles. The experiments in the first article compared, at the behavioural level, the effect of emotional expressions on memory across the musical and vocal (non-linguistic) domains. Results revealed a systematic memory advantage for fear in both domains. Individual memory performances were also correlated across musical and vocal expressions. These results are consistent with the hypothesis of similar perceptual processing of music and voices. In the second article, the neural correlates associated with the perception of emotional expressions evoked by music, voices and faces were directly compared using functional magnetic resonance imaging (fMRI). A significant increase in the Blood Oxygen Level Dependent (BOLD) signal was found in the amygdala (and posterior insula) in response to fear across all domains and modalities studied. Subject-specific amygdala BOLD responses to music and voice were also correlated, again suggesting similarities between the two domains. In addition, domain-specific regions were identified: the fusiform gyrus (FG/FFA) for facial expressions, the superior temporal sulcus (STS) for vocal expressions, and an anterior portion of the superior temporal gyrus (STG) particularly sensitive to musical expressions (fear and happiness), whose response was modulated by stimulus intensity. Taken together, these results reveal similarities but also differences in the processing of emotional expressions conveyed by the visual and auditory modalities, and by different domains within audition (music and voice). In particular, musical and vocal expressions appear to share close similarities, especially in the processing of fear. These data add to current knowledge of the emotional power of music and help clarify the perceptual mechanisms underlying the processing of musical emotions. They also provide important support for the use of music in the study of emotions, which could eventually contribute to the development of interventions for psychiatric populations.

Análise da validade, interpretação e preferência da versão brasileira da Escala Facial de Dor - Revisada, em duas amostras clínicas / Analysis of the validity, interpretability and preference of the Brazilian version of the Faces Pain Scale Revised in two clinic samples.

Poveda, Claudia Ligia Esperanza Charry 27 February 2012 (has links)
The Faces Pain Scale – Revised (FPS-R) is one of the most recommended tools for measuring acute pain intensity in children; its original version was tested with Canadian children. The aim of this study was to assess the validity, interpretability and preference of the Brazilian version of the FPS-R (FPS-R-B) in two clinical samples of Brazilian children: one involving acute procedural pain and the other acute post-surgical pain. The first sample comprised 77 children aged 6 to 12 years, of both sexes, undergoing venipuncture for blood collection (procedural pain). These children rated their pain intensity on the FPS-R-B before and after venipuncture. After venipuncture, the Coloured Analogue Scale (CAS) was administered alongside the FPS-R-B; in addition, the children indicated the faces expressing mild, moderate and severe pain, the scale they preferred, and why. The second sample comprised 53 children aged 6 to 12 years, of both sexes, who had undergone minor surgery (post-surgical pain). These children rated, on the FPS-R-B and the CAS, the intensity of the pain they were feeling at the time of the interview. They also indicated the faces expressing mild, moderate and severe pain, their pain treatment threshold (the intensity at which pharmacologic intervention is warranted), the scale they preferred, and why. Comparing FPS-R-B and CAS scores (convergent validity), Kendall's tau coefficients were high and significant in both samples: 0.75 for the procedural pain group and 0.79 for the post-surgical pain group (p < 0.05 in both). In the procedural pain group, the FPS-R-B reflected the changes in pain intensity experienced before and after venipuncture (concurrent validity): Wilcoxon test, z = -6.24, p < 0.05. On the 0-10 scale of the FPS-R-B, the medians (interquartile range, IQR) of the faces indicated as expressing mild, moderate and severe pain were 2 (2-2), 4 (4-6) and 10 (10-10) in the procedural pain group, and 2 (2-2), 6 (4-8) and 10 (10-10) in the post-surgical pain group. The median (IQR) pain treatment threshold (post-surgical group) was 6 (4-10). The FPS-R-B was preferred by 57.1% of the children in the procedural pain group (CAS: 41.6%) and by 66% in the post-surgical pain group (CAS: 34%); this preference was statistically significant only in the post-surgical group (X² = 5.453, p = 0.02). Our results show that the FPS-R-B has properties similar to the original scale and good acceptance among the children interviewed. Determining, for each participant, the values corresponding to the different pain intensities and the pain treatment threshold provides important evidence on the interpretability of the FPS-R.

Segmentação de imagens de pessoas em tempo real para videoconferências

Parolin, Alessandro 22 March 2011 (has links)
Object segmentation is a long-standing topic in computer vision and image processing. Recently, given the evolution of computing hardware and the popularization of the Internet, an application of person segmentation that has gained prominence both academically and commercially is videoconferencing. Videoconferencing benefits many fields, such as telemedicine, distance learning, and above all business, where companies hold global meetings while saving substantial resources. However, videoconferences still do not provide the same experience people have when they share the same room. This work therefore proposes a segmentation system for the speaker's image, specific to videoconferencing, to enable further processing that increases the participants' sense of immersion, for instance replacing the background of the scene with a standard one across all environments. The proposed system uses a dynamic programming algorithm guided by energies extracted from the image, combining edge, motion and probabilistic information. Extensive tests showed that the system achieves results comparable to the state of the art and runs in real time at 8 FPS, even with unoptimized code. Its main advantage over other approaches is that no prior training is required to perform the segmentation.
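The core of such a system — a dynamic programming pass that threads a minimal-energy boundary through the image — can be sketched in seam-carving style. This is an illustration of the DP-with-energies idea, not the thesis's actual energy terms (which combine edge, motion and probability maps):

```python
import numpy as np

def min_energy_path(energy):
    """Dynamic programming: minimal-cost vertical path through an energy map,
    moving down one row at a time with column offsets of -1, 0, +1."""
    h, w = energy.shape
    cost = energy.astype(float).copy()
    for r in range(1, h):
        left  = np.r_[np.inf, cost[r - 1, :-1]]   # come from upper-left
        mid   = cost[r - 1]                       # come from directly above
        right = np.r_[cost[r - 1, 1:], np.inf]    # come from upper-right
        cost[r] += np.minimum(np.minimum(left, mid), right)
    # backtrack from the cheapest bottom cell
    path = [int(np.argmin(cost[-1]))]
    for r in range(h - 2, -1, -1):
        c = path[-1]
        lo, hi = max(c - 1, 0), min(c + 2, w)
        path.append(lo + int(np.argmin(cost[r, lo:hi])))
    return path[::-1]

energy = np.ones((5, 5))
energy[:, 2] = 0.0              # a cheap column the boundary should follow
path = min_energy_path(energy)
```

In the segmentation setting the energy is low along the person's silhouette, so the recovered path traces the boundary between person and background; running it per frame is what makes real-time rates feasible.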

Comparação entre métodos de normalização de iluminação utilizados para melhorar a taxa do reconhecimento facial / Comparison between illumination normalization methods used to improve the rate of facial recognition

Mendonça, Michelle Magalhães 25 June 2008 (has links)
Distinct lighting conditions can produce unequal representations of the same object in an image, hampering segmentation and pattern recognition processes, including facial recognition. The illumination distribution in an image is therefore of great importance, and new normalization algorithms using recent techniques are still being researched. This study evaluated the following illumination normalization algorithms from the literature, which have achieved good face recognition results: LogAbout, a variation of the homomorphic filter, and a wavelet-based method. The goal was to identify the illumination normalization method that yields the best facial recognition rate. The recognition algorithms used were: eigenfaces, PCA (Principal Component Analysis) with an LVQ (Learning Vector Quantization) neural network, and wavelets with an MLP (Multilayer Perceptron) neural network. Images from the Yale face database, divided into three subsets, were used as input. The results showed that the wavelet-based and LogAbout normalization methods provided significant improvements in facial recognition. They also showed that, in general, illumination normalization improves the facial recognition rate, except for the homomorphic filter variation combined with the eigenfaces and wavelet-with-MLP recognition algorithms.
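The methods compared above share one principle: illumination varies slowly across the face, so a log transform followed by high-pass filtering suppresses it while keeping texture. A minimal sketch in the spirit of LogAbout/homomorphic filtering (the exact published formulas differ; the kernel and constants here are illustrative):

```python
import numpy as np

def log_highpass_normalise(img, kernel_size=3):
    """Illumination normalisation sketch: log-transform the image, then
    subtract a local box-filter mean (a crude high-pass), so that slowly
    varying illumination is suppressed and local texture is preserved."""
    x = np.log1p(img.astype(float))
    k = kernel_size // 2
    padded = np.pad(x, k, mode='edge')
    local_mean = np.zeros_like(x)
    for dy in range(-k, k + 1):                 # accumulate the box window
        for dx in range(-k, k + 1):
            local_mean += padded[k + dy:k + dy + x.shape[0],
                                 k + dx:k + dx + x.shape[1]]
    local_mean /= kernel_size ** 2
    return x - local_mean

# A flat texture under a strong left-to-right illumination gradient
img = np.tile(np.linspace(50, 200, 32), (32, 1))
out = log_highpass_normalise(img)
```

The raw log image spans more than a full log unit across the gradient, while the normalised output is nearly flat, which is exactly why recognition rates improve after such preprocessing.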

Interferência emocional de faces de bebês e de adultos no processamento atencional automático em homens e mulheres com e sem filhos

Oliveira, Vanessa Farias January 2015 (has links)
Baby faces are extremely salient emotional stimuli for humans. This study compared the emotional interference of baby faces and threat stimuli (fearful adult faces) on the automatic attention of men and women, with and without children. After a pilot study to adapt instruments and procedures, 61 men and women aged 20 to 35 years, with at least complete elementary school, were recruited. Images of distressed, happy and neutral baby faces and of fearful, happy and neutral adult faces were presented in a Go/No-Go task. Attentional bias indices were computed for distressed baby faces, fearful adult faces, and baby vs. adult faces. Participants with children, regardless of sex, showed a greater nurturing bias (baby vs. adult faces) than those without children. The results indicated that only parental status influenced the bias towards baby faces.
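An attentional bias index of the kind computed above typically contrasts mean reaction times across stimulus categories: faster responses when a category is present indicate attention drawn to it. A generic sketch with hypothetical reaction times (the study's exact index definition may differ):

```python
def attentional_bias(rt_reference, rt_target):
    """Attentional bias index: mean reaction time in the reference condition
    minus mean reaction time in the target condition. Positive values mean
    attention is drawn to the target category (faster responses to it)."""
    mean = lambda xs: sum(xs) / len(xs)
    return mean(rt_reference) - mean(rt_target)

# Hypothetical Go/No-Go reaction times (ms): adult-face trials vs baby-face trials
bias_baby_vs_adult = attentional_bias([520, 540, 510], [480, 470, 490])
```

A participant with this pattern responds about 43 ms faster to baby faces, i.e. a positive nurturing bias; the group comparison then tests whether parents show larger indices than non-parents.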


Detecção de faces e rastreamento da pose da cabeça

Schramm, Rodrigo 20 March 2009 (has links)
Submitted by Mariana Dornelles Vargas (marianadv) on 2015-04-27T19:08:59Z No. of bitstreams: 1 deteccao_faces.pdf: 3878917 bytes, checksum: 2fbf8222ef54d5fc0b1df0bf3b3a5292 (MD5) / Made available in DSpace on 2015-04-27T19:08:59Z (GMT). No. of bitstreams: 1 deteccao_faces.pdf: 3878917 bytes, checksum: 2fbf8222ef54d5fc0b1df0bf3b3a5292 (MD5) Previous issue date: 2009-03-20 / HP - Hewlett-Packard Brasil Ltda / As câmeras de vídeo já fazem parte dos novos modelos de interação entre o homem e a máquina. Através destas, a face e a pose da cabeça podem ser detectadas promovendo novos recursos para o usuário. Entre o conjunto de aplicações que têm se beneficiado deste tipo de recurso estão a vídeo-conferência, os jogos educacionais e de entretenimento, o controle de atenção de motoristas e a medida de foco de atenção. Nesse contexto insere-se essa proposta de mestrado, a qual propõe um novo modelo para detectar e rastrear a pose da cabeça a partir de uma seqüência de vídeo obtida com uma câmera monocular. Para alcançar esse objetivo, duas etapas principais foram desenvolvidas: a detecção da face e o rastreamento da pose. Nessa etapa, a face é detectada em pose frontal utilizando-se um detector com haar-like features. Na segunda etapa do algoritmo, após a detecção da face em pose frontal, atributos específicos da mesma são rastreados para estimar a variação da pose de cabeça. / Video cameras are already part of the new man-machine interaction models. Through these, the face and pose of the head can be found, providing new resources for users. Among the applications that have benefited from this type of resource are video conference, educational and entertainment games, and measurement of attention focus. In this context, this Master's thesis proposes a new model to detect and track the pose of the head in a video sequence captured by a monocular camera. To achieve this goal, two main stages were developed: face detection and head pose tracking. 
The first stage is the starting point for tracking the pose: the face is detected in frontal pose using a detector based on Haar-like features. In the second stage of the algorithm, after the face has been detected in frontal pose, specific attributes of the face are tracked to estimate the change in the pose of the head.
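The Haar-like features used by this kind of frontal-face detector are rectangle-sum differences evaluated in constant time over an integral image. A minimal sketch of that evaluation follows; the function names are illustrative and not taken from the thesis:

```python
def integral_image(img):
    """Summed-area table: ii[y][x] holds the sum of img[0..y-1][0..x-1]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Pixel sum inside the rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def haar_two_rect_vertical(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: top half sum minus bottom half sum."""
    half = h // 2
    return rect_sum(ii, x, y, w, half) - rect_sum(ii, x, y + half, w, half)
```

A detector cascades thousands of such features; each one costs only four table lookups per rectangle regardless of its size, which is what makes Haar-based frontal-face detection fast enough for video.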
140

A bag of features approach for human attribute analysis on face images / Uma abordagem "bag of features" para análise de atributos humanos em imagens de faces

Araujo, Rafael Will Macêdo de 06 September 2019 (has links)
Computer Vision researchers are constantly challenged with questions that are motivated by real applications. One of these questions is whether a computer program could distinguish groups of people based on their geographical ancestry, using only frontal images of their faces. The advances in this research area in the last ten years show that the answer to that question is affirmative. Several papers address this problem by applying methods such as Local Binary Patterns (LBP), raw pixel values, Principal or Independent Component Analysis (PCA/ICA), Gabor filters, Biologically Inspired Features (BIF), and more recently, Convolutional Neural Networks (CNN). In this work we propose to combine the Bag-of-Visual-Words model with new dictionary learning techniques and a new spatial structure approach for image features. An extensive set of experiments has been performed using two of the largest face image databases available (MORPH-II and FERET), reaching very competitive results for gender and ethnicity recognition while using a considerably small set of images for training.
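The Bag-of-Visual-Words encoding at the core of this approach assigns each local descriptor to its nearest codebook entry and summarizes an image as a normalized histogram of those assignments. A minimal sketch follows, assuming a toy codebook standing in for one learned by dictionary learning; all names are illustrative:

```python
import math

def nearest_codeword(desc, codebook):
    """Index of the codebook entry closest to the descriptor (Euclidean distance)."""
    dists = [math.dist(desc, c) for c in codebook]
    return dists.index(min(dists))

def bovw_histogram(descriptors, codebook):
    """Hard-assignment bag-of-visual-words histogram, L1-normalized."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest_codeword(d, codebook)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]
```

The resulting fixed-length histogram can then be fed to any standard classifier for attribute recognition, regardless of how many local descriptors each face image produced.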
