Neural networks are at the core of computer vision solutions for a wide range of applications. With the advent of deep neural networks, Facial Expression Recognition (FER) has become a central and challenging task in the field of computer vision. Micro-expressions (MEs) are prominently used in security, psychotherapy, and neuroscience, and play a broad role in several related disciplines. However, because they arise from subtle movements of the facial muscles, micro-expressions are difficult to detect and identify. For these reasons, emotion detection and classification have long been active research topics. Networks recently adopted for FER have yet to address overfitting, which is caused by insufficient training data and by expression-unrelated variations such as gender bias and face occlusions. Combining FER with Speech Emotion Recognition (SER) triggered the development of multimodal neural networks for emotion classification. Sensors played a significant role in these networks: by providing high-quality inputs, they substantially increased accuracy and further improved the efficiency of the system. This thesis explores the principles behind the application of deep neural networks to emotion recognition, with a strong focus on Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs). A Motion Magnification algorithm for ME detection and classification was implemented for applications requiring near real-time computation. A new and improved architecture using a Multimodal Network was implemented. In addition to using the motion magnification technique for emotion extraction and classification, the multimodal algorithm takes audio-visual cues as inputs and reads the MEs on the real face of the participant.
This feature of the architecture can be deployed when conducting interviews, supervising ICU patients in hospitals, in the automotive industry, and in many other settings. The real-time emotion classifier, based on a state-of-the-art Image-Avatar Animation model, was tested on simulated subjects. The salient features of the real face are mapped onto avatars that are built with a 3D scene generation platform. For emotion classification, the Image Animation model outperforms all baselines and prior work. Extensive tests and the results obtained demonstrate the validity of the approach.
Identifier | oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/43292 |
Date | 14 February 2022 |
Creators | Ayyalasomayajula, Satya Chandrashekhar |
Contributors | Ionescu, Dan |
Publisher | Université d'Ottawa / University of Ottawa |
Source Sets | Université d’Ottawa |
Language | English |
Detected Language | English |
Type | Thesis |
Format | application/pdf |
Rights | Attribution 4.0 International, http://creativecommons.org/licenses/by/4.0/ |