
Multimodal Emotion Recognition Using Temporal Convolutional Networks

Over the past decade, the field of affective computing has received increasing attention. With advances in machine learning, a wide range of methods has been developed to better understand human emotions. However, a major challenge in this field is accurately modeling emotions along continuous dimensions, such as arousal and valence. This kind of modeling is essential for representing complex, subtle emotions and capturing the full spectrum of human emotional experience. Predicting how emotions evolve over time adds another layer of complexity, since emotional states shift continuously.
Our work addresses these challenges using a dataset of natural, spontaneous emotions from diverse individuals. We extract multiple features from three modalities, audio, video, and text, and use them to predict emotion along three axes: arousal, valence, and liking. To achieve this, we employ deep features and several fusion techniques to combine the modalities. Our results demonstrate that temporal convolutional networks (TCNs) outperform long short-term memory (LSTM) models at multimodal emotion prediction.
Overall, our research contributes to advancing the field of affective computing by developing more accurate and comprehensive methods for modeling and predicting human emotions.
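To make the architecture concrete, below is a minimal sketch of a TCN with early (feature-level) fusion, written in PyTorch. It is an illustration of the general technique only, not the thesis's actual model: the feature dimensions, number of layers, and choice of early fusion are assumptions made for the example.

import torch
import torch.nn as nn

class TemporalBlock(nn.Module):
    # One TCN block: two dilated causal convolutions plus a residual connection.
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left-pad so no future frames leak in
        self.conv1 = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)
        self.conv2 = nn.Conv1d(out_ch, out_ch, kernel_size, dilation=dilation)
        self.relu = nn.ReLU()
        self.downsample = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):
        out = self.relu(self.conv1(nn.functional.pad(x, (self.pad, 0))))
        out = self.relu(self.conv2(nn.functional.pad(out, (self.pad, 0))))
        return self.relu(out + self.downsample(x))

class FusionTCN(nn.Module):
    # Early fusion: concatenate per-frame audio/video/text features, then regress
    # arousal, valence, and liking at every time step. Dimensions are illustrative.
    def __init__(self, audio_dim=88, video_dim=512, text_dim=300,
                 hidden=64, levels=4, kernel_size=3, n_targets=3):
        super().__init__()
        blocks, in_ch = [], audio_dim + video_dim + text_dim
        for i in range(levels):
            blocks.append(TemporalBlock(in_ch, hidden, kernel_size, dilation=2 ** i))
            in_ch = hidden
        self.tcn = nn.Sequential(*blocks)
        self.head = nn.Conv1d(hidden, n_targets, 1)

    def forward(self, audio, video, text):
        # Each modality arrives as (batch, time, features), aligned frame by frame.
        x = torch.cat([audio, video, text], dim=-1).transpose(1, 2)
        return self.head(self.tcn(x)).transpose(1, 2)  # (batch, time, 3)

# Usage on random stand-in features: 2 clips, 100 aligned frames each.
model = FusionTCN()
audio = torch.randn(2, 100, 88)
video = torch.randn(2, 100, 512)
text = torch.randn(2, 100, 300)
print(model(audio, video, text).shape)  # torch.Size([2, 100, 3])

With four levels and kernel size 3, the stack sees roughly 1 + 2*(3-1)*(2^4 - 1) = 61 past frames at each output step; this exponentially growing receptive field, computed in parallel rather than sequentially, is the usual argument for preferring TCNs over LSTMs on long sequences.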

Identifier: oai:union.ndltd.org:uottawa.ca/oai:ruor.uottawa.ca:10393/45175
Date: 19 July 2023
Creators: Harb, Hussein
Contributors: Al Osman, Hussein
Publisher: Université d'Ottawa / University of Ottawa
Source Sets: Université d'Ottawa
Language: English
Detected Language: English
Type: Thesis
Format: application/pdf
