1 |
Feature Fusion Deep Learning Method for Video and Audio Based Emotion Recognition
Song, Yanan (11825003), 20 December 2021 (has links)
In this thesis, we propose a deep learning based emotion recognition system to improve the classification success rate. We first use transfer learning to extract visual features and Mel-frequency cepstral coefficients (MFCCs) to extract audio features, and then apply recurrent neural networks (RNNs) with an attention mechanism to process the sequential inputs. The outputs of both channels are then fused in a concatenation layer, which is processed with batch normalization to reduce internal covariate shift. Finally, the classification result is obtained from the softmax layer. In our experiments, the video and audio subsystems achieve 78% and 77% accuracy respectively, while the feature fusion system combining video and audio achieves 92% accuracy on the RAVDESS dataset for eight emotion classes. Our proposed feature fusion system outperforms conventional methods in prediction accuracy.
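The fusion head described above (concatenation, batch normalization, softmax over eight emotion classes) can be sketched numerically. A minimal sketch follows; the feature dimensions, random weights, and single dense layer are illustrative assumptions, not the thesis's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature dimensions (the actual sizes are not given in the abstract).
batch, video_dim, audio_dim, n_classes = 4, 128, 64, 8

video_feats = rng.standard_normal((batch, video_dim))  # stand-in for transfer-learning visual features
audio_feats = rng.standard_normal((batch, audio_dim))  # stand-in for MFCC-derived audio features

# 1. Fuse the two channels in a concatenation layer.
fused = np.concatenate([video_feats, audio_feats], axis=1)

# 2. Batch normalization: zero mean, unit variance per feature across the batch.
eps = 1e-5
fused_bn = (fused - fused.mean(axis=0)) / np.sqrt(fused.var(axis=0) + eps)

# 3. A dense layer followed by softmax yields the 8-class emotion distribution.
W = 0.01 * rng.standard_normal((video_dim + audio_dim, n_classes))
logits = fused_bn @ W
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)

print(probs.shape)        # one 8-way distribution per sample
print(probs.sum(axis=1))  # each row sums to 1
```

In a trained system the dense weights would be learned jointly with the two channels; the sketch only shows the shape of the data flow through the fusion layer.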
|
2 |
Temporal Localization of Representations in Recurrent Neural Networks
Najam, Asadullah, January 2023 (has links)
Recurrent Neural Networks (RNNs) are pivotal in deep learning for time series prediction, but they suffer from "exploding values" and "gradient decay," particularly when learning temporally distant interactions. Long Short-Term Memory (LSTM) units and Gated Recurrent Units (GRUs) have addressed these issues to an extent, but the precise mitigating mechanisms remain unclear. Moreover, the success of feedforward neural networks in time series tasks using an "attention mechanism" raises questions about the solutions offered by LSTMs and GRUs. This study explores an alternative explanation for the challenges faced by RNNs in learning long-range correlations in the input data. Could the issue lie in the movement of the representations (how hidden nodes store and process information) across nodes, rather than in their localization? The evidence presented suggests that RNNs can indeed exhibit "moving representations," and that certain training conditions reduce this movement. These findings point to the need for further research on localizing representations.
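The "gradient decay" the abstract refers to can be shown with a small numerical sketch: in a vanilla tanh RNN whose recurrent weight matrix has spectral radius below 1, the Jacobian of the hidden state with respect to a distant past state shrinks multiplicatively over time. The hidden size, sequence length, seed, and 0.9 spectral radius below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
hidden, T = 16, 50

# Recurrent weight scaled to spectral radius 0.9 (< 1), the vanishing-gradient regime.
W = rng.standard_normal((hidden, hidden))
W *= 0.9 / np.abs(np.linalg.eigvals(W)).max()

x = 0.5 * rng.standard_normal((T, hidden))  # a random input sequence
h = np.zeros(hidden)
jac = np.eye(hidden)  # d h_t / d h_0, accumulated by the chain rule
norms = []
for t in range(T):
    h = np.tanh(W @ h + x[t])
    # One-step Jacobian wrt the previous hidden state: diag(1 - h^2) @ W
    jac = np.diag(1.0 - h**2) @ W @ jac
    norms.append(np.linalg.norm(jac))

print(norms[0], norms[-1])  # the gradient norm decays sharply with distance
```

With spectral radius above 1 the same product of Jacobians instead grows, giving the "exploding values" case; gating units (LSTM/GRU) and attention are the standard responses to this multiplicative behavior.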
|