1 |
Feature Fusion Deep Learning Method for Video and Audio Based Emotion Recognition
Song, Yanan (11825003), 20 December 2021 (has links)
In this thesis, we propose a deep learning based emotion recognition system to improve the classification success rate. We first use transfer learning to extract visual features and Mel-frequency cepstral coefficients (MFCCs) to extract audio features, and then apply recurrent neural networks (RNNs) with an attention mechanism to process the sequential inputs. The outputs of both channels are then fused in a concatenation layer, which is processed with batch normalization to reduce internal covariate shift. Finally, the classification result is obtained from the softmax layer. In our experiments, the video and audio subsystems achieve 78% and 77% accuracy respectively, while the feature fusion system combining video and audio achieves 92% accuracy on the RAVDESS dataset for eight emotion classes. Our proposed feature fusion system outperforms conventional methods in prediction accuracy.
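The fusion head described above (concatenation, batch normalization, softmax over eight emotion classes) can be sketched numerically. A minimal sketch follows; the feature dimensions, random weights, and single dense layer are illustrative assumptions, not the thesis's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature dimensions (the actual sizes are not given in the abstract).
batch, video_dim, audio_dim, n_classes = 4, 128, 64, 8

video_feats = rng.standard_normal((batch, video_dim))  # stand-in for transfer-learning visual features
audio_feats = rng.standard_normal((batch, audio_dim))  # stand-in for MFCC-derived audio features

# 1. Fuse the two channels in a concatenation layer.
fused = np.concatenate([video_feats, audio_feats], axis=1)

# 2. Batch normalization: zero mean, unit variance per feature across the batch.
eps = 1e-5
fused_bn = (fused - fused.mean(axis=0)) / np.sqrt(fused.var(axis=0) + eps)

# 3. A dense layer followed by softmax yields the 8-class emotion distribution.
W = 0.01 * rng.standard_normal((video_dim + audio_dim, n_classes))
logits = fused_bn @ W
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)

print(probs.shape)        # one 8-way distribution per sample
print(probs.sum(axis=1))  # each row sums to 1
```

In a trained system the dense weights would be learned jointly with the two channels; the sketch only shows the shape of the data flow through the fusion layer.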
|
2 |
Temporal Localization of Representations in Recurrent Neural Networks
Najam, Asadullah, January 2023 (has links)
Recurrent Neural Networks (RNNs) are pivotal in deep learning for time series prediction, but they suffer from "exploding values" and "gradient decay," particularly when learning temporally distant interactions. Long Short-Term Memory (LSTM) units and Gated Recurrent Units (GRUs) have addressed these issues to an extent, but the precise mitigating mechanisms remain unclear. Moreover, the success of feedforward neural networks in time series tasks using an "attention mechanism" raises questions about the solutions offered by LSTMs and GRUs. This study explores an alternative explanation for the challenges faced by RNNs in learning long-range correlations in the input data. Could the issue lie in the movement of the representations (how hidden nodes store and process information) across nodes, rather than in their localization? The evidence presented suggests that RNNs can indeed exhibit "moving representations," and that certain training conditions reduce this movement. These findings point to the need for further research on localizing representations.
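The "gradient decay" the abstract refers to can be shown with a small numerical sketch: in a vanilla tanh RNN whose recurrent weight matrix has spectral radius below 1, the Jacobian of the hidden state with respect to a distant past state shrinks multiplicatively over time. The hidden size, sequence length, seed, and 0.9 spectral radius below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
hidden, T = 16, 50

# Recurrent weight scaled to spectral radius 0.9 (< 1), the vanishing-gradient regime.
W = rng.standard_normal((hidden, hidden))
W *= 0.9 / np.abs(np.linalg.eigvals(W)).max()

x = 0.5 * rng.standard_normal((T, hidden))  # a random input sequence
h = np.zeros(hidden)
jac = np.eye(hidden)  # d h_t / d h_0, accumulated by the chain rule
norms = []
for t in range(T):
    h = np.tanh(W @ h + x[t])
    # One-step Jacobian wrt the previous hidden state: diag(1 - h^2) @ W
    jac = np.diag(1.0 - h**2) @ W @ jac
    norms.append(np.linalg.norm(jac))

print(norms[0], norms[-1])  # the gradient norm decays sharply with distance
```

With spectral radius above 1 the same product of Jacobians instead grows, giving the "exploding values" case; gating units (LSTM/GRU) and attention are the standard responses to this multiplicative behavior.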
|