Global ETD Search

1	Recurrent Spatial Attention for Facial Emotion Recognition
Forch, Valentin; Vitay, Julien; Hamker, Fred H.
15 October 2020
Automatic processing of emotion information through deep neural networks (DNNs) can bring substantial benefits, for example in human-machine interaction. Conversely, machine learning can profit from concepts known from human information processing, such as visual attention. We employed a recurrent DNN incorporating a spatial attention mechanism for facial emotion recognition (FER) and compared the output of the network with results from human experiments. The attention mechanism enabled the network to select relevant face regions and achieve state-of-the-art performance on a FER database containing images from realistic settings. When the model’s perceptive capabilities were restricted, a visual search strategy emerged that showed some similarities with human saccadic behavior. However, the model then failed to form a useful scene representation.
Keywords: emotion recognition, attention, LSTM, Deep Learning
Classification: info:eu-repo/classification/ddc/004.152 (ddc:004.152)
2	Multi-objective optimization for model selection in music classification / Flermålsoptimering för modellval i musikklassificering
Ujihara, Rintaro
January 2021
With the breakthrough of machine learning techniques, research on music emotion classification has made notable progress by combining various audio features with state-of-the-art machine learning models. Still, how music samples should be preprocessed and which classification algorithm should be used depend on the data set and on the objective of each project. The collaborating company of this thesis, Ichigoichie AB, is currently developing a system to categorize music data into positive and negative classes. To improve the accuracy of the existing system, this project aims to find the best model through experiments with six audio features (Mel spectrogram, MFCC, HPSS, Onset, CENS, Tonnetz) and several machine learning models, including deep neural network models, for the classification task. Hyperparameter tuning is performed for each model, and the models are evaluated according to Pareto optimality with regard to accuracy and execution time. The results show that the most promising model achieved 95% correct classification with an execution time of less than 15 seconds.
Keywords: Music emotion recognition, Mel spectrogram, MFCC, CENS, Onset, Tonnetz, HPSS, 1D convolutional neural network, Attention, LSTM, 1DCNN, BiLSTM, Pareto optimality
Subject: Mathematics
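The Pareto-optimality criterion mentioned in the abstract above can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Candidate` type, the function names, and all numbers below are hypothetical and serve only to show how models are compared on accuracy (higher is better) and execution time (lower is better).

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical record for one trained model (not taken from the thesis)."""
    name: str
    accuracy: float  # fraction of correct classifications, higher is better
    seconds: float   # execution time, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if `a` is no worse than `b` on both objectives
    and strictly better on at least one of them."""
    no_worse = a.accuracy >= b.accuracy and a.seconds <= b.seconds
    strictly_better = a.accuracy > b.accuracy or a.seconds < b.seconds
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up values for illustration only; they are not results from the thesis.
candidates = [
    Candidate("model_a", 0.91, 12.0),
    Candidate("model_b", 0.89, 30.0),  # dominated by model_a
    Candidate("model_c", 0.85, 5.0),
    Candidate("model_d", 0.96, 60.0),
    Candidate("model_e", 0.85, 8.0),   # dominated by model_c
]

front = pareto_front(candidates)
print([c.name for c in front])
```

Every model on the resulting front represents a different trade-off between the two objectives; choosing a single model from it still requires a preference, such as a maximum acceptable execution time.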
