111
Improving Music Mood Annotation Using Polygonal Circular Regression
Dufour, Isabelle, 31 August 2015
Music mood recognition by machine continues to attract attention from both academia and industry. This thesis explores the hypothesis that the music emotion problem is circular, and is a primary step in determining the efficacy of circular regression as a machine learning method for automatic music mood recognition. This hypothesis is tested through experiments conducted using instances of the two commonly accepted models of affect used in machine learning (categorical and two-dimensional), as well as on an original circular model proposed by the author. Polygonal approximations of circular regression are proposed as a practical way to investigate whether the circularity of the annotations can be exploited. An original dataset assembled and annotated for the models is also presented. Next, the architecture and implementation choices of all three models are given, with an emphasis on the new polygonal approximations of circular regression. Experiments with different polygons demonstrate consistent and in some cases significant improvements over the categorical model on a dataset containing ambiguous extracts (ones on which the human annotators did not fully agree). Through a comprehensive analysis of the results, errors, and inconsistencies observed, evidence is provided that mood recognition can be improved if approached as a circular problem. Finally, a multi-tagging strategy based on the circular predictions is put forward as a pragmatic method to automatically annotate music based on the circular model.
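To illustrate the idea of approximating a circular mood space with a polygon, here is a minimal sketch in Python. It is not the author's implementation; the use of NumPy, the octagon, and all function names are assumptions for illustration only.

```python
import numpy as np

def polygon_vertices(k):
    """Angles (radians) of the vertices of a regular k-gon inscribed in the mood circle."""
    return np.linspace(0.0, 2.0 * np.pi, k, endpoint=False)

def snap_to_polygon(theta, k):
    """Approximate a circular mood annotation theta by the nearest k-gon vertex."""
    verts = polygon_vertices(k)
    # signed circular differences, wrapped to (-pi, pi]
    diffs = np.angle(np.exp(1j * (verts - theta)))
    return verts[np.argmin(np.abs(diffs))]

def circular_error(pred, true):
    """Shortest angular distance between a prediction and the ground truth."""
    return abs(np.angle(np.exp(1j * (pred - true))))

# e.g. an annotation at 100 degrees snapped to an octagon vertex
print(np.degrees(snap_to_polygon(np.radians(100.0), k=8)))  # -> 90.0
```

Snapping annotations to polygon vertices turns the continuous circular problem into a discrete one while keeping a circular notion of prediction error.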
112
Hotspot Detection for Automatic Podcast Trailer Generation
Zhu, Winstead Xingran, January 2021
With podcasts being a fast-growing audio-only form of media, an effective way of promoting different podcast shows becomes more and more vital to all the stakeholders concerned, including the podcast creators, the podcast streaming platforms, and the podcast listeners. This thesis investigates the relatively little-studied topic of automatic podcast trailer generation, with the purpose of enhancing the overall visibility and publicity of different podcast contents and generating more user engagement in podcast listening. This thesis takes a hotspot-based approach, specifically defining the vague concept of “hotspot” and designing appropriate methods for hotspot detection. Different methods are analyzed and compared, and the best methods are selected. The selected methods are then used to construct an automatic podcast trailer generation system, which consists of four major components and one schema to coordinate the components. The system can take a random podcast episode audio as input and generate a trailer of around one minute for it. This thesis also proposes two human-based podcast trailer evaluation approaches, and the evaluation results show that the proposed system outperforms the baseline by a large margin and achieves promising results in terms of both aesthetics and functionality.
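The final trailer-assembly step could be sketched as a greedy selection over scored segments. This is a minimal illustration under assumed inputs, not the thesis's actual system; the segment dictionary keys and the greedy policy are assumptions.

```python
def assemble_trailer(segments, target_len=60.0):
    """Greedy sketch: rank candidate segments by hotspot score, then keep
    the best ones until the trailer reaches roughly target_len seconds."""
    ranked = sorted(segments, key=lambda s: s["score"], reverse=True)
    chosen, total = [], 0.0
    for seg in ranked:
        length = seg["end"] - seg["start"]
        if total + length <= target_len:
            chosen.append(seg)
            total += length
    # play the chosen hotspots in their original narrative order
    return sorted(chosen, key=lambda s: s["start"])

# e.g. segments scored by an upstream hotspot detector (keys are assumed)
segs = [{"start": 30.0, "end": 55.0, "score": 0.9},
        {"start": 300.0, "end": 330.0, "score": 0.7},
        {"start": 600.0, "end": 640.0, "score": 0.4}]
print(assemble_trailer(segs))  # keeps the first two segments (55 s total)
```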
113
Classification of Affective Emotion in Musical Themes: How to understand the emotional content of movie soundtracks?
Diaz Banet, Paula, January 2021
Music is created by composers to arouse different emotions and feelings in the listener and, in the case of soundtracks, to support the storytelling of scenes. The goal of this project is to find the best method to evaluate the emotional content of soundtracks. This emotional content can be measured quantitatively using Russell's model of valence, arousal, and dominance (VAD), which converts mood labels into numbers. To conduct the analysis, MFCCs and VGGish features were extracted from the soundtracks and used as inputs to a CNN and an LSTM model, in order to study which one achieved better predictions. A database of 6757 soundtracks with their corresponding VAD values was created to perform this analysis. Ultimately, the results of the experiments will help the start-up Vionlabs better understand the content of movies and therefore make more accurate recommendations about what users want to consume on Video on Demand platforms according to their emotions or moods.
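A minimal sketch of the kind of pipeline described above, assuming librosa for MFCC extraction and Keras for the CNN; the layer sizes, sample rate, and excerpt length are illustrative assumptions, not the thesis's actual configuration.

```python
import librosa
from tensorflow.keras import layers, models

def mfcc_matrix(path, n_mfcc=20, duration=30.0):
    """Extract an MFCC matrix (n_mfcc x frames) from one soundtrack excerpt."""
    y, sr = librosa.load(path, sr=22050, mono=True, duration=duration)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)

def build_vad_cnn(input_shape):
    """Minimal CNN regressor mapping an MFCC matrix to three VAD values."""
    model = models.Sequential([
        layers.Input(shape=input_shape + (1,)),   # MFCCs as a 1-channel image
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(32, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(3),                          # valence, arousal, dominance
    ])
    model.compile(optimizer="adam", loss="mse")
    return model
```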
114
Sex differences in cognition in Alzheimer's disease
Irvine, Karen, January 2014
Inspection of the published research shows that sex differences in cognition in the general population have been widely cited, with the direction of the advantage depending on the domain being examined. The most prevalent claims are that men are better than women at visuospatial and mathematical tasks, whereas women have superior verbal skills and perform better than men on tasks assessing episodic memory. There is also some evidence that women are more accurate than men at identifying facial expressions of emotion. A more in-depth examination of the literature, however, reveals that evidence of such differences is not as conclusive as it would at first appear. Not only are the direction and magnitude of sex differences dependent on the cognitive domain but also on the individual tasks: some visuospatial tasks show no difference (e.g. figure copying), whilst men have been shown to be better than women at confrontation naming (a verbal task). Alzheimer's disease (AD) is a heterogeneous illness that affects the elderly. It manifests with deficits in cognitive abilities and behavioural difficulties. It has been suggested that some of the behavioural issues may arise from difficulties with recognising facial expressions of emotion. There have been claims that AD affects men and women differently: women have been reported as being more likely to develop AD and as showing greater dementia severity than men with equivalent neuropathology. Despite this, research into sex differences in cognition in AD is scarce and conflicting. This research was concerned with the effect of sex on the cognitive abilities of AD patients. The relative performance of men and women with AD was compared to that of elderly controls. The study focused on the verbal, visuospatial and facial emotion recognition domains. Data were collected and analysed from 70 AD patients (33 male, 37 female), 62 elderly controls (31 male, 31 female) and 80 young adults (40 male, 40 female). Results showed that those with AD demonstrated cognitive deficits compared to elderly controls in verbal and visuospatial tasks but not in the recognition of facial emotions. There were no significant sex differences in either the young adults or the healthy elderly controls, but sex differences favouring men emerged in the AD group for figure copying and recall and for confrontation naming. Given that elderly men and women perform equivalently on these tasks, this represents a deterioration in women's cognitive abilities relative to men's. Further evidence of such an adverse effect of AD was apparent in other tasks too: for most verbal and visuospatial tasks, either an effect favouring women in the elderly is reversed or a male advantage increases in magnitude. There is no evidence of sex differences in facial emotion recognition for any group. This suggests that the lack of published findings on sex differences in this domain is due to the difficulty of getting null findings accepted for publication. The scarcity of research examining sex differences in other domains is also likely to be due to this bias.
115
Evaluation of the influence of emotions on the decision-making of computer systems
Gracioso, Ana Carolina Nicolosi da Rocha, 17 March 2016
This work evaluates the influence of human emotions, expressed through facial mimics, on the decision-making of computer systems, with the aim of improving the user experience. Three modules were developed. The first comprises an assistive computing system: a digital augmentative and alternative communication board. The second module, called the Affective Module, is an affective computing system which, through Computer Vision, captures the user's facial mimics and classifies their emotional state. This second module was implemented in two stages, both inspired by the Facial Action Coding System (FACS), which identifies facial expressions based on the human cognitive system. In the first stage, the Affective Module infers the basic emotional states, namely happiness, surprise, anger, fear, sadness, disgust, and the neutral state. According to most researchers in the field, basic emotions are innate and universal, which makes the Affective Module generalizable to any population. Tests undertaken with the proposed model produced results 10.9% above those of approaches that employ similar methodologies. Spontaneous emotions were also analyzed, and the computational results approach the accuracy rates of human judges. In the second stage of the development of the Affective Module, the goal was to identify facial expressions that reflect dissatisfaction or difficulty during the use of computer systems, and the first model of the Affective Module was adjusted to this end. Finally, a Decision-Making Module was developed that receives information from the Affective Module and intervenes in the computer system. Parameters such as icon size, drag converted into a click, and scanning speed are changed in real time by the Decision-Making Module in the assistive computer system, according to the information generated by the Affective Module. Since the Affective Module does not have a training stage for inferring the emotional state, a neutral-face algorithm was proposed to solve the problem of initialization with faces expressing emotions. This work also proposes dividing rapid facial signals into baseline signals (tics and other noise in facial movement that are not emotional signals) and emotional signals. The results of the case studies carried out with students of the APAE in Presidente Prudente showed that the user experience can be improved by configuring a computer system with emotional information expressed by facial mimics.
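The real-time adaptation performed by the Decision-Making Module can be illustrated with a small rule-based sketch; the dictionary keys, adjustment factors, and state labels below are assumptions for illustration, not the thesis's actual parameters.

```python
def adapt_interface(ui, state):
    """Rule-based sketch of a Decision-Making Module: when the Affective
    Module reports difficulty or dissatisfaction, make the assistive
    communication board easier to operate."""
    if state in ("difficulty", "dissatisfaction"):
        ui["icon_size"] = min(ui["icon_size"] * 1.25, ui["max_icon_size"])
        ui["scan_speed"] = max(ui["scan_speed"] * 0.8, ui["min_scan_speed"])
        ui["drag_as_click"] = True   # accept a drag gesture as a click
    return ui

# e.g. one intervention step on a hypothetical board configuration
board = {"icon_size": 48, "max_icon_size": 96,
         "scan_speed": 1.0, "min_scan_speed": 0.4, "drag_as_click": False}
print(adapt_interface(board, "difficulty"))
```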
116
Computational modeling for emotion recognition based on facial analysis
Libralon, Giampaolo Luiz, 24 November 2014
Emotions are the object of study not only of psychology but also of various research areas such as philosophy, psychiatry, biology, neuroscience and, since the second half of the twentieth century, the cognitive sciences. A number of emotional theories and models have been proposed, but there is no consensus on the choice of any one of these models or theories. In this sense, several researchers argue that there is a set of basic emotions that have been preserved during the evolutionary process because they serve specific purposes. However, how many and which the accepted basic emotions are is still a topic of discussion. In general, the most widespread model of basic emotions is the one proposed by Paul Ekman, which asserts the existence of six emotions: happiness, sadness, fear, anger, disgust, and surprise. Studies also indicate the existence of a small set of universal facial expressions related to the six basic emotions. In the context of human-machine interaction, the relationship between human beings and machines is becoming increasingly natural and social. Thus, as interfaces evolve, the ability to interpret the emotional signals of interlocutors and react to them in an appropriate manner is a challenge to be overcome. Even though human beings express emotions in different ways, there is evidence that emotions are most accurately described by facial expressions. In order to obtain interfaces that allow more natural and realistic interactions, a computational model based on psychological and biological principles was developed in this thesis to simulate the emotion recognition system of human beings. It identifies the emotional state in distinct steps: the use of a pre-attentive visual mechanism, which quickly interprets the most likely emotions; the detection of the facial features most relevant for recognizing the identified emotional expressions; and the analysis of geometric facial features to determine the final emotional state. A number of experiments demonstrated that the proposed model achieves high accuracy rates and good generalization performance, and allows the interpretability of the facial features it identifies.
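The final step, analysis of geometric facial features, might look like the following sketch; the landmark names and the ratios chosen are illustrative assumptions, not the thesis's actual descriptor set.

```python
import numpy as np

def geometric_features(pts):
    """Illustrative geometric descriptors computed from 2-D facial landmarks,
    normalised by inter-ocular distance so they are scale-invariant."""
    def dist(a, b):
        return float(np.linalg.norm(np.asarray(pts[a]) - np.asarray(pts[b])))
    scale = dist("eye_outer_l", "eye_outer_r")    # normalising length
    return {
        "mouth_width": dist("mouth_l", "mouth_r") / scale,
        "mouth_opening": dist("lip_top", "lip_bottom") / scale,
        "brow_raise_l": dist("brow_l", "eye_top_l") / scale,
    }

# e.g. a wide, open mouth (high mouth_width and mouth_opening) is consistent
# with happiness or surprise; a raised brow with surprise or fear
```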
117
Recognizing emotions in spoken dialogue with acoustic and lexical cues
Tian, Leimin, January 2018
Automatic emotion recognition has long been a focus of Affective Computing. It has become increasingly apparent that awareness of human emotions in Human-Computer Interaction (HCI) is crucial for advancing related technologies, such as dialogue systems. However, performance of current automatic emotion recognition is disappointing compared to human performance. Current research on emotion recognition in spoken dialogue focuses on identifying better feature representations and recognition models from a data-driven point of view. The goal of this thesis is to explore how incorporating prior knowledge of human emotion recognition in the automatic model can improve state-of-the-art performance of automatic emotion recognition in spoken dialogue. Specifically, we study this by proposing knowledge-inspired features representing occurrences of disfluency and non-verbal vocalisation in speech, and by building a multimodal recognition model that combines acoustic and lexical features in a knowledge-inspired hierarchical structure. In our study, emotions are represented with the Arousal, Expectancy, Power, and Valence emotion dimensions. We build unimodal and multimodal emotion recognition models to study the proposed features and modelling approach, and perform emotion recognition on both spontaneous and acted dialogue. Psycholinguistic studies have suggested that DISfluency and Non-verbal Vocalisation (DIS-NV) in dialogue is related to emotions. However, these affective cues in spoken dialogue are overlooked by current automatic emotion recognition research. Thus, we propose features for recognizing emotions in spoken dialogue which describe five types of DIS-NV in utterances, namely filled pause, filler, stutter, laughter, and audible breath. Our experiments show that this small set of features is predictive of emotions. Our DIS-NV features achieve better performance than benchmark acoustic and lexical features for recognizing all emotion dimensions in spontaneous dialogue. Consistent with Psycholinguistic studies, the DIS-NV features are especially predictive of the Expectancy dimension of emotion, which relates to speaker uncertainty. Our study illustrates the relationship between DIS-NVs and emotions in dialogue, which contributes to Psycholinguistic understanding of them as well. Note that our DIS-NV features are based on manual annotations, yet our long-term goal is to apply our emotion recognition model to HCI systems. Thus, we conduct preliminary experiments on automatic detection of DIS-NVs, and on using automatically detected DIS-NV features for emotion recognition. Our results show that DIS-NVs can be automatically detected from speech with stable accuracy, and auto-detected DIS-NV features remain predictive of emotions in spontaneous dialogue. This suggests that our emotion recognition model can be applied to a fully automatic system in the future, and holds the potential to improve the quality of emotional interaction in current HCI systems. To study the robustness of the DIS-NV features, we conduct cross-corpora experiments on both spontaneous and acted dialogue. We identify how dialogue type influences the performance of DIS-NV features and emotion recognition models. DIS-NVs contain additional information beyond acoustic characteristics or lexical contents. Thus, we study the gain of modality fusion for emotion recognition with the DIS-NV features. 
Previous work combines different feature sets by fusing modalities at the same level using two types of fusion strategies: Feature-Level (FL) fusion, which concatenates feature sets before recognition; and Decision-Level (DL) fusion, which makes the final decision based on outputs of all unimodal models. However, features from different modalities may describe data at different time scales or levels of abstraction. Moreover, Cognitive Science research indicates that when perceiving emotions, humans make use of information from different modalities at different cognitive levels and time steps. Therefore, we propose a HierarchicaL (HL) fusion strategy for multimodal emotion recognition, which incorporates features that describe data at a longer time interval or which are more abstract at higher levels of its knowledge-inspired hierarchy. Compared to FL and DL fusion, HL fusion incorporates both inter- and intra-modality differences. Our experiments show that HL fusion consistently outperforms FL and DL fusion on multimodal emotion recognition in both spontaneous and acted dialogue. The HL model combining our DIS-NV features with benchmark acoustic and lexical features improves current performance of multimodal emotion recognition in spoken dialogue. To study how other emotion-related tasks of spoken dialogue can benefit from the proposed approaches, we apply the DIS-NV features and the HL fusion strategy to recognize movie-induced emotions. Our experiments show that although designed for recognizing emotions in spoken dialogue, DIS-NV features and HL fusion remain effective for recognizing movie-induced emotions. This suggests that other emotion-related tasks can also benefit from the proposed features and model structure.
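The contrast between the three fusion strategies can be sketched schematically as follows; the model objects with a predict method are placeholders, and routing acoustic evidence to the lower level with lexical and DIS-NV evidence at the upper level is a simplification of the thesis's knowledge-inspired hierarchy.

```python
import numpy as np

def feature_level(acoustic, lexical, disnv, model):
    """FL fusion: concatenate all feature sets before a single recognizer."""
    return model.predict(np.concatenate([acoustic, lexical, disnv]))

def decision_level(unimodal_outputs):
    """DL fusion: combine the final outputs of the unimodal recognizers."""
    return np.mean(unimodal_outputs, axis=0)

def hierarchical(acoustic, lexical, disnv, lower, upper):
    """HL sketch: short-time acoustic evidence enters at the lower level;
    more abstract lexical and DIS-NV evidence joins at the upper level."""
    h = lower.predict(acoustic)          # intermediate representation
    return upper.predict(np.concatenate([h, lexical, disnv]))
```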
118
A model for facial emotion inference based on planar dynamic emotional surfaces
Ruivo, João Pedro Prospero, 21 November 2017
Emotions have a direct influence on human life and are of great importance in relationships and in the way interactions between individuals develop. Because of this, they are also important for the development of human-machine interfaces that aim to maintain natural and friendly interactions with their users. In the development of social robots, which this work aims at, a suitable interpretation of the emotional state of the person interacting with the robot is indispensable. The focus of this work is the development of a mathematical model for recognizing emotional facial expressions in a sequence of frames. Firstly, a face-tracking algorithm is used to find and keep track of a human face in images; then descriptive features are extracted from the face and fed into the emotional state recognition model developed in this work, which consists of an instantaneous emotion classifier, a Kalman filter, and a dynamic emotion classifier that gives the final output of the model. The model is optimized via a simulated annealing algorithm and evaluated on relevant datasets, with its performance measured for each of the considered emotional states.
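The Kalman filter between the instantaneous and dynamic classifiers smooths noisy per-frame scores over time. A minimal scalar sketch is below; the noise variances q and r are assumed values, not the thesis's tuned parameters.

```python
def kalman_smooth(scores, q=1e-3, r=1e-1):
    """Scalar Kalman filter smoothing a noisy per-frame emotion score;
    q and r are the assumed process and measurement noise variances."""
    x, p = scores[0], 1.0           # initial state estimate and variance
    smoothed = [x]
    for z in scores[1:]:
        p = p + q                   # predict step: uncertainty grows
        k = p / (p + r)             # Kalman gain
        x = x + k * (z - x)         # correct with the new measurement
        p = (1.0 - k) * p
        smoothed.append(x)
    return smoothed

# e.g. kalman_smooth([0.2, 0.8, 0.3, 0.9]) damps frame-to-frame jitter
```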
119
The Effect of a Social Communication Intervention on the Correct Production of Emotion Words in Children with Language Impairment
Luddington, Annelise, 01 June 2018
Children diagnosed with Language Impairment (LI) often have difficulty with aspects of social communication. This thesis evaluates the effects of a social communication intervention focused on facilitating the correct production of emotion words in four elementary-school-aged children with LI. Researchers monitored changes from pretreatment baseline data, through the intervention, to posttreatment follow-up data for the emotions happiness, surprise, fear, anger, sadness, and disgust. Based on baseline measures, emotion categories in which the child showed limited proficiency were targeted across the 20 intervention sessions; the emotions targeted were different for each child. Each intervention session combined storybook therapeutic strategies such as story enactment, story sharing, and modeling by the clinician to help increase the child's emotion understanding, and the child participated in emotion recognition and emotion inferencing tasks. The data for each participant were analyzed individually and presented in figures. The analysis used the percentage of non-overlapping data (PND), which provided insight into how successful the intervention was for each of the targeted emotions. The results for each child's emotion words varied, with some participants making good progress and others showing little or no gains. These results suggest that the intervention was effective for some of the children and should continue to be refined.
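The PND metric used in the analysis is simple to compute; a minimal sketch with hypothetical data:

```python
def pnd(baseline, treatment):
    """Percentage of non-overlapping data: the share of treatment-phase
    points that exceed the highest baseline point (for an expected increase)."""
    ceiling = max(baseline)
    return 100.0 * sum(t > ceiling for t in treatment) / len(treatment)

# e.g. hypothetical correct-production percentages per session
print(pnd([20, 30, 25], [40, 35, 28, 50]))  # -> 75.0
```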
120
Development of Body Emotion Perception in Infancy: From Discrimination to Recognition
Heck, Alison; Chroust, Alyson; White, Hannah; Jubran, Rachel; Bhatt, Ramesh S., 01 February 2018
Research suggests that infants progress from discrimination to recognition of emotions in faces during the first half year of life. It is unclear whether the perception of emotions from bodies develops in a similar manner. In the current study, when presented with happy and angry body videos and voices, 5-month-olds looked longer at the matching video when the videos were presented upright but not when they were inverted. In contrast, 3.5-month-olds failed to match even with upright videos. Thus, 5-month-olds but not 3.5-month-olds exhibited evidence of recognition of emotions from bodies by demonstrating intermodal matching. In a subsequent experiment, younger infants did discriminate between body emotion videos but failed to exhibit an inversion effect, suggesting that their discrimination may be based on low-level stimulus features. These results document a developmental change from discrimination based on non-emotional information at 3.5 months to recognition of body emotions at 5 months. This pattern of development is similar to that of face emotion knowledge and suggests that both the face and body emotion perception systems develop rapidly during the first half year of life.