21

Towards Affective Vision and Language

Haydarov, Kilichbek 30 November 2021 (has links)
Developing intelligent systems that can recognize and express human affect is essential to bridging the gap between human and artificial intelligence. This thesis explores the creative and emotional frontiers of artificial intelligence. Specifically, we investigate the relation between the affective impact of visual stimuli and natural language by collecting and analyzing a new dataset called ArtEmis. Capitalizing on this dataset, we demonstrate affective AI models that can talk emotionally about artworks and generate artworks from affective descriptions. For the text-to-image generation task, we present HyperCGAN, a conceptually simple and general approach for text-to-image synthesis that uses hypernetworks to condition a GAN model on text. In our setting, the generator and discriminator weights are controlled by their corresponding hypernetworks, which modulate the weight parameters based on the provided text query. We explore different mechanisms to modulate the layers depending on the underlying architecture of the target network and the structure of the conditioning variable.
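A minimal sketch of the hypernetwork-conditioning idea described in the abstract, assuming a PyTorch setting: a small hypernetwork maps a text embedding to per-channel scales that modulate a generator layer's output. The layer sizes, the 256-d text embedding, and the scale-only modulation are illustrative assumptions, not the HyperCGAN implementation.

```python
import torch
import torch.nn as nn

class HyperModulatedConv(nn.Module):
    def __init__(self, in_ch, out_ch, text_dim):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        # Hypernetwork: text embedding -> per-output-channel weight scales.
        self.hyper = nn.Sequential(
            nn.Linear(text_dim, 128), nn.ReLU(),
            nn.Linear(128, out_ch),
        )

    def forward(self, x, text_emb):
        scale = self.hyper(text_emb).view(-1, self.conv.out_channels, 1, 1)
        # Modulate the layer's output channels according to the text query.
        return self.conv(x) * (1.0 + scale)

# Usage: one generator block conditioned on a 256-d text embedding.
block = HyperModulatedConv(64, 64, text_dim=256)
feats = torch.randn(2, 64, 32, 32)
text = torch.randn(2, 256)
out = block(feats, text)   # shape: (2, 64, 32, 32)
```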
22

Hjälpmedel för att tydliggöra känslor hos personer med AST / Assistive tools for clarifying emotions in people with ASD

Yngström, Karl January 2018 (has links)
Smartphones are powerful communication tools that enable communication in new ways, and research on using technology to assist people with various mental disabilities is a growing field. Autism Spectrum Disorder (ASD) is one such disability; it manifests differently in different people, but one recurring trait is difficulty understanding emotion. Measuring emotion is not easily done, and for some time research into emotion has been overlooked in favor of more logical thought processes. This thesis uses Russell's model of emotion and core affect, which maps emotion onto two crossed axes: activation and valence (positive - negative). The purpose of the study is to evaluate various methods for measuring and registering emotion in people with ASD in a simple, cheap, and accessible way. This is done using existing models of emotion, with a smartphone as the tool, and is meant to help people with ASD, and the people around them, in daily life.
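A minimal sketch of how a smartphone app might record a rating on Russell's two-axis model of core affect described above. The -1..1 scale and the quadrant labels are illustrative assumptions, not the thesis design.

```python
from dataclasses import dataclass

@dataclass
class CoreAffectRating:
    valence: float     # negative (-1.0) .. positive (+1.0)
    activation: float  # calm (-1.0) .. activated (+1.0)

    def quadrant(self) -> str:
        # Map the 2D rating to one of the four circumplex quadrants.
        if self.valence >= 0:
            return "excited/elated" if self.activation >= 0 else "calm/content"
        return "tense/distressed" if self.activation >= 0 else "sad/fatigued"

print(CoreAffectRating(valence=0.6, activation=-0.4).quadrant())  # calm/content
```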
23

Apprentissage neuronal profond pour l'analyse de contenus multimodaux et temporels / Deep learning for multimodal and temporal contents analysis

Vielzeuf, Valentin 19 November 2019 (has links)
Our perception is by nature multimodal, i.e. it appeals to several of our senses. To solve certain tasks, it is therefore relevant to use different modalities, such as sound or image. This thesis addresses this notion in the context of deep learning and seeks to answer one question in particular: how should the different modalities be fused within a deep neural network? We first study a concrete application: automatic emotion recognition in audio-visual content. This leads to several considerations on the modeling of emotions, and more particularly of facial expressions; we propose an analysis of the facial-expression representations learned by a deep neural network. We also observe that each multimodal problem seems to require a different fusion strategy. We therefore propose and validate two methods for automatically obtaining an efficient fusion architecture for a given multimodal problem. The first is based on a central fusion network and aims to preserve an easy interpretation of the adopted fusion strategy, while the second adapts a neural architecture search method to the fusion setting, exploring a larger number of strategies and thereby achieving better performance. Finally, we consider a multimodal view of knowledge transfer and detail a non-traditional method for transferring knowledge from several sources, i.e. several pre-trained models. A more general neural representation is obtained from a single model that brings together the knowledge contained in the pre-trained models, leading to state-of-the-art performance on a variety of face analysis tasks.
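A minimal sketch, in PyTorch, of the central-fusion idea described above: one small encoder per modality, a central network that merges them, and an emotion classifier. The feature dimensions and the single fusion point are illustrative assumptions, not the architectures evaluated in the thesis.

```python
import torch
import torch.nn as nn

class CentralFusionNet(nn.Module):
    def __init__(self, audio_dim=128, visual_dim=512, n_emotions=7):
        super().__init__()
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, 64), nn.ReLU())
        self.visual_enc = nn.Sequential(nn.Linear(visual_dim, 64), nn.ReLU())
        # Central fusion network: where and how to merge is exactly the design
        # choice that the thesis proposes to find automatically.
        self.fusion = nn.Sequential(nn.Linear(128, 64), nn.ReLU())
        self.classifier = nn.Linear(64, n_emotions)

    def forward(self, audio_feat, visual_feat):
        a = self.audio_enc(audio_feat)
        v = self.visual_enc(visual_feat)
        return self.classifier(self.fusion(torch.cat([a, v], dim=-1)))

logits = CentralFusionNet()(torch.randn(4, 128), torch.randn(4, 512))  # (4, 7)
```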
24

A Machine Learning Approach to EEG-Based Emotion Recognition

Jamil, Sara January 2018 (has links)
In recent years, emotion classification using electroencephalography (EEG) has attracted much attention with the rapid development of machine learning techniques and various applications of brain-computer interfacing. In this study, a general model for emotion recognition was created using a large dataset of 116 participants' EEG responses to happy and fearful videos. We compared discrete and dimensional emotion models, assessed various popular feature extraction methods, evaluated the efficacy of feature selection algorithms, and examined the performance of two classification algorithms. An average test accuracy of 76% was obtained using higher-order spectral features with a support vector machine for discrete emotion classification. An accuracy of up to 79% was achieved on the subset of classifiable participants. Finally, the stability of EEG patterns in emotion recognition was examined over time by evaluating consistency across sessions. / Thesis / Master of Science (MSc)
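A minimal scikit-learn sketch of the classification setup described above: per-trial EEG feature vectors fed to a support vector machine for binary happy-vs-fearful classification. The random features stand in for the higher-order spectral features; nothing here reproduces the study's actual pipeline or data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(232, 40))       # 232 trials x 40 features (placeholder values)
y = rng.integers(0, 2, size=232)     # 0 = fearful, 1 = happy

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```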
25

A Study of Methods in Computational Psychophysiology for Incorporating Implicit Affective Feedback in Intelligent Environments

Saha, Deba Pratim 01 August 2018 (has links)
Technological advancements in sensor miniaturization, processing power, and faster networks have broadened the scope of our contemporary computing infrastructure to the extent that Context-Aware Intelligent Environments (CAIEs)--physical spaces with computing systems embedded in them--are increasingly commonplace. With intelligent personal agents proliferating as close to us as our living rooms, there is a need to rethink the human-computer interface to accommodate some of their inherent properties, such as multiple foci of interaction with a dynamic set of devices, and limitations, such as the lack of a continuous, coherent medium of interaction. A CAIE provides context-aware services that aid in achieving the user's goals by inferring their instantaneous context. However, often due to an incomplete understanding of a user's context and goals, these services may be inappropriate or may at times even hinder the user's goals. Determining service appropriateness is therefore a critical step in implementing a reliable and robust CAIE. Explicitly querying the user to gather such feedback comes at the cost of the user's cognitive resources, in addition to defeating the purpose of designing a CAIE that provides automated services. The CAIE may, however, infer this appropriateness implicitly, by observing and sensing various behavioral cues and affective reactions from the user, thereby gathering such feedback seamlessly. In this dissertation, we study the design space for incorporating the user's affective reactions to intelligent services as a mode of implicit communication between the user and the CAIE. As a result, we introduce a framework named CAfFEINE, an acronym for Context-aware Affective Feedback in Engineering Intelligent Naturalistic Environments. The CAfFEINE framework encompasses models, methods, and algorithms establishing the validity of using a physiological-signal-based affective feedback loop to convey service appropriateness in a CAIE. In doing so, we identify methods for learning ground truth about an individual user's affective reactions and introduce a novel algorithm for estimating a physiological-signal-based quality metric for our inferences. To evaluate the models and methods presented in the CAfFEINE framework, we designed a set of experiments in laboratory mockups and a virtual-reality setup, providing context-aware services to users while collecting their physiological signals from wearable sensors. Our results provide empirical validation for the CAfFEINE framework and point towards guidelines for future research extending this idea. Overall, this dissertation contributes by highlighting the symbiotic nature of the subfields of Affective Computing and Context-Aware Computing and by identifying models, proposing methods, and designing algorithms that may help strengthen this relationship, making future intelligent environments more human-centric. / Ph. D. / Physical spaces containing intelligent computing agents have become increasingly commonplace. Systems that populate a physical space and provide intelligent services by inferring a user's immediate needs are called intelligent environments. With this widespread adoption of intelligent systems, there is a need to design computer interfaces that focus on the human user's responses.
For this service-delivery interaction to feel natural, these interfaces need to sense a user's disapproval of a wrong service without the user actively indicating it. Implicitly inferring a user's disapproval of a service, by observing and sensing various behavioral cues, will help the computing system cognitively disappear into the background. In this dissertation, we study the design space for incorporating the user's affective reactions to intelligent services as a mode of implicit communication between the user and the intelligent system. As a result, we introduce an interaction framework named CAfFEINE, an acronym for Context-aware Affective Feedback in Engineering Intelligent Naturalistic Environments. The CAfFEINE framework encompasses models, methods, and algorithms exploring the validity of using physiological-signal-based affective feedback in intelligent environments. To evaluate the models and algorithms, we designed a set of experimental protocols and conducted user studies in a virtual-reality setup. The results from these user studies demonstrate the feasibility of this idea, in addition to proposing new methods for evaluating the quality of the underlying physiological signals. Overall, this dissertation contributes by highlighting the symbiotic nature of the subfields of Affective Computing and Context-Aware Computing and by identifying models, proposing methods, and designing algorithms that may help strengthen this relationship, making future intelligent environments more human-centric.
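A heavily simplified sketch of the implicit feedback loop described above: after a service is delivered, a window of physiological samples is summarized and a classifier judges whether the user's reaction suggests the service was appropriate, with low-quality inferences discarded. All names, features, and thresholds are hypothetical illustrations, not the CAfFEINE algorithms.

```python
from statistics import mean

def service_appropriate(reaction_window, classifier, quality_threshold=0.5):
    """reaction_window: physiological samples captured just after a service is delivered."""
    features = {
        "mean_level": mean(reaction_window),
        "peak": max(reaction_window),
    }
    label, quality = classifier(features)   # hypothetical pre-trained model
    # Discard low-quality inferences rather than acting on noisy signals.
    return label if quality >= quality_threshold else None

# Example with a stand-in classifier that flags strong peaks as "inappropriate".
fake_clf = lambda f: ("inappropriate" if f["peak"] > 2.0 else "appropriate", 0.8)
print(service_appropriate([0.4, 0.5, 2.6, 1.1], fake_clf))
```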
26

Determining Correlation Between Video Stimulus and Electrodermal Activity

Tasooji, Reza 06 August 2018 (has links)
With the growth of wearable devices capable of measuring physiological signals, affective computing is becoming increasingly popular and may gradually reduce our reliance on purely cognitive approaches. One such physiological signal is the electrodermal activity (EDA) signal. We explore how video stimuli intended to arouse fear affect the EDA signal. To better understand the EDA signal, two different media, a scene from a movie and a scene from a video game, were selected to arouse fear. We conducted a user study with 20 participants, analyzed the differences between the two media, and propose a method capable of detecting the highlights of the stimulus using only EDA signals. The study results show no significant differences between the two media, except that users are more engaged with the content of the video game. From the gathered data, we propose a similarity measurement method for clustering users based on how similarly they reacted to the different highlights. The results show that, for a 300-second stimulus and a window size of 10 seconds, our approach for detecting highlights of the stimulus achieves a precision of 1.0 for both media, and F1 scores of 0.85 and 0.84 for the movie and the video game, respectively. / Master of Science / In this work, we explore different approaches to analyzing and clustering EDA signals. Two different media, a scene from a movie and a scene from a video game, were selected to arouse fear. Through a user study with 20 participants, we analyzed the differences between the two media and proposed a method capable of detecting highlights of the video clip using only EDA signals. The results show no significant differences between the two media, except that users are more engaged with the content of the video game. From the gathered data, we propose a similarity measurement method for clustering users based on how similarly they reacted to the different highlights.
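A minimal sketch of window-based highlight detection from an EDA trace, in the spirit of the approach described above: split a 300-second signal into 10-second windows and flag windows where the skin-conductance rise exceeds a threshold. The sampling rate, threshold, and synthetic signal are illustrative assumptions, not the thesis's method or data.

```python
import numpy as np

def detect_highlights(eda, fs=4, window_s=10, rise_threshold=0.3):
    """eda: 1-D skin-conductance samples; returns start times (s) of highlight windows."""
    win = fs * window_s
    highlights = []
    for start in range(0, len(eda) - win + 1, win):
        segment = eda[start:start + win]
        if segment.max() - segment.min() >= rise_threshold:
            highlights.append(start / fs)
    return highlights

rng = np.random.default_rng(1)
eda = rng.normal(2.0, 0.02, size=300 * 4)        # flat tonic baseline (toy data)
eda[480:520] += np.linspace(0, 0.8, 40)          # injected response around t = 120 s
print(detect_highlights(eda))                    # -> [120.0]
```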
27

Comments as reviews: Predicting answer acceptance by measuring sentiment on stack exchange

William Chase Ledbetter IV (12261440) 16 June 2023 (has links)
Online communication has increased the need to rapidly interpret complex emotions due to the volatility of the data involved; machine learning tasks that process text, such as sentiment analysis, can help address this challenge by automatically classifying text as positive, negative, or neutral. However, while much research has focused on detecting offensive or toxic language online, there is also a need to explore and understand the ways in which people express positive emotions and support for one another in online communities. This is where sentiment dictionaries and other computational methods can be useful: they can analyze the language used to express support and identify common patterns or themes.

This research was conducted by compiling data from social question answering around machine learning on the Stack Exchange site. A classification model was then constructed using binary logistic regression. The objective was to discover whether marked solutions can be accurately predicted by treating the comments as reviews. Measuring collaboration signals may help capture the nuances of language around support and assistance, which could have implications for how people understand and respond to expressions of help online. By exploring this topic further, researchers can gain a more complete understanding of the ways in which people communicate and connect online.
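A minimal scikit-learn sketch of the setup described above: each answer's comment thread is summarized as numeric sentiment features, and a binary logistic regression predicts whether the answer was marked as the accepted solution. The feature choices and synthetic data are illustrative assumptions, not the thesis's actual feature set.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
mean_sentiment = rng.uniform(-1, 1, n)     # averaged comment sentiment per answer
n_comments = rng.integers(0, 10, n)        # how much discussion the answer drew
# Toy labels: positively discussed answers are more likely to be accepted.
accepted = (mean_sentiment + 0.1 * n_comments + rng.normal(0, 0.5, n) > 0.5).astype(int)

X = np.column_stack([mean_sentiment, n_comments])
X_tr, X_te, y_tr, y_te = train_test_split(X, accepted, test_size=0.3, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))
```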
28

Automatic prediction of emotions induced by movies / Reconnaissance automatique des émotions induites par les films

Baveye, Yoann 12 November 2015 (has links)
Never before have movies been as easily accessible to viewers, who can enjoy anywhere their almost unlimited potential for inducing emotions. Knowing in advance the emotions that a movie is likely to elicit in its viewers could thus help improve the accuracy of content delivery, video indexing, or even summarization. However, transferring this expertise to computers is a complex task, due in part to the subjective nature of emotions. This thesis is dedicated to the automatic prediction of emotions induced by movies, based on the intrinsic properties of the audiovisual signal. To deal with this problem computationally, a video dataset annotated with the emotions induced in viewers is needed. However, existing datasets are either not public due to copyright issues or of very limited size and content diversity. To answer this specific need, this thesis presents the development of the LIRIS-ACCEDE dataset. The advantages of this dataset are threefold: (1) it is based on movies under Creative Commons licenses and can therefore be shared without infringing copyright; (2) it is composed of 9,800 good-quality video excerpts with large content diversity, extracted from 160 feature films and short films; and (3) the 9,800 excerpts have been ranked along the induced valence and arousal axes through a pair-wise video comparison protocol run on a crowdsourcing platform. The high inter-annotator agreement shows that the annotations are consistent despite the large diversity of the raters' cultural backgrounds. Three other experiments are also introduced in this thesis. First, affective ratings were collected for a subset of the LIRIS-ACCEDE dataset in order to cross-validate the crowdsourced annotations. These ratings also made it possible to learn a Gaussian Process regression model, accounting for measurement noise, to map the whole ranked LIRIS-ACCEDE dataset into the 2D valence-arousal affective space. Second, continuous ratings for 30 movies were collected in order to develop temporally relevant computational models. Finally, a last experiment was performed to collect continuous physiological measurements for the same 30 movies. The correlation between the physiological measurements and the continuous ratings strengthens the validity of the experimental results. Armed with this dataset, the thesis presents a computational model to infer the emotions induced by movies. The framework builds on recent advances in deep learning and takes into account the relationship between consecutive scenes. It is composed of two fine-tuned Convolutional Neural Networks: one dedicated to the visual modality, which takes as input crops of key frames extracted from video segments, and one dedicated to the audio modality, which uses audio spectrograms. The activations of the last fully connected layer of both networks are concatenated and fed to a Long Short-Term Memory recurrent neural network that learns the dependencies between consecutive video segments.
The performance obtained by the model is compared to that of a baseline similar to previous work and shows very promising results, while also reflecting the complexity of the task: the automatic prediction of emotions induced by movies remains a very challenging problem that is far from being solved.
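A minimal PyTorch sketch of the architecture described above: a visual branch and an audio branch each produce a per-segment embedding, the embeddings are concatenated, and an LSTM models dependencies across consecutive segments before predicting valence and arousal. The tiny convolutional stacks stand in for the fine-tuned CNNs; all sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AVEmotionModel(nn.Module):
    def __init__(self, feat_dim=128, hidden=64):
        super().__init__()
        # Stand-in for the fine-tuned CNN on key-frame crops.
        self.visual_cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, feat_dim))
        # Stand-in for the CNN on audio spectrograms.
        self.audio_cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, feat_dim))
        self.lstm = nn.LSTM(2 * feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)   # valence, arousal per segment

    def forward(self, frames, spectrograms):
        # frames: (B, T, 3, H, W); spectrograms: (B, T, 1, F, S)
        B, T = frames.shape[:2]
        v = self.visual_cnn(frames.flatten(0, 1)).view(B, T, -1)
        a = self.audio_cnn(spectrograms.flatten(0, 1)).view(B, T, -1)
        seq, _ = self.lstm(torch.cat([v, a], dim=-1))
        return self.head(seq)

pred = AVEmotionModel()(torch.randn(2, 5, 3, 64, 64), torch.randn(2, 5, 1, 128, 64))
```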
29

Emotional recognition in computing

Axelrod, Lesley Ann January 2010 (has links)
Emotions are fundamental to human lives and decision-making. Understanding and expression of emotional feeling between people forms an intricate web. This complex interactional phenomenon is a hot topic for research, as new techniques such as brain imaging give us insights into how emotions are tied to human functions. Communication of emotions is mixed with communication of other types of information (such as factual details), and emotions can be consciously or unconsciously displayed. Affective computer systems, using sensors for emotion recognition and able to make emotive responses, are under development. The increased potential for emotional interaction with products and services, in many domains, is generating much interest. Emotionally enhanced systems have the potential to improve human-computer interaction and so to improve how systems are used and what they can deliver. They may also have adverse implications, such as creating systems capable of emotional manipulation of users. Affective systems are in their infancy and lack human complexity and capability. This makes it difficult to assess whether human interaction with such systems will actually prove beneficial or desirable to users. Using an experimental design, a Wizard of Oz methodology, and a game that appeared to respond to the user's emotional signals with human-like capability, I tested user experience of and reactions to a system that appeared affective. To assess users' behaviour, I developed a novel affective behaviour coding system called 'affectemes'. I found significant gains in user satisfaction and performance when using an affective system. Those who believed the system responded to emotional signals blinked more frequently. If the machine failed to respond to their emotional signals, they increased their efforts to convey emotion, which might be an attempt to 'repair' the interaction. This work highlights how complex and difficult it is to design and evaluate affective systems. I identify many issues for future work, including the unconscious nature of emotions and how they are recognised and displayed with affective systems; the power of emotionally interactive systems and their evaluation; and critical ethical issues. These are important considerations for the future design of systems that use emotion recognition in computing.
30

The Affective PDF Reader

Radits, Markus January 2010 (has links)
The Affective PDF Reader is a PDF reader combined with affect recognition systems. The aim of the project is to research a way to provide the reader of a PDF with real-time visual feedback while reading the text, in order to influence the reading experience in a positive way. The visual feedback is given in accordance with the analyzed emotional state of the person reading the text; this is done by capturing and interpreting affective information with a facial expression recognition system. Further enhancements would also include analysis of voice as well as gaze-tracking software, so that the point of gaze can be used when rendering the visualizations. The idea of the Affective PDF Reader arose mainly from admitting that the way we read text on computers, mostly with frozen and dozed-off faces, is somehow unsatisfactory, or moreover a lonesome process and a poor form of communication. This work is also inspired by the significant progress and effort in recognizing emotional states from video and audio signals and the new possibilities that arise from it. The prototype system provided visualizations of footprints in different shapes and colours, controlled by captured facial expressions, to enrich the textual content with affective information. The experience showed that visual feedback controlled by facial expressions can bring another dimension to the reading experience if it is done in a frugal and non-intrusive way, and that user involvement can be enhanced.
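A minimal sketch of the feedback mapping described above: a detected facial expression drives the shape and colour of the footprint visualization rendered alongside the text. The expression labels and the mapping itself are hypothetical illustrations, not the prototype's actual rules.

```python
FOOTPRINT_STYLE = {
    "smile":    {"shape": "round",  "colour": "warm yellow"},
    "frown":    {"shape": "jagged", "colour": "dark blue"},
    "surprise": {"shape": "wide",   "colour": "bright orange"},
    "neutral":  {"shape": "faint",  "colour": "light grey"},
}

def render_feedback(expression: str) -> dict:
    # Fall back to an unobtrusive style when the expression is unknown,
    # keeping the feedback frugal and non-intrusive.
    return FOOTPRINT_STYLE.get(expression, FOOTPRINT_STYLE["neutral"])

print(render_feedback("smile"))
```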
