31

Visual speech synthesis by learning joint probabilistic models of audio and video

Deena, Salil Prashant January 2012 (has links)
Visual speech synthesis deals with synthesising facial animation from an audio representation of speech. In the last decade or so, data-driven approaches have gained prominence with the development of machine learning techniques that can learn an audio-visual mapping. Many of these approaches learn a generative model of speech production using the framework of probabilistic graphical models, through which efficient inference algorithms can be developed for synthesis. In this work, the audio and visual parameters are assumed to be generated from an underlying latent space that captures the shared information between the two modalities. These latent points evolve through time according to a dynamical mapping, and there are mappings from the latent points to the audio and visual spaces respectively. The mappings are modelled using Gaussian processes, which are non-parametric models that can represent a distribution over non-linear functions. The result is a non-linear state-space model. The state-space model is not, however, a very accurate generative model of speech production, because it assumes a single dynamical model, whereas it is well known that speech involves multiple dynamics (e.g. different syllables) that are generally non-linear. To cater for this, the state-space model can be augmented with switching states to represent the multiple dynamics, giving a switching state-space model. A key problem is how to infer the switching states so as to model the multiple non-linear dynamics of speech, which we address by learning a variable-order Markov model on a discrete representation of audio speech. Various synthesis methods for predicting visual from audio speech are proposed for both the state-space and switching state-space models. Quantitative evaluation, using error and correlation metrics between ground-truth and synthetic features, compares our proposed methods with other probabilistic models previously applied to the problem. Furthermore, qualitative evaluation with human participants has been conducted to evaluate the realism, perceptual characteristics and intelligibility of the synthesised animations. The results are encouraging and demonstrate that, with a joint probabilistic model of audio and visual speech that caters for the non-linearities in the audio-visual mapping, realistic visual speech can be synthesised from audio speech.
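As a rough illustration (notation ours, not taken from the thesis), the generative structure described in this abstract can be written as a non-linear state-space model in which a latent trajectory x_t drives both the audio parameters a_t and the visual parameters v_t, with each mapping given a Gaussian-process prior:

    \begin{aligned}
    x_t &= f(x_{t-1}) + \epsilon_t, & f &\sim \mathcal{GP}(0, k_x), \\
    a_t &= g(x_t) + \nu_t,          & g &\sim \mathcal{GP}(0, k_a), \\
    v_t &= h(x_t) + \eta_t,         & h &\sim \mathcal{GP}(0, k_v),
    \end{aligned}

where \epsilon_t, \nu_t and \eta_t are Gaussian noise terms. In the switching variant, a discrete state s_t selects which of several dynamical mappings f^{(s_t)} is active at time t, which is how the multiple dynamics of speech are represented.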
32

A facial animation model for expressive audio-visual speech

Somasundaram, Arunachalam 21 September 2006 (has links)
No description available.
33

Computer facial animation for sign language visualization

Barker, Dean 04 1900 (has links)
Thesis (MSc)--University of Stellenbosch, 2005. / ENGLISH ABSTRACT: Sign Language is a fully-fledged natural language possessing its own syntax and grammar, a fact which implies that the problem of machine translation from a spoken source language to Sign Language is at least as difficult as machine translation between two spoken languages. Sign Language, however, is communicated in a modality fundamentally different from all spoken languages. Machine translation to Sign Language is therefore burdened not only by a mapping from one syntax and grammar to another, but also by a non-trivial transformation from one communicational modality to another. With regard to the computer visualization of Sign Language, what is required is a three-dimensional, temporally accurate visualization of signs, including both the manual and non-manual components, which can be viewed from arbitrary perspectives, making accurate understanding and imitation more feasible. Moreover, given that facial expressions and movements represent a fundamental basis for the majority of non-manual signs, any system concerned with the accurate visualization of Sign Language must rely heavily on a facial animation component capable of representing a well-defined set of emotional expressions as well as a set of arbitrary facial movements. This thesis investigates the development of such a computer facial animation system. We address the problem of delivering coordinated, temporally constrained facial animation sequences in an online environment using VRML. Furthermore, we investigate the animation, using a muscle-model process, of arbitrary three-dimensional facial models consisting of multiple aligned NURBS surfaces of varying refinement. Our results showed that this approach is capable of representing and manipulating high-fidelity three-dimensional facial models in such a manner that localized distortions of the models result in the recognizable and realistic display of human facial expressions, and that these facial expressions can be displayed in a coordinated, synchronous manner.
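As a much-simplified, hypothetical sketch of a vector-muscle deformation (inspired by muscle-model approaches in general rather than the thesis's specific formulation, and applied here to plain vertices rather than NURBS control points), each muscle pulls nearby mesh vertices towards its attachment point with a falloff based on distance from its insertion point:

    # Hypothetical sketch: a simplified vector-muscle deformation applied to mesh vertices.
    import numpy as np

    def apply_muscle(vertices: np.ndarray, attachment: np.ndarray, insertion: np.ndarray,
                     radius: float, contraction: float) -> np.ndarray:
        """Pull vertices near the insertion point towards the attachment point.

        vertices: (N, 3) array of mesh vertex positions; returns a displaced copy.
        contraction: 0 (relaxed) .. 1 (fully contracted); radius: zone of influence.
        """
        out = vertices.copy()
        pull_dir = attachment - insertion
        dists = np.linalg.norm(vertices - insertion, axis=1)
        influence = np.clip(1.0 - dists / radius, 0.0, 1.0)   # linear falloff with distance
        out += contraction * influence[:, None] * pull_dir    # displace along the muscle vector
        return out

    # Example: a small patch of vertices pulled by one smile-like muscle (values made up).
    verts = np.random.rand(100, 3)
    smiled = apply_muscle(verts, attachment=np.array([0.8, 0.9, 0.2]),
                          insertion=np.array([0.5, 0.5, 0.3]), radius=0.4, contraction=0.6)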
34

An affective personality for an embodied conversational agent

Xiao, He January 2006 (has links)
Curtin University's Embodied Conversational Agents (ECA) combine an MPEG-4 compliant Facial Animation Engine (FAE), a Text To Emotional Speech Synthesiser (TTES), and a multi-modal Dialogue Manager (DM) that accesses a Knowledge Base (KB) and outputs Virtual Human Markup Language (VHML) text, which drives the TTES and FAE. A user enters a question and an animated ECA responds with a believable and affective voice and actions. However, this response to the user is normally marked up in VHML by the KB developer to produce the required facial gestures and emotional display. A real person does not react by fixed rules but on the basis of personality, beliefs, previous experiences, and training. This thesis details the design, implementation and pilot-study evaluation of an Affective Personality Model for an ECA. The thesis discusses the Email Agent system, which informs a user when they have email. The system, built in Curtin's ECA environment, has personality traits of Friendliness, Extraversion and Neuroticism. A small group of participants evaluated the Email Agent system to determine the effectiveness of the implemented personality system. An analysis of the qualitative and quantitative results from questionnaires is presented.
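As a purely illustrative sketch (the trait names come from the abstract, but the scaling rules, tag syntax and numbers below are hypothetical rather than the thesis's actual model), a personality layer of this kind could bias the intensity of the emotion attached to a response before it is marked up for the speech synthesiser and facial animation engine:

    # Hypothetical sketch: scale a response's emotional intensity by personality traits.
    from dataclasses import dataclass

    @dataclass
    class Personality:
        friendliness: float  # 0.0 .. 1.0
        extraversion: float  # 0.0 .. 1.0
        neuroticism: float   # 0.0 .. 1.0

    def emotional_intensity(p: Personality, emotion: str, base: float) -> float:
        """Return a trait-adjusted intensity in [0, 1]; the rules here are illustrative only."""
        if emotion in ("happy", "surprised"):
            scale = 0.5 + 0.3 * p.friendliness + 0.2 * p.extraversion
        elif emotion in ("sad", "afraid", "angry"):
            scale = 0.4 + 0.6 * p.neuroticism
        else:
            scale = 0.5
        return max(0.0, min(1.0, base * scale))

    def mark_up(text: str, emotion: str, intensity: float) -> str:
        """Wrap the reply in an emotion tag; real VHML tags may differ from this made-up syntax."""
        return f'<{emotion} intensity="{intensity:.2f}">{text}</{emotion}>'

    agent = Personality(friendliness=0.9, extraversion=0.7, neuroticism=0.2)
    print(mark_up("You have three new messages.", "happy",
                  emotional_intensity(agent, "happy", base=0.8)))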
35

Bringing the avatar to life : Studies and developments in facial communication for virtual agents and robots

Al Moubayed, Samer January 2012 (has links)
The work presented in this thesis comes in pursuit of the ultimate goal of building spoken and embodied human-like interfaces that are able to interact with humans under human terms. Such interfaces need to employ the subtle, rich and multidimensional signals of communicative and social value that complement the stream of words – signals humans typically use when interacting with each other. The studies presented in the thesis concern facial signals used in spoken communication, and can be divided into two connected groups. The first is targeted towards exploring and verifying models of facial signals that come in synchrony with speech and its intonation. We refer to this as visual-prosody, and as part of visual-prosody, we take prominence as a case study. We show that the use of prosodically relevant gestures in animated faces results in a more expressive and human-like behaviour. We also show that animated faces supported with these gestures result in more intelligible speech, which in turn can be used to aid communication, for example in noisy environments. The other group of studies targets facial signals that complement speech. Spoken language is a relatively poor system for the communication of spatial information, since such information is visual in nature; hence, the use of visual movements of spatial value, such as gaze and head movements, is important for an efficient interaction. The use of such signals is especially important when the interaction between the human and the embodied agent is situated – that is, when they share the same physical space and this space is taken into account in the interaction. We study the perception, the modelling, and the interaction effects of gaze and head pose in regulating situated and multiparty spoken dialogues in two conditions. The first is the typical case where the animated face is displayed on flat surfaces, and the second where it is displayed on a physical three-dimensional model of a face. The results from the studies show that projecting the animated face onto a face-shaped mask results in an accurate perception of the direction of gaze that is generated by the avatar, and hence can allow for the use of these movements in multiparty spoken dialogue. Driven by these findings, the Furhat back-projected robot head is developed. Furhat employs state-of-the-art facial animation that is projected on a 3D printout of that face, and a neck to allow for head movements. Although the mask in Furhat is static, the fact that the animated face matches the design of the mask results in a physical face that is perceived to “move”. We present studies that show how this technique renders a more intelligible, human-like and expressive face. We further present experiments in which Furhat is used as a tool to investigate properties of facial signals in situated interaction. Furhat is built to study, implement, and verify models of situated and multiparty, multimodal Human-Machine spoken dialogue, a study that requires that the face is physically situated in the interaction environment rather than on a two-dimensional screen. It has also received much interest from several communities and has been showcased at several venues, including a robot exhibition at the London Science Museum. We present an evaluation study of Furhat at the exhibition where it interacted with several thousand persons in a multiparty conversation.
The analysis of the data from the setup further shows that Furhat can accurately regulate multiparty interaction using gaze and head movements.
36

MPEG-4 Facial Feature Point Editor / Editor för MPEG-4 "feature points"

Lundberg, Jonas January 2002 (has links)
The use of computer-animated interactive faces in film, TV and games is ever growing, with new application areas also emerging on the Internet and in mobile environments. Morph targets are one of the most popular methods to animate the face. Up until now, 3D artists had to design each morph target defined by the MPEG-4 standard by hand, a very monotonous and tedious task. With the newly developed method of Facial Motion Cloning [11], the artists are relieved of much of this heavy work: the morph targets can now be copied from an already animated face model onto a new static face model. For the Facial Motion Cloning process, a subset of the feature points specified by the MPEG-4 standard must be defined, in order to correlate the facial features of the two faces. The goal of this project is to develop a graphical editor in which the artists can define the feature points for a face model. The feature points are saved in a file format that can be used in Facial Motion Cloning software.
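As an illustrative sketch only (the data layout and the file format below are assumptions, not the editor's actual format), such an editor essentially records a mapping from MPEG-4 feature-point labels to user-selected vertices on the face mesh and writes it out for the cloning step:

    # Hypothetical sketch of storing user-selected MPEG-4 feature points for a face model.
    from dataclasses import dataclass

    @dataclass
    class FeaturePoint:
        label: str        # MPEG-4 feature point label, e.g. "2.2" or "8.1"
        vertex_id: int    # index of the selected vertex in the face mesh
        position: tuple   # (x, y, z) of that vertex, stored for convenience

    def save_feature_points(points: list[FeaturePoint], path: str) -> None:
        """Write one feature point per line: label, vertex index and coordinates (made-up format)."""
        with open(path, "w", encoding="utf-8") as f:
            for p in points:
                x, y, z = p.position
                f.write(f"{p.label} {p.vertex_id} {x:.6f} {y:.6f} {z:.6f}\n")

    points = [
        FeaturePoint("2.2", 1045, (0.01, -0.32, 0.11)),
        FeaturePoint("8.1", 2210, (0.00, -0.28, 0.13)),
    ]
    save_feature_points(points, "face_model.fp")  # hypothetical file name and extension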
38

基於形態轉換的多種表情卡通肖像 / Automatic generation of caricatures with multiple expressions using transformative approach

賴建安, Lai, Chien An Unknown Date (has links)
As the acquisition of digital images becomes more convenient, diversified applications of image collections have surfaced at a rapid pace. Not only have we witnessed the popularity of photo-sharing platforms, we have also seen strong demand in recent years for novel mechanisms that offer personalized and creative entertainment. In this thesis, we propose and implement a personal caricature generator using transformative approaches. By combining facial feature detection, image segmentation and image warping/morphing techniques, the system is able to generate a stylized caricature using only one reference image. The system can also produce multiple expressions by controlling the MPEG-4 facial animation parameters (FAP). Specifically, by referencing various pre-drawn caricatures in our database, as well as feature points for mesh creation, personalized caricatures are automatically generated from real photos using either rotoscoping or transformative approaches. The resulting caricature can be further modified to exhibit multiple facial expressions. Important issues regarding color reduction and vectorized representation of the caricature are also discussed in this thesis.
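As a rough, hypothetical illustration of the warping step described above (the scikit-image calls are real, but the landmarks and exaggeration factor are made up and face detection is omitted), a photograph can be pushed towards a caricature by moving facial feature points to exaggerated positions and applying a piecewise-affine warp:

    # Hypothetical sketch: exaggerate facial landmarks and warp the photo accordingly.
    import numpy as np
    from skimage import data, transform

    image = data.astronaut()                      # stand-in for the input photograph
    h, w = image.shape[:2]

    # Illustrative "detected" landmarks as (x, y); a real system would use a feature detector.
    src = np.array([[220, 180], [300, 180], [260, 250], [260, 320]], dtype=float)

    # Exaggerate the features by pushing the landmarks away from their centroid.
    centroid = src.mean(axis=0)
    dst = centroid + 1.3 * (src - centroid)

    # Anchor the image corners so the warp is defined over the whole image.
    corners = np.array([[0, 0], [w - 1, 0], [0, h - 1], [w - 1, h - 1]], dtype=float)
    src_all = np.vstack([src, corners])
    dst_all = np.vstack([dst, corners])

    # warp() expects the mapping from output coordinates to input coordinates,
    # so the transform is estimated from the exaggerated points back to the originals.
    tform = transform.PiecewiseAffineTransform()
    tform.estimate(dst_all, src_all)
    caricature = transform.warp(image, tform, output_shape=(h, w))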
39

Marc : modèles informatiques des émotions et de leurs expressions faciales pour l’interaction Homme-machine affective temps réel / Marc : computational models of emotions and their facial expressions for real-time affective human-computer interaction

Courgeon, Matthieu 21 November 2011 (has links)
Emotions and their expressions by virtual characters are two important issues for future affective human-machine interfaces. Recent advances in the psychology of emotions as well as recent progress in computer graphics allow us to animate virtual characters that are capable of expressing emotions in a realistic way through various modalities. Existing virtual agent systems are often limited in terms of underlying emotional models, visual realism, and real-time interaction capabilities. In our research, we focus on virtual agents capable of expressing emotions through facial expressions while interacting with the user. Our work raises several issues: How can we design computational models of emotions inspired by the different approaches to emotion in psychology? What is the level of visual realism required for the agent to express emotions? How can we enable real-time interaction with a virtual agent? How can we evaluate the impact on the user of the emotions expressed by the virtual agent? Our work focuses on computational modeling of emotions inspired by psychological theories of emotion and on emotional facial expressions by a realistic virtual character. Facial expressions are known to be a privileged emotional communication modality. Our main goal is to contribute to the improvement of the interaction between a user and an expressive virtual agent. For this purpose, our research highlights the pros and cons of different approaches to emotions and different computer graphics techniques. We worked in two complementary directions. First, we explored different approaches to emotions (categorical, dimensional, cognitive, and social). For each of these approaches, a computational model has been designed together with a method for real-time facial animation. Our second line of research focuses on the contribution of visual realism and the level of graphic detail to the expressiveness of the agent. This axis is complementary to the first one, because a greater level of visual detail could contribute to a better expression of the complexity of the underlying computational model of emotion. Our work along these two lines was evaluated by several user-based perceptual studies. The combination of these two lines of research is seldom found in existing expressive virtual agent systems. Our work opens future directions for improving human-computer interaction based on expressive and interactive virtual agents. The software modules that we have designed are integrated into our platform MARC (Multimodal Affective and Reactive Characters). MARC has been used in various kinds of applications: games, ambient intelligence, virtual reality, therapeutic applications, performance art, etc.
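As a hypothetical illustration of the dimensional approach mentioned above (the anchor expressions, coordinates and weighting scheme are assumptions, not MARC's actual implementation), a dimensional emotion state such as a pleasure-arousal-dominance point can be turned into facial-expression blend weights by interpolating between a few anchor expressions placed in that space:

    # Hypothetical sketch: map a pleasure-arousal-dominance (PAD) point to expression blend weights.
    import numpy as np

    # Anchor expressions with illustrative PAD coordinates (values are made up).
    ANCHORS = {
        "joy":     np.array([ 0.8,  0.5,  0.4]),
        "anger":   np.array([-0.6,  0.6,  0.3]),
        "sadness": np.array([-0.6, -0.4, -0.4]),
        "relaxed": np.array([ 0.6, -0.4,  0.2]),
    }

    def expression_weights(pad: np.ndarray, sharpness: float = 4.0) -> dict:
        """Blend anchor expressions by proximity in PAD space (softmax over negative distances)."""
        names = list(ANCHORS)
        dists = np.array([np.linalg.norm(pad - ANCHORS[n]) for n in names])
        scores = np.exp(-sharpness * dists)
        weights = scores / scores.sum()
        return dict(zip(names, weights))

    # A mildly pleasant, low-arousal state should lean towards "relaxed" and "joy".
    print(expression_weights(np.array([0.5, -0.2, 0.1])))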
40

Paramétrisation et transfert d’animations faciales 3D à partir de séquences vidéo : vers des applications en temps réel / Rigging and retargetting of 3D facial animations from video : towards real-time applications

Dutreve, Ludovic 24 March 2011 (has links)
Facial animation is one of the key aspects of realism in 3D scenes featuring virtual humans. This is due to several reasons: the face and the many muscles that compose it can generate a multitude of expressions, and our faculty of perception gives us a great ability to detect and analyze its smallest variations. This complexity is reflected in existing approaches by the fact that it is very difficult to create a high-quality animation without long and tedious manual work. Based on these observations, this thesis aims to develop techniques that contribute to the process of creating facial animation. Three main themes are addressed. The first concerns the rigging of a virtual 3D face for animation; rigging aims at defining control parameters in order to deform and animate the face. The second deals with animation, especially the animation retargeting problem: the goal is to propose a method to animate a character's face from various data, which can be obtained from a motion capture system or from an existing 3D facial animation. Finally, we focus on fine-scale animation details such as wrinkles. Although these are thin and subtle, their deformations play an important part in the perception and analysis of emotions. We therefore propose a monocular acquisition technique and a reference-pose-based method to synthesize fine animation details dynamically over the face. The purpose is to propose methods that facilitate and improve the process of creating realistic facial animations for interactive applications, with particular attention to ease of use and the real-time constraint. Moreover, we offer the user or graphic artist the possibility to interact in order to personalize their creation and/or improve the results.
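As a loose, hypothetical sketch of the reference-pose idea (the weighting scheme and data layout are assumptions, not the thesis's formulation), fine details such as wrinkles can be driven at run time by blending per-pose detail intensities according to how close the current rig configuration is to each reference pose:

    # Hypothetical sketch: blend wrinkle-detail intensities from reference poses at run time.
    import numpy as np

    # Each reference pose pairs a rig-parameter vector with the wrinkle intensity map captured for it.
    # (The "intensity maps" are tiny arrays here; in practice they would be textures.)
    REF_POSES = [
        {"rig": np.array([0.0, 0.0, 0.0]), "wrinkles": np.zeros((4, 4))},      # neutral
        {"rig": np.array([1.0, 0.2, 0.0]), "wrinkles": np.full((4, 4), 0.8)},  # brow raise
        {"rig": np.array([0.0, 0.9, 0.7]), "wrinkles": np.full((4, 4), 0.5)},  # smile
    ]

    def blend_wrinkles(current_rig: np.ndarray, falloff: float = 2.0) -> np.ndarray:
        """Weight each reference pose by its closeness to the current rig parameters, then blend."""
        dists = np.array([np.linalg.norm(current_rig - p["rig"]) for p in REF_POSES])
        weights = np.exp(-falloff * dists)
        weights /= weights.sum()
        return sum(w * p["wrinkles"] for w, p in zip(weights, REF_POSES))

    # Each frame, the animation system queries the blended wrinkle map for the current pose.
    frame_rig = np.array([0.7, 0.3, 0.1])
    wrinkle_map = blend_wrinkles(frame_rig)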
