31

Morphable 3D Facial Animation Based on Thin Plate Splines

Erdogdu, Aysu 01 May 2010 (has links) (PDF)
The aim of this study is to present a novel three-dimensional (3D) facial animation method for morphing emotions and facial expressions from one face model to another. For this purpose, smooth and realistic face models were animated with thin plate splines (TPS). Neutral face models were animated and compared with the actual expressive face models. Neutral and expressive face models were obtained from subjects via a 3D face scanner. The face models were preprocessed for pose and size normalization. Then muscle and wrinkle control points were placed on the source face with a neutral expression according to human anatomy. The Facial Action Coding System (FACS) was used to determine the control points and the face regions in the underlying model. The final positions of the control points after a facial expression were obtained from the expressive scan data of the source face. Afterwards, the control points were transferred to the target face using the facial landmarks and TPS as the morphing function. Finally, the neutral target face was animated with the control points by TPS. In order to visualize the method, face scans with expressions composed of a selected subset of action units found in the Bosphorus Database were used. Five lower-face and three upper-face action units were simulated in this study. For the experimental results, the facial expressions were created on the 3D neutral face scan data of a human subject, and the synthetic faces were compared to the subject's actual 3D scan data with the same facial expressions taken from the dataset.
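The control-point-driven TPS deformation described above can be sketched with SciPy's `RBFInterpolator` using a thin-plate-spline kernel. This is a minimal illustration with made-up landmark coordinates standing in for the muscle and wrinkle control points; it is not the thesis's actual implementation.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Hypothetical control points on a neutral source face (N x 3)
neutral_pts = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.2],
    [0.5, 0.5, 0.1],
])

# The same control points after an expression (e.g. from an expressive scan)
expressive_pts = neutral_pts + np.array([
    [0.0, -0.10, 0.05],
    [0.0, -0.10, 0.05],
    [0.0,  0.00, 0.00],
    [0.0,  0.00, 0.00],
    [0.0, -0.05, 0.10],
])

# TPS interpolates the control-point displacements exactly and
# extends them smoothly to every other vertex of the mesh
tps = RBFInterpolator(neutral_pts, expressive_pts,
                      kernel='thin_plate_spline')

# Deform arbitrary mesh vertices of the neutral face
mesh_vertices = np.array([[0.5, 0.0, 0.0], [0.5, 1.0, 0.1]])
deformed = tps(mesh_vertices)
```

The same mechanism serves both roles in the abstract: transferring control points between faces (fit on landmark correspondences) and animating the target mesh (fit on control-point displacements).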
32

Verification, Validation and Evaluation of the Virtual Human Markup Language (VHML) / Verifiering, validering och utvärdering av Virtual Human Markup Language (VHML)

Gustavsson, Camilla, Strindlund, Linda, Wiknertz, Emma January 2002 (has links)
Human communication is inherently multimodal. The information conveyed through body language, facial expression, gaze, intonation, speaking style, etc. forms an important component of everyday communication. An issue within computer science concerns how to provide multimodal agent-based systems: systems that interact with users through several channels. These systems can include Virtual Humans. A Virtual Human might, for example, be a complete creature, i.e. a creature with a whole body including head, arms, legs, etc., but it might also be a creature with only a head, a Talking Head. The aim of the Virtual Human Markup Language (VHML) is to control Virtual Humans with regard to speech, facial animation, facial gestures and body animation. These parts had previously been implemented and investigated separately, but VHML aims to combine them. In this thesis VHML is verified, validated and evaluated in order to reach that aim, and thus VHML is made more solid, homogeneous and complete. Further, a Virtual Human has to communicate with the user, and even though VHML supports a number of other ways of communication, an important communication channel is speech. The Virtual Human has to be able to interact with the user; therefore a dialogue between the user and the Virtual Human has to be created. These dialogues tend to expand tremendously, hence the Dialogue Management Tool (DMT) was developed. Having a tool makes it easier for programmers to create and maintain dialogues for the interaction. Finally, in order to demonstrate the work done in this thesis, a Talking Head application, The Mystery at West Bay Hospital, has been developed and evaluated. This has shown the usefulness of the DMT when creating dialogues. The work accomplished within this project has contributed to simplifying the development of Talking Head applications.
33

Simulation Of Turkish Lip Motion And Facial Expressions In A 3D Environment And Synchronization With A Turkish Speech Engine

Akagunduz, Erdem 01 January 2004 (has links) (PDF)
In this thesis, 3D animation of human facial expressions and lip motion, and their synchronization with a Turkish speech engine using the Java programming language, the Java3D API and the Java Speech API, is analyzed. A three-dimensional animation model for simulating Turkish lip motion and facial expressions is developed. In addition to lip motion, synchronization with a Turkish speech engine is achieved. The output of the study is facial expressions and Turkish lip motion synchronized with Turkish speech, where the input is Turkish text in Java Speech Markup Language (JSML) format, also indicating expressions. Unlike many other languages, in Turkish, words are easily broken up into syllables. This property of the Turkish language allows a simple method for mapping letters to Turkish visual phonemes. In this method, a total of 37 face models is used to represent the Turkish visual phonemes, and these letters are mapped to 3D facial models considering the syllable structures. The animation is created using the Java3D API. 3D facial models corresponding to different lip positions of the same person are morphed into each other to construct the animation. Moreover, simulations of human facial expressions of emotion are created within the animation. An expression weight parameter, which states the weight of the given expression, is introduced. The synchronization of lip motion with Turkish speech is achieved via CloudGarden®'s Java Speech API interface. Finally, a virtual Turkish speaker with facial expressions of emotion is created for the Java3D animation.
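The syllable split the abstract relies on can be sketched with a simple rule: every Turkish syllable contains exactly one vowel, and a boundary falls before a consonant that is followed by a vowel. The sketch below is illustrative only; the thesis's actual letter-to-viseme tables and 37 face models are not reproduced here.

```python
# Turkish vowels (Turkish orthography is essentially phonemic,
# so letters map closely to visual phonemes)
VOWELS = set("aeıioöuü")

def syllabify(word: str):
    """Split a Turkish word into syllables: each syllable carries one
    vowel, with a boundary before a consonant that precedes a vowel."""
    syllables, current = [], ""
    for i, ch in enumerate(word):
        boundary = (current
                    and any(c in VOWELS for c in current)  # syllable has its vowel
                    and ch not in VOWELS                   # ch is a consonant
                    and i + 1 < len(word)
                    and word[i + 1] in VOWELS)             # ...that opens a new syllable
        if boundary:
            syllables.append(current)
            current = ch
        else:
            current += ch
    if current:
        syllables.append(current)
    return syllables

print(syllabify("merhaba"))  # ['mer', 'ha', 'ba']
```

Once a word is syllabified, each letter in a syllable can be looked up in a letter-to-viseme table to select the corresponding face model for the animation frame.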
34

Facial modelling and animation trends in the new millennium: a survey

Radovan, Mauricio 11 1900 (has links)
M.Sc (Computer Science) / Facial modelling and animation is considered one of the most challenging areas in the animation world. Since Parke and Waters's (1996) comprehensive book, no major work encompassing the entire field of facial animation has been published. This thesis covers Parke and Waters's work, while also providing a survey of the developments in the field since 1996. The thesis describes, analyses, and compares (where applicable) the existing techniques and practices used to produce facial animation. Where applicable, related techniques are grouped in the same chapter and described in chronological fashion, outlining their differences, as well as their advantages and disadvantages. The thesis concludes with exploratory work towards a talking head for Northern Sotho. Facial animation and lip synchronisation of a fragment of Northern Sotho is done using software tools primarily designed for English. / Computing
35

Visual speech synthesis by learning joint probabilistic models of audio and video

Deena, Salil Prashant January 2012 (has links)
Visual speech synthesis deals with synthesising facial animation from an audio representation of speech. In the last decade or so, data-driven approaches have gained prominence with the development of machine learning techniques that can learn an audio-visual mapping. Many of these machine learning approaches learn a generative model of speech production using the framework of probabilistic graphical models, through which efficient inference algorithms can be developed for synthesis. In this work, the audio and visual parameters are assumed to be generated from an underlying latent space that captures the shared information between the two modalities. These latent points evolve through time according to a dynamical mapping, and there are mappings from the latent points to the audio and visual spaces respectively. The mappings are modelled using Gaussian processes, which are non-parametric models that can represent a distribution over non-linear functions. The result is a non-linear state-space model. It turns out that the state-space model is not a very accurate generative model of speech production because it assumes a single dynamical model, whereas it is well known that speech involves multiple dynamics (e.g. different syllables) that are generally non-linear. In order to cater for this, the state-space model can be augmented with switching states to represent the multiple dynamics, thus giving a switching state-space model. A key problem is how to infer the switching states so as to model the multiple non-linear dynamics of speech, which we address by learning a variable-order Markov model on a discrete representation of audio speech. Various synthesis methods for predicting visual from audio speech are proposed for both the state-space and switching state-space models.
Quantitative evaluation, involving the use of error and correlation metrics between ground truth and synthetic features, is used to evaluate our proposed method in comparison to other probabilistic models previously applied to the problem. Furthermore, qualitative evaluation with human participants has been conducted to evaluate the realism, perceptual characteristics and intelligibility of the synthesised animations. The results are encouraging and demonstrate that by having a joint probabilistic model of audio and visual speech that caters for the non-linearities in audio-visual mapping, realistic visual speech can be synthesised from audio speech.
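The core ingredient of this entry, a Gaussian-process mapping from a latent point to observed parameters, can be illustrated with plain GP regression on toy data (here via scikit-learn). The thesis's full model adds latent dynamics and switching states, which this sketch deliberately omits; the trajectories below are synthetic stand-ins, not speech data.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

# Toy stand-ins: a 1-D "latent" trajectory and 2-D "visual" parameters
X = np.linspace(0, 2 * np.pi, 40).reshape(-1, 1)
Y = np.column_stack([np.sin(X[:, 0]), np.cos(X[:, 0])])
Y += 0.01 * rng.standard_normal(Y.shape)

# A GP represents a distribution over non-linear latent -> visual mappings;
# the WhiteKernel absorbs observation noise
gp = GaussianProcessRegressor(
    kernel=RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-4),
    normalize_y=True)
gp.fit(X, Y)

# Synthesis step: predict visual parameters (with uncertainty) at a new
# latent point
X_new = np.array([[np.pi / 2]])
mean, std = gp.predict(X_new, return_std=True)
```

In the state-space setting the latent points themselves would be inferred and evolved by a dynamical mapping, with one such GP per regime in the switching variant.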
36

A facial animation model for expressive audio-visual speech

Somasundaram, Arunachalam 21 September 2006 (has links)
No description available.
37

Computer facial animation for sign language visualization

Barker, Dean 04 1900 (has links)
Thesis (MSc)--University of Stellenbosch, 2005. / ENGLISH ABSTRACT: Sign Language is a fully-fledged natural language possessing its own syntax and grammar, a fact which implies that the problem of machine translation from a spoken source language to Sign Language is at least as difficult as machine translation between two spoken languages. Sign Language, however, is communicated in a modality fundamentally different from that of all spoken languages. Machine translation to Sign Language is therefore burdened not only by a mapping from one syntax and grammar to another, but also by a non-trivial transformation from one communicational modality to another. With regard to the computer visualization of Sign Language, what is required is a three-dimensional, temporally accurate visualization of signs, including both the manual and non-manual components, which can be viewed from arbitrary perspectives, making accurate understanding and imitation more feasible. Moreover, given that facial expressions and movements represent a fundamental basis for the majority of non-manual signs, any system concerned with the accurate visualization of Sign Language must rely heavily on a facial animation component capable of representing a well-defined set of emotional expressions as well as a set of arbitrary facial movements. This thesis investigates the development of such a computer facial animation system. We address the problem of delivering coordinated, temporally constrained facial animation sequences in an online environment using VRML. Furthermore, we investigate the animation, using a muscle model process, of arbitrary three-dimensional facial models consisting of multiple aligned NURBS surfaces of varying refinement.
Our results showed that this approach is capable of representing and manipulating high-fidelity three-dimensional facial models in such a manner that localized distortions of the models result in the recognizable and realistic display of human facial expressions, and that these facial expressions can be displayed in a coordinated, synchronous manner.
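The muscle-model idea in this entry, localized distortions of the mesh driven by muscle vectors, can be sketched in a heavily simplified form: a single linear muscle with a cosine falloff, loosely after Waters. The thesis's actual formulation on aligned NURBS surfaces is more involved; every value below is made up for illustration.

```python
import numpy as np

def apply_linear_muscle(vertices, head, tail, contraction, falloff_radius):
    """Pull vertices toward the muscle's line of action (tail -> head),
    with a cosine falloff by distance from the muscle head, so the
    distortion stays localized. A simplified sketch, not the thesis's
    exact formulation."""
    direction = head - tail
    direction = direction / np.linalg.norm(direction)
    deformed = vertices.copy()
    for i, v in enumerate(vertices):
        d = np.linalg.norm(v - head)
        if d < falloff_radius:
            # weight is 1 at the muscle head and falls to 0 at the radius
            weight = 0.5 * (1.0 + np.cos(np.pi * d / falloff_radius))
            deformed[i] = v + contraction * weight * direction
    return deformed

# A vertex at the muscle head moves fully; a distant vertex is untouched
vertices = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]])
deformed = apply_linear_muscle(vertices,
                               head=np.array([0.0, 0.0, 0.0]),
                               tail=np.array([0.0, -1.0, 0.0]),
                               contraction=0.2, falloff_radius=1.0)
```

Combining several such muscles, each tied to a facial region, is what lets localized distortions add up to recognizable expressions.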
38

An affective personality for an embodied conversational agent

Xiao, He January 2006 (has links)
Curtin University's Embodied Conversational Agents (ECA) combine an MPEG-4 compliant Facial Animation Engine (FAE), a Text To Emotional Speech Synthesiser (TTES), and a multi-modal Dialogue Manager (DM), which accesses a Knowledge Base (KB) and outputs Virtual Human Markup Language (VHML) text that drives the TTES and FAE. A user enters a question and an animated ECA responds with a believable and affective voice and actions. However, this response to the user is normally marked up in VHML by the KB developer to produce the required facial gestures and emotional display. A real person does not react by fixed rules but according to personality, beliefs, previous experiences, and training. This thesis details the design, implementation and pilot-study evaluation of an Affective Personality Model for an ECA. The thesis discusses the Email Agent system, which informs a user when they have email. The system, built in Curtin's ECA environment, has personality traits of Friendliness, Extraversion and Neuroticism. A small group of participants evaluated the Email Agent system to determine the effectiveness of the implemented personality system. An analysis of the qualitative and quantitative results from questionnaires is presented.
39

Bringing the avatar to life: Studies and developments in facial communication for virtual agents and robots

Al Moubayed, Samer January 2012 (has links)
The work presented in this thesis pursues the ultimate goal of building spoken and embodied human-like interfaces that are able to interact with humans on human terms. Such interfaces need to employ the subtle, rich and multidimensional signals of communicative and social value that complement the stream of words – signals humans typically use when interacting with each other. The studies presented in the thesis concern facial signals used in spoken communication, and can be divided into two connected groups. The first is targeted towards exploring and verifying models of facial signals that come in synchrony with speech and its intonation. We refer to this as visual prosody, and as part of visual prosody, we take prominence as a case study. We show that the use of prosodically relevant gestures in animated faces results in more expressive and human-like behaviour. We also show that animated faces supported with these gestures produce more intelligible speech, which in turn can be used to aid communication, for example in noisy environments. The other group of studies targets facial signals that complement speech. Spoken language is a relatively poor system for the communication of spatial information, since such information is visual in nature. Hence, the use of visual movements of spatial value, such as gaze and head movements, is important for an efficient interaction. The use of such signals is especially important when the interaction between the human and the embodied agent is situated – that is, when they share the same physical space and this space is taken into account in the interaction. We study the perception, the modelling, and the interaction effects of gaze and head pose in regulating situated and multiparty spoken dialogues in two conditions. The first is the typical case where the animated face is displayed on a flat surface, and the second is where it is displayed on a physical three-dimensional model of a face.
The results from the studies show that projecting the animated face onto a face-shaped mask results in an accurate perception of the direction of gaze generated by the avatar, and hence can allow for the use of these movements in multiparty spoken dialogue. Driven by these findings, the Furhat back-projected robot head was developed. Furhat employs state-of-the-art facial animation that is projected on a 3D printout of that face, and a neck to allow for head movements. Although the mask in Furhat is static, the fact that the animated face matches the design of the mask results in a physical face that is perceived to “move”. We present studies that show how this technique renders a more intelligible, human-like and expressive face. We further present experiments in which Furhat is used as a tool to investigate properties of facial signals in situated interaction. Furhat was built to study, implement, and verify models of situated, multiparty, multimodal human-machine spoken dialogue, a study that requires the face to be physically situated in the interaction environment rather than on a two-dimensional screen. It has also received much interest from several communities, and has been showcased at several venues, including a robot exhibition at the London Science Museum. We present an evaluation study of Furhat at the exhibition, where it interacted with several thousand people in multiparty conversation. The analysis of the data from this setup further shows that Furhat can accurately regulate multiparty interaction using gaze and head movements.
40

MPEG-4 Facial Feature Point Editor / Editor för MPEG-4 "feature points"

Lundberg, Jonas January 2002 (has links)
The use of computer-animated interactive faces in film, TV and games is ever growing, with new application areas also emerging on the Internet and in mobile environments. Morph targets are one of the most popular methods for animating the face. Up until now, 3D artists had to design each morph target defined by the MPEG-4 standard by hand, a very monotonous and tedious task. The newly developed method of Facial Motion Cloning [11] relieves the artists of this heavy work: the morph targets can now be copied from an already animated face model onto a new static face model. For the Facial Motion Cloning process, a subset of the feature points specified by the MPEG-4 standard must be defined; its purpose is to correlate the facial features of the two faces. The goal of this project is to develop a graphical editor in which artists can define the feature points for a face model. The feature points are saved in a file format that can be used in Facial Motion Cloning software.
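Morph-target animation of the kind this editor supports can be sketched as a weighted blend of per-vertex displacements from the neutral face. The tiny mesh and "smile" target below are made up purely for illustration; real MPEG-4 morph targets are defined per facial animation parameter.

```python
import numpy as np

def blend_morph_targets(neutral, targets, weights):
    """Linear morph-target blending: each target stores absolute vertex
    positions; the animated face is the neutral geometry plus the
    weighted sum of (target - neutral) displacements."""
    result = neutral.copy()
    for name, w in weights.items():
        result += w * (targets[name] - neutral)
    return result

# Tiny illustrative mesh: 3 vertices in 3-D, all at the origin
neutral = np.zeros((3, 3))
targets = {"smile": np.array([[0.0, 0.1, 0.0],
                              [0.0, 0.0, 0.0],
                              [0.0, 0.1, 0.0]])}

# Half-strength smile
face = blend_morph_targets(neutral, targets, {"smile": 0.5})
```

Facial Motion Cloning then amounts to producing such a set of targets for a new static mesh by transferring the displacements of an already animated one via the shared feature points.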
