1 |
On the development of an interactive talking head system / Athanasopoulos, Michael; Ugail, Hassan; Gonzalez Castro, Gabriela. January 2010
In this work we propose a talking head system for animating facial expressions using a template face generated from partial differential equations (PDEs). It uses a set of pre-configured curves to calculate an internal template face surface. This surface is then used to associate various facial features with a given 3D face object. Motion retargeting is then used to transfer the deformations in these areas from the template to the target object. The procedure continues until all the expressions in the database have been calculated and transferred to the target 3D human face object. Additionally, the system interacts with the user through an artificial intelligence (AI) chatterbot, which generates a response to a given text. Speech and facial animation are synchronized using the Microsoft Speech API, which converts the chatterbot's response to speech.
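The PDE template surface described above can be illustrated with a toy sketch. The code below fills a surface patch from four boundary curves by relaxing Laplace's equation on a grid; this is a simplified second-order stand-in for the fourth-order PDE typically used in such template methods, and the function name and grid setup are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def pde_surface(top, bottom, left, right, iters=2000):
    """Fill the height field of a surface patch from four boundary
    curves by Jacobi relaxation of Laplace's equation (a second-order
    stand-in for the fourth-order PDE used in template-face methods)."""
    n = len(top)
    z = np.zeros((n, n))
    z[0, :], z[-1, :] = top, bottom      # fix the boundary curves
    z[:, 0], z[:, -1] = left, right
    for _ in range(iters):
        # each interior point relaxes toward the mean of its neighbours
        z[1:-1, 1:-1] = 0.25 * (z[:-2, 1:-1] + z[2:, 1:-1]
                                + z[1:-1, :-2] + z[1:-1, 2:])
    return z
```

With consistent boundary curves the interior converges to a smooth blend of the boundaries, which is the property the template-face construction relies on.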
|
2 |
Lietuvių kalbos animavimo technologija taikant trimatį veido modelį / Lithuanian speech animation technology for a 3D facial model. Mažonavičiūtė, Ingrida. 18 February 2013
Speech animation is widely used in technical devices to give deaf people, children, and middle-aged and older adults equal opportunities to communicate. People are highly sensitive to changes in facial appearance, so speech animation is a complex process in which the acoustic information recognised in human speech (phonemes) is visualised using specially modelled facial expressions called visemes. The realism of animated speech depends most strongly on correctly identifying the visemes that correspond to the phonemes, modelling them, and arranging them on the timeline. To ensure that animated speech looks natural, however, the influence of visemes on neighbouring phonemes must additionally be analysed, and a coarticulation control model must be created that takes the phonetic properties of the animated language into account. The phonetics of every language differs, so an animation system created for one language is not directly suitable for animating another. The speech animation framework implementing the animation technology for Lithuanian must therefore be built to visualise Lithuanian speech.
The dissertation consists of an introduction, three main chapters, general conclusions, a list of references, and a list of publications.
Chapter 1 analyses the speech animation technologies used worldwide. A speech signal is both audible and visible, so its animation is a composite process that depends on the chosen facial modelling methodology, the type of speech signal, and the coarticulation control model.
Chapter 2... [see full text] / Speech animation is widely used in technical devices to allow the growing number of hearing-impaired persons, children, and middle-aged and elderly people equal participation in communication. Speech animation systems ("Talking heads") are basically driven by speech phonetics and their visual representation, visemes. The accuracy of the chosen speech recognition engine, natural-looking visemes, the phoneme-to-viseme mapping, and the coarticulation control model considerably influence the quality of animated speech. Speech animation is strongly related to language phonetics, so a new "Talking head" must be created to animate each language. A framework suitable for animating Lithuanian speech, which includes two new models that help to improve the intelligibility of animated Lithuanian speech, is used to create the Lithuanian "Talking head" "LIT".
The dissertation consists of Introduction, three main chapters and general conclusions.
Chapter 1 provides an analysis of existing speech animation technologies. Different facial modelling techniques are analysed to identify the most suitable 3D "Talking head" modelling technique for the Lithuanian language. Viseme classification experiments across different languages are examined to survey the variety of viseme classification methods. Coarticulation control models are compared to decide which one should be used to define the coarticulation of Lithuanian speech.
Chapter 2 describes theoretical framework for Lithuanian speech animation. Translingual visual speech... [to full text]
|
3 |
Verification, Validation and Evaluation of the Virtual Human Markup Language (VHML) / Verifiering, validering och utvärdering av Virtual Human Markup Language (VHML). Gustavsson, Camilla; Strindlund, Linda; Wiknertz, Emma. January 2002
Human communication is inherently multimodal. The information conveyed through body language, facial expression, gaze, intonation, speaking style, etc. comprises important components of everyday communication. One issue within computer science concerns how to provide multimodal agent-based systems, i.e. systems that interact with users through several channels. Such systems can include Virtual Humans. A Virtual Human might, for example, be a complete creature with a whole body including head, arms and legs, but it might also be a creature with only a head, a Talking Head. The aim of the Virtual Human Markup Language (VHML) is to control Virtual Humans with regard to speech, facial animation, facial gestures and body animation. These parts have previously been implemented and investigated separately, but VHML aims to combine them. In this thesis VHML is verified, validated and evaluated in order to reach that aim, and VHML is thus made more solid, homogeneous and complete. Further, a Virtual Human has to communicate with the user, and even though VHML supports a number of other communication channels, an important one is speech. Since the Virtual Human has to be able to interact with the user, a dialogue between the user and the Virtual Human has to be created. Such dialogues tend to expand tremendously, hence the Dialogue Management Tool (DMT) was developed. Having a tool makes it easier for programmers to create and maintain dialogues for the interaction. Finally, in order to demonstrate the work done in this thesis, a Talking Head application, The Mystery at West Bay Hospital, has been developed and evaluated. It has shown the usefulness of the DMT when creating dialogues. The work accomplished within this project has contributed to simplifying the development of Talking Head applications.
|
5 |
Anglų kalbos vizemų pritaikymas lietuvių kalbos garsų animacijai / English visemes and Lithuanian phonemes mapping for animation. Mažonavičiūtė, Ingrida. 27 June 2008
The thesis investigates the connection between Lithuanian speech sounds and their visual representation. Talking-head animation algorithms are analysed and their problematic aspects identified; in light of these, a methodology for synthesising Lithuanian speech based on the use of English visemes is proposed. Thirty three-dimensional Lithuanian visemes are created, and by visually comparing them with the standard English phoneme visemes, a table mapping Lithuanian phonemes to English visemes is compiled. This table is then used to animate a Lithuanian audio file.
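The phoneme-to-viseme mapping table described above can be sketched as a simple lookup. The entries and viseme labels below are hypothetical illustrations, not the thesis's actual 30-viseme table; consecutive duplicates are collapsed so the animation does not re-trigger the same mouth shape.

```python
# Hypothetical excerpt of a Lithuanian-phoneme -> English-viseme table;
# the real mapping is defined by visual comparison in the thesis.
LT_PHONEME_TO_EN_VISEME = {
    "a": "AA", "o": "AO", "u": "UW",
    "b": "P_B_M", "p": "P_B_M", "m": "P_B_M",
    "v": "F_V", "f": "F_V",
    "s": "S_Z", "z": "S_Z",
}

def visemes_for(phonemes):
    """Map a phoneme sequence to a viseme sequence, collapsing runs of
    the same viseme into a single mouth shape."""
    out = []
    for ph in phonemes:
        v = LT_PHONEME_TO_EN_VISEME.get(ph, "NEUTRAL")
        if not out or out[-1] != v:
            out.append(v)
    return out
```

For example, a recognised phoneme string is converted to viseme keys that drive the 3D face, with unmapped phonemes falling back to a neutral shape.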
|
6 |
Visual speech synthesis by learning joint probabilistic models of audio and video / Deena, Salil Prashant. January 2012
Visual speech synthesis deals with synthesising facial animation from an audio representation of speech. In the last decade or so, data-driven approaches have gained prominence with the development of Machine Learning techniques that can learn an audio-visual mapping. Many of these Machine Learning approaches learn a generative model of speech production using the framework of probabilistic graphical models, through which efficient inference algorithms can be developed for synthesis. In this work, the audio and visual parameters are assumed to be generated from an underlying latent space that captures the shared information between the two modalities. These latent points evolve through time according to a dynamical mapping, and there are mappings from the latent points to the audio and visual spaces respectively. The mappings are modelled using Gaussian processes, which are non-parametric models that can represent a distribution over non-linear functions. The result is a non-linear state-space model. It turns out that the state-space model is not a very accurate generative model of speech production, because it assumes a single dynamical model, whereas it is well known that speech involves multiple dynamics (e.g. different syllables) that are generally non-linear. To cater for this, the state-space model can be augmented with switching states to represent the multiple dynamics, giving a switching state-space model. A key problem is how to infer the switching states so as to model the multiple non-linear dynamics of speech, which we address by learning a variable-order Markov model on a discrete representation of audio speech. Various synthesis methods for predicting visual from audio speech are proposed for both the state-space and switching state-space models.
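The variable-order Markov model over discrete audio symbols mentioned above can be sketched as a table of context counts with longest-suffix back-off: every context up to a maximum order is counted during training, and prediction uses the longest context actually observed. This is a minimal illustration under assumed names, not the thesis's implementation.

```python
from collections import defaultdict

class VariableOrderMarkov:
    """Variable-order Markov model over discrete symbols: count every
    context up to max_order, predict from the longest matching one."""
    def __init__(self, max_order=3):
        self.max_order = max_order
        self.counts = defaultdict(lambda: defaultdict(int))

    def fit(self, seq):
        for i, sym in enumerate(seq):
            for k in range(self.max_order + 1):
                if i - k < 0:
                    break
                ctx = tuple(seq[i - k:i])   # preceding k symbols
                self.counts[ctx][sym] += 1
        return self

    def predict(self, history):
        # back off from the longest matching suffix of the history
        for k in range(min(self.max_order, len(history)), -1, -1):
            ctx = tuple(history[len(history) - k:])
            if ctx in self.counts:
                dist = self.counts[ctx]
                return max(dist, key=dist.get)
        return None
```

In the synthesis setting, the discrete symbols would be quantised audio features, and the predicted symbol selects the switching state governing the local dynamics.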
Quantitative evaluation, involving the use of error and correlation metrics between ground truth and synthetic features, is used to evaluate our proposed method in comparison to other probabilistic models previously applied to the problem. Furthermore, qualitative evaluation with human participants has been conducted to evaluate the realism, perceptual characteristics and intelligibility of the synthesised animations. The results are encouraging and demonstrate that by having a joint probabilistic model of audio and visual speech that caters for the non-linearities in audio-visual mapping, realistic visual speech can be synthesised from audio speech.
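The error and correlation metrics used in the quantitative evaluation might, in their simplest form, look like the sketch below (RMSE and Pearson correlation between ground-truth and synthesised feature trajectories; the exact metrics in the thesis may differ).

```python
import numpy as np

def rmse(gt, pred):
    """Root-mean-square error between two feature trajectories."""
    gt, pred = np.asarray(gt, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((gt - pred) ** 2)))

def correlation(gt, pred):
    """Pearson correlation between ground-truth and synthesised features."""
    gt, pred = np.asarray(gt, float), np.asarray(pred, float)
    return float(np.corrcoef(gt, pred)[0, 1])
```

Low RMSE with high correlation indicates that the synthesised visual parameters track the ground truth both in magnitude and in shape.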
|