Global ETD Search

41	[en] A SYSTEM FOR GENERATING DYNAMIC FACIAL EXPRESSIONS IN 3D FACIAL ANIMATION WITH SPEECH PROCESSING / [pt] UM SISTEMA DE GERAÇÃO DE EXPRESSÕES FACIAIS DINÂMICAS EM ANIMAÇÕES FACIAIS 3D COM PROCESSAMENTO DE FALA
PAULA SALGADO LUCENA RODRIGUES, 24 April 2008
This thesis presents a system for generating dynamic facial expressions synchronized with speech on a realistic three-dimensional face. Dynamic facial expressions are those that vary over time and are semantically related to emotions, speech, and affective phenomena that can modify the behavior of a face in an animation. The thesis defines an emotion model for talking virtual characters, named VeeM (Virtual emotion-to-expression Model), derived from a reinterpretation and restructuring of Plutchik's emotional wheel model. VeeM introduces the concept of an emotional hypercube in the canonical space of R4 to combine basic emotions, giving rise to derived emotions. To validate VeeM, an authoring and presentation tool for facial animation, named DynaFeX (Dynamic Facial eXpression), was developed, in which speech processing is performed to synchronize phonemes and visemes. The tool allows emotions to be defined and refined for each frame or group of frames of a facial animation. The authoring subsystem also allows, as an alternative, high-level manipulation through animation scripts. The presentation subsystem (player) controls, in a synchronized manner, the character's speech and the edited emotional aspects. DynaFeX uses a three-dimensional polygonal mesh based on the MPEG-4 facial animation standard, which favors the tool's interoperability with other facial animation systems.
Keywords: MPEG-4; FACIAL ANIMATION; DYNAMIC FACIAL EXPRESSIONS; SPEECH PROCESSING; EMOTION MODEL; EMOTIONAL HYPERCUBE; VIRTUAL TALKING CHARACTERS
42	Hybrid Methods for the Analysis and Synthesis of Human Faces
Paier, Wolfgang, 18 November 2024
The recent trend toward virtual reality (VR) has sparked new interest in human body modeling by offering new possibilities for entertainment, conferencing, and immersive applications (e.g., intelligent virtual assistants). This dissertation therefore presents new approaches to creating animatable and realistic 3D head models, animating human faces from text or speech, and rendering head models photo-realistically in real time. To simplify complex 3D face reconstruction, a hybrid approach is introduced that combines a lightweight statistical head model for 3D geometry with dynamic textures. The model captures head orientation and large-scale deformations, while the textures encode fine details and complex motions. A deep variational autoencoder trained on these textured meshes learns to synthesize realistic facial expressions from a compact latent vector. Additionally, a new neural rendering technique is proposed that learns to separate the head (foreground) from the background, providing more flexibility during inference (e.g., rendering on novel backgrounds) and simplifying the training process, as no segmentation masks have to be pre-computed. The dissertation also presents a new neural-network-based approach to synthesizing novel face animations from a small number of emotional speech videos of an actor. Unlike existing works, the proposed model learns a latent animation style space that captures emotions as well as natural variations in visual speech. In addition, learned animation priors minimize animation artifacts and unrealistic head movements. After training, the animation model offers temporally consistent editing of the animation style according to the user's needs. Together, the presented methods provide an end-to-end system for realistic 3D modeling, animation, and rendering of human heads. Various experimental results, ablation studies, and user evaluations demonstrate that the proposed approaches outperform the state of the art.
Keywords: Human Face Capture; 3D Head Model; Neural Rendering; Neural Face Animation; 3D Talking Heads; Text-based Facial Animation
Classification: 004 Informatik (computer science); ST 330; ST 177; ddc:004
