Global ETD Search

Speech-driven animation using multi-modal hidden Markov models

The main objective of this thesis was the synthesis of speech-synchronised motion, in particular head motion. The hypothesis that head motion can be estimated from the speech signal was confirmed. In order to achieve satisfactory results, a motion capture database was recorded, a definition of head motion in terms of articulation was discovered, a continuous stream mapping procedure was developed, and finally the synthesis was evaluated. Based on previous research into non-verbal behaviour, basic types of head motion were invented that could function as modelling units. The stream mapping method investigated in this thesis is based on Hidden Markov Models (HMMs), which employ modelling units to map between continuous signals. The objective evaluation of the modelling parameters confirmed that head motion types could be predicted from the speech signal with an accuracy above chance, close to 70%. Furthermore, a special type of HMM called trajectory HMM was used because it enables synthesis of continuous output. However, head motion is a stochastic process; therefore, the trajectory HMM was further extended to allow for non-deterministic output. Finally, the resulting head motion synthesis was perceptually evaluated. The effects of the “uncanny valley” were also considered in the evaluation, confirming that rendering quality has an influence on our judgement of movement of virtual characters. In conclusion, a general method for synthesising speech-synchronised behaviour was invented that can be applied to a whole range of behaviours.

http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.562793

Identifier	oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:562793
Date	January 2010
Creators	Hofer, Gregor Otto
Contributors	Shimodaira, Hiroshi; Renals, Steve
Publisher	University of Edinburgh
Source Sets	Ethos UK
Detected Language	English
Type	Electronic Thesis or Dissertation
Source	http://hdl.handle.net/1842/3786
