In many applications, such as conversational agents, virtual reality, movies, and games, animated facial expressions of computer-generated (CG) characters are used to communicate, teach, or entertain. With an increased demand for CG characters, it is important to animate accurate, realistic facial expressions because human facial expressions communicate a wealth of information. However, realistically animating faces is challenging and time-consuming for two reasons. First, human observers are adept at detecting anomalies in realistic CG facial animations. Second, traditional animation techniques based on keyframing sometimes approximate the dynamics of facial expressions or require extensive artistic input, while high-resolution performance capture techniques are cost-prohibitive. In this thesis, we develop a framework to explore representations of two key facial expressions, blinks and smiles, and we show that data-driven models are needed to realistically animate these expressions. Our approach uses high-resolution performance capture data to build models that can be used in traditional keyframing systems. First, we record large collections of high-resolution dynamic expressions through video and motion capture technology. Next, we build expression-specific models of the dynamic properties of blinks and smiles. We explore variants of these models and assess whether viewers perceive them as more natural than the simplified models present in the literature. In the first part of the thesis, we build a generative model of the characteristic dynamics of blinks: fast closing of the eyelids followed by a slow opening. Blinks have a characteristic profile with relatively little variation across instances or people. Our results demonstrate the need for an accurate model of eye blink dynamics rather than simple approximations, as viewers perceive the difference.
In the second part of the thesis, we investigate how spatial and temporal linearities impact smile genuineness and build a model for genuine smiles. Our perceptual results indicate that a smile model needs to preserve temporal information. With this model, we synthesize perceptually genuine smiles, accompanied by plausible head motions, that outperform traditional animation methods. In the last part of the thesis, we investigate how blinks synchronize with the start and end of spontaneous smiles. Our analysis shows that eye blinks correlate with the end of the smile and occur before the lip corners stop moving downwards. We argue that the timing of blinks relative to smiles is useful in creating compelling facial expressions. Our work is directly applicable to current methods in animation. For example, we illustrate how our models can be used in the popular framework of blendshape animation to increase realism while keeping the system complexity low. Furthermore, our perceptual results can inform the design of realistic animation systems by highlighting common assumptions that oversimplify the dynamics of expressions.
Identifier | oai:union.ndltd.org:cmu.edu/oai:repository.cmu.edu:dissertations-1428 |
Date | 01 August 2014 |
Creators | Trutoiu, Laura |
Publisher | Research Showcase @ CMU |
Source Sets | Carnegie Mellon University |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | Dissertations |