
Capture, analysis and synthesis of photorealistic crowds

This thesis explores techniques for synthesizing crowds from imagery. Synthetic photorealistic crowds
are desirable for film, gaming, special effects, and architectural visualization. While motion
capture-based techniques for the animation and control of crowds have been well studied
in computer graphics, the resulting control rig sequences require a laborious model-based graphics pipeline
to render photorealistic videos of crowds.

Over the past ten years, data-driven techniques for rendering imagery of complex phenomena
have become a popular alternative to model-based graphics. This popularity is due in large
part to the difficulty of constructing models detailed enough to achieve
photorealism. A dynamic crowd of humans is an extremely challenging example of such phenomena.
Example-based synthesis methods such as video textures are an appealing alternative, but current
techniques are unable to handle the new challenges posed by crowds.

This thesis describes how to synthesize video-based crowds by explicitly segmenting pedestrians from
input videos of natural crowds and optimally placing them into an output video while satisfying
environmental constraints imposed by the scene. There are three key challenges.

First, the crowd layout of segmented videos must satisfy constraints imposed by environmental and crowd obstacles. This thesis addresses four types of environmental constraints: (a) ground planes in the scene that are valid for crowd traversal, such as sidewalks, (b) spatial regions of these planes where crowds may enter and exit the scene, (c) static obstacles, such as mailboxes and building walls, and (d) dynamic obstacles, such as individuals and groups of individuals.

Second, pedestrians and groups of pedestrians should be segmented from the input video with no artifacts and minimal interaction time. This is challenging in real-world scenes because a pedestrian's appearance changes significantly while traveling through the scene.

Third, segmented pedestrian videos may not have enough frames, or the right shape, to compose a path from an artist-defined entrance to an exit. Plausible temporal transitions between segmented pedestrian videos are therefore needed, but they are difficult to identify and synthesize due to complex self-occlusions.
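The first challenge above amounts to a per-frame feasibility test on the ground plane. A minimal sketch of such a test follows, assuming a candidate pedestrian is a sequence of ground-plane circles, the walkable region is an axis-aligned rectangle, and static obstacles are circles; all of these names and data layouts are illustrative assumptions, not the thesis's actual representation:

```python
def tube_satisfies_constraints(tube, walkable, static_obstacles):
    """Check one candidate pedestrian track against static scene constraints.

    tube:             list of (x, y, r) ground-plane circles, one per frame
    walkable:         (xmin, ymin, xmax, ymax) rectangle of traversable ground
    static_obstacles: list of (x, y, r) circles (mailboxes, wall footprints, ...)
    """
    xmin, ymin, xmax, ymax = walkable
    for (x, y, r) in tube:
        # The pedestrian's footprint circle must stay inside the walkable region.
        if not (xmin + r <= x <= xmax - r and ymin + r <= y <= ymax - r):
            return False
        # It must not overlap any static obstacle circle.
        for (ox, oy, orad) in static_obstacles:
            if (x - ox) ** 2 + (y - oy) ** 2 < (r + orad) ** 2:
                return False
    return True
```

Dynamic obstacles (other pedestrians) require comparing pairs of tracks frame by frame rather than a single track against fixed geometry, but the per-frame circle test is the same.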

We present a novel algorithm for composing video billboards, represented by crowd tubes, to form
a crowd while avoiding collisions with static and dynamic obstacles. A crowd tube is represented
in the scene as a temporal sequence of circles on the calibrated ground plane. The approach
represents crowd tube samples and constraint violations with a conflict graph; a maximal independent
set of this graph yields a dense crowd composition. We present a prototype system for the capture, analysis, synthesis, and control
of video-based crowds. Several results demonstrate the system's ability to generate videos of crowds
that exhibit a variety of natural behaviors.
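The conflict-graph formulation can be sketched in a few lines. In this illustrative version (the function names, the dict-of-frames tube layout, and the greedy lowest-degree-first heuristic are all assumptions, not the thesis's actual algorithm), each crowd tube sample is a node, an edge joins any two tubes whose ground-plane circles overlap in some shared frame, and an independent set is therefore a collision-free subset of tubes:

```python
import itertools

def circles_collide(a, b):
    # a, b: (x, y, r) circles on the ground plane
    (ax, ay, ar), (bx, by, br) = a, b
    return (ax - bx) ** 2 + (ay - by) ** 2 < (ar + br) ** 2

def tubes_conflict(tube_a, tube_b):
    # Each tube maps frame index -> circle; a conflict is an overlap
    # in any frame where both tubes are present.
    shared_frames = tube_a.keys() & tube_b.keys()
    return any(circles_collide(tube_a[f], tube_b[f]) for f in shared_frames)

def build_conflict_graph(tubes):
    # Nodes are tube indices; edges mark pairwise constraint violations.
    edges = set()
    for i, j in itertools.combinations(range(len(tubes)), 2):
        if tubes_conflict(tubes[i], tubes[j]):
            edges.add((i, j))
    return edges

def greedy_maximal_independent_set(n, edges):
    # Greedy heuristic: take low-degree nodes first, excluding neighbors
    # of anything already chosen. The result is maximal (no tube can be
    # added without a collision), though not necessarily maximum.
    neighbors = {v: set() for v in range(n)}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    chosen, excluded = [], set()
    for v in sorted(range(n), key=lambda v: len(neighbors[v])):
        if v not in excluded:
            chosen.append(v)
            excluded |= neighbors[v]
    return chosen
```

For example, if tubes 0 and 1 collide while tube 2 is clear of both, the greedy pass keeps tubes 0 and 2 and drops tube 1, producing the densest collision-free composition for this tiny instance.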

Identifier: oai:union.ndltd.org:GATECH/oai:smartech.gatech.edu:1853/37310
Date: 17 November 2010
Creators: Flagg, Matthew
Publisher: Georgia Institute of Technology
Source Sets: Georgia Tech Electronic Thesis and Dissertation Archive
Detected Language: English
Type: Dissertation
