
Learning Consistent Visual Synthesis

With the rapid development of photography, we can easily record the 3D world by taking photos and videos. In traditional images and videos, the viewer observes the scene from fixed viewpoints and cannot navigate the scene or edit the 2D observation afterward.

Thus, visual content editing and synthesis have become essential tasks in computer vision.
However, achieving high-quality visual synthesis often requires a complex and expensive multi-camera setup, which is not practical for daily use because most people have only one cellphone camera. A single camera, on the other hand, cannot provide enough multi-view constraints to synthesize consistent visual content.

Therefore, in this thesis, I address this challenging single-camera visual synthesis problem by leveraging different regularizations. I study three consistent synthesis problems: time-consistent synthesis, view-consistent synthesis, and view-time-consistent synthesis. I show how we can take cellphone-captured monocular images and videos as input to model the scene and consistently synthesize new content for an immersive viewing experience.

Doctor of Philosophy

With the rapid development of photography, we can easily record the 3D world by taking photos and videos. More recently, we have incredible cameras on cell phones, which enable us to take pro-level photos and videos. These powerful cellphones even have advanced computational photography features built in. However, these features focus on faithfully recording the world during capture. We can only view the photos and videos as they are, but cannot navigate the scene, edit the 2D observation, or synthesize new content afterward.

Thus, visual content editing and synthesis have become essential tasks in computer vision. We know that achieving high-quality visual synthesis often requires a complex and expensive multi-camera setup. This is not practical for daily use because most people have only one cellphone camera. A single camera, on the other hand, is not enough to synthesize consistent visual content.

Therefore, in this thesis, I address this challenging single-camera visual synthesis problem by leveraging different regularizations. I study three consistent synthesis problems: time-consistent synthesis, view-consistent synthesis, and view-time-consistent synthesis. I show how we can take cellphone-captured monocular images and videos as input to model the scene and consistently synthesize new content for an immersive viewing experience.

Identifier: oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/111588
Date: 22 August 2022
Creators: Gao, Chen
Contributors: Electrical and Computer Engineering, Huang, Jia-Bin, Dhillon, Harpreet Singh, Kopf, Johannes Peter, Huang, Bert, Abbott, A. Lynn
Publisher: Virginia Tech
Source Sets: Virginia Tech Theses and Dissertation
Language: English
Detected Language: English
Type: Dissertation
Format: ETD, application/pdf
Rights: Creative Commons Attribution 4.0 International, http://creativecommons.org/licenses/by/4.0/
