Adapting Single-View View Synthesis with Multiplane Images for 3D Video Chat

One-on-one video chat and multi-participant video conferencing are more prevalent than ever as we continue to contend with the pandemic. Bringing a 3D feel to video chat has long been an active topic in the vision and graphics communities. In this thesis, we employ novel view synthesis in an attempt to turn one-on-one video chat into 3D. We tune the learning pipeline of Tucker and Snavely's single-view view synthesis paper, retraining it on the MannequinChallenge dataset, to better predict a layered representation of the scene viewed by either video chat participant at any given time. This intermediate representation of the local light field, called a Multiplane Image (MPI), may then be used to re-render the scene from an arbitrary viewpoint which, in our case, would match the head pose of the watcher in the opposite, concurrent video frame. We argue that our pipeline, if implemented in real time, would allow both video chat participants to reveal occluded scene content and "peer into" each other's dynamic video scenes to a certain extent. It would enable full parallax up to the baselines of small head rotations and/or translations, similar to a VR headset's ability to determine the position and orientation of the wearer's head in 3D space and render a scene in alignment with this estimated head pose. We have attempted to improve the performance of the retrained model by extending MannequinChallenge with the much larger RealEstate10K dataset. We present a quantitative and qualitative comparison of the model variants and describe our dataset curation process and its impact, among other aspects.
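
As a rough illustration of how an MPI is turned back into an image, the sketch below applies the standard back-to-front "over" compositing step to the layer samples along a single pixel ray. It is not taken from the thesis: the function name `composite_ray` and the list-of-pairs input format are assumptions made for this example, and the per-plane warp into the target viewpoint, which a full renderer would apply to each layer first, is omitted.

```python
def composite_ray(layers):
    """Over-composite one pixel's MPI samples, ordered farthest to nearest.

    `layers` is a list of ((r, g, b), alpha) pairs with values in [0, 1].
    This input format is an assumption made for illustration only.
    """
    out = (0.0, 0.0, 0.0)
    for (r, g, b), a in layers:
        # Each nearer layer covers what is behind it in proportion to alpha.
        out = (
            r * a + out[0] * (1.0 - a),
            g * a + out[1] * (1.0 - a),
            b * a + out[2] * (1.0 - a),
        )
    return out


# An opaque red far layer seen through a half-transparent blue near layer.
pixel = composite_ray([((1.0, 0.0, 0.0), 1.0), ((0.0, 0.0, 1.0), 0.5)])
print(pixel)
```

In a complete renderer the same accumulation would run over every pixel after each plane has been reprojected into the target view, which is what lets nearer layers shift against farther ones and produces parallax.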

Virtual Reality

Neural Networks

Deep Learning

Image-Based Rendering

Computational Photography

3D Video Conferencing

Artificial Intelligence and Robotics

Identifier	oai:union.ndltd.org:CALPOLY/oai:digitalcommons.calpoly.edu:theses-3987
Date	01 December 2021
Creators	Uppuluri, Anurag Venkata
Publisher	DigitalCommons@CalPoly
Source Sets	California Polytechnic State University
Detected Language	English
Type	text
Format	application/pdf
Source	Master's Theses