Global ETD Search

Reconstructing 3D Humans From Visual Data

Understanding humans in visual content is fundamental for numerous computer vision applications. Extensive research has been conducted in the field of human pose estimation (HPE) to accurately locate joints and construct body representations from images and videos. Expanding on HPE, human mesh recovery (HMR) addresses the more complex task of estimating the 3D pose and shape of the entire human body. HPE and HMR have gained significant attention due to their applications in areas such as digital human avatar modeling, AI coaching, and virtual reality [135]. However, HPE and HMR come with notable challenges, including intricate body articulation, occlusion, depth ambiguity, and the limited availability of annotated 3D data. Despite the progress made so far, the research community continues to strive for robust, accurate, and efficient solutions in HPE and HMR, advancing us closer to the ultimate goals in the field.
This dissertation tackles various challenges in the domains of HPE and HMR. The initial focus is on video-based HPE, where we propose a transformer architecture named PoseFormer [136] to capture the spatial relationships between body joints and the temporal correlations across frames. This approach effectively harnesses the comprehensive connectivity and expressive power of transformers, leading to improved pose estimation accuracy in video sequences. Building upon this, the dissertation addresses the heavy computational and memory burden associated with image-based HMR. Our proposed Feature Map-based Transformer method (FeatER [133]) and Pooling Attention Transformer method (POTTER [130]) demonstrate superior performance while significantly reducing computational and memory requirements compared to existing state-of-the-art techniques. Furthermore, a diffusion-based framework (DiffMesh [134]) is proposed for reconstructing high-quality human meshes from input video sequences. These achievements provide practical and efficient solutions that cater to the demands of real-world applications in HPE and HMR.
In this dissertation, our contributions advance the fields of HPE and HMR, bringing us closer to accurate and efficient solutions for understanding humans in visual content.

human pose estimation

Identifier	oai:union.ndltd.org:ucf.edu/oai:stars.library.ucf.edu:etd2023-1039
Date	01 January 2023
Creators	Zheng, Ce
Publisher	STARS
Source Sets	University of Central Florida
Language	English
Detected Language	English
Type	text
Format	application/pdf
Source	Graduate Thesis and Dissertation 2023-2024
