  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Reconstructing 3D Humans From Visual Data

Zheng, Ce 01 January 2023
Understanding humans in visual content is fundamental for numerous computer vision applications. Extensive research has been conducted in the field of human pose estimation (HPE) to accurately locate joints and construct body representations from images and videos. Expanding on HPE, human mesh recovery (HMR) addresses the more complex task of estimating the 3D pose and shape of the entire human body. HPE and HMR have gained significant attention due to their applications in areas such as digital human avatar modeling, AI coaching, and virtual reality [135]. However, HPE and HMR come with notable challenges, including intricate body articulation, occlusion, depth ambiguity, and the limited availability of annotated 3D data. Despite the progress made so far, the research community continues to strive for robust, accurate, and efficient solutions in HPE and HMR. This dissertation tackles various challenges in the domains of HPE and HMR. The initial focus is on video-based HPE, where we propose a transformer architecture named PoseFormer [136] to capture the spatial relationships between body joints and the temporal correlations across frames. This approach harnesses the comprehensive connectivity and expressive power of transformers, leading to improved pose estimation accuracy in video sequences. Building upon this, the dissertation addresses the heavy computational and memory burden associated with image-based HMR. Our proposed Feature Map-based Transformer method (FeatER [133]) and Pooling Attention Transformer method (POTTER [130]) demonstrate superior performance while significantly reducing computational and memory requirements compared to existing state-of-the-art techniques. Furthermore, a diffusion-based framework (DiffMesh [134]) is proposed for reconstructing high-quality human meshes from input video sequences. These achievements provide practical and efficient solutions that cater to the demands of real-world applications in HPE and HMR. The contributions of this dissertation advance the fields of HPE and HMR, bringing us closer to accurate and efficient solutions for understanding humans in visual content.
2

Humans in the wild: NeRFs for Dynamic Scenes Modeling from In-the-Wild Monocular Videos with Humans

Alessandro, Sanvito January 2023
Recent advancements in computer vision have led to the emergence of Neural Radiance Fields (NeRFs), a powerful tool for reconstructing photorealistic 3D scenes, even in dynamic settings. However, these methods struggle when dealing with human subjects, especially when the subject is partially obscured or not completely observable, resulting in inaccurate reconstructions of geometries and textures. To address this issue, this thesis evaluates state-of-the-art human modeling using implicit representations under partial observability of the subject. We then propose and test several novel methods to improve the generalization of these models, including the use of symmetry-driven and Signed Distance Function (SDF)-driven losses and the use of prior knowledge from multiple subjects via a pre-trained model. Our results demonstrate that the proposed methods significantly improve the accuracy of the reconstructions, even in challenging "in-the-wild" situations, both quantitatively and qualitatively. Our approach opens new opportunities for applications such as asset generation for video games and movies and improved simulations for autonomous driving scenarios from abundant in-the-wild monocular videos. In summary, our research presents a significant improvement to state-of-the-art human modeling using implicit representations, with important implications for 3D Computer Vision (CV) and Neural Rendering and their applications in various industries.
