Global ETD Search

Learned structural and temporal context for dynamic 3D pose optimization and tracking

Accurate 3D tracking of animals from video recordings is critical for many behavioral studies. However, for animals other than humans, there is a lack of publicly available video datasets that the computer vision community could use for model development. Furthermore, because of occlusion and the uncontrollable nature of the animals, existing pose estimation models suffer from inadequate precision. To mitigate this issue, researchers rely on biomechanical expertise to design mathematical models that optimize poses, at the cost of generalization. We propose OptiPose, a generalizable attention-based deep learning pose optimization model, as part of a post-processing pipeline for refining 3D poses estimated by pre-existing systems. Our experiments show that OptiPose is highly robust to noise and occlusion and can be used to optimize pose sequences provided by state-of-the-art models for animal pose estimation. Furthermore, we will make Rodent3D, a multimodal (RGB, thermal, and depth) dataset for rats, publicly available.

https://hdl.handle.net/2144/45222

Artificial intelligence

Identifier	oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/45222
Date	30 September 2022
Creators	Patel, Mahir
Contributors	Betke, Margrit
Source Sets	Boston University
Language	en_US
Detected Language	English
Type	Thesis/Dissertation
Rights	Attribution-NonCommercial-ShareAlike 4.0 International, http://creativecommons.org/licenses/by-nc-sa/4.0/
