Accurate 3D tracking of animals from video recordings is critical for many behavioral studies. However, for animals other than humans, there is a lack of publicly available video datasets that the computer vision community could use for model development. Furthermore, because of occlusion and the uncontrollable nature of animals, existing pose estimation models suffer from inadequate precision. To mitigate this issue, researchers rely on biomechanical expertise to design mathematical models that optimize poses, at the cost of generalization. We propose OptiPose, a generalizable attention-based deep learning pose optimization model, as part of a post-processing pipeline for refining 3D poses estimated by pre-existing systems. Our experiments show that OptiPose is highly robust to noise and occlusion and can be used to optimize pose sequences provided by state-of-the-art models for animal pose estimation. Furthermore, we will make Rodent3D, a multimodal (RGB, thermal, and depth) dataset for rats, publicly available.
Identifier | oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/45222 |
Date | 30 September 2022 |
Creators | Patel, Mahir |
Contributors | Betke, Margrit |
Source Sets | Boston University |
Language | en_US |
Detected Language | English |
Type | Thesis/Dissertation |
Rights | Attribution-NonCommercial-ShareAlike 4.0 International, http://creativecommons.org/licenses/by-nc-sa/4.0/ |