Report
SoloPose: One-Shot Kinematic 3D Human Pose Estimation with Video Data Augmentation
Title: | SoloPose: One-Shot Kinematic 3D Human Pose Estimation with Video Data Augmentation |
---|---|
Authors: | Jeong, David C., Liu, Hongji, Salazar, Saunder, Jiang, Jessie, Kitts, Christopher A. |
Publication Year: | 2023 |
Collection: | Computer Science |
Subject Terms: | Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence |
Description: | While recent two-stage, many-to-one deep learning models have demonstrated great success in 3D human pose estimation, they detect 3D key points in sequential video less efficiently than one-shot and many-to-many models. Another key drawback of two-stage and many-to-one models is that errors in the first stage are passed on to the second stage. In this paper, we introduce SoloPose, a novel one-shot, many-to-many spatio-temporal transformer model for kinematic 3D human pose estimation from video. SoloPose is further fortified by HeatPose, a 3D heatmap based on Gaussian Mixture Model distributions that accounts for target key points as well as kinematically adjacent key points. Finally, we address data diversity constraints with the 3D AugMotion Toolkit, a methodology to augment existing 3D human pose datasets, specifically by projecting four top public 3D human pose datasets (Human3.6M, MADS, AIST Dance++, MPI-INF-3DHP) into a novel dataset (Humans7.1M) with a universal coordinate system. Extensive experiments are conducted on Human3.6M as well as the augmented Humans7.1M dataset, and SoloPose demonstrates superior results relative to state-of-the-art approaches. Comment: 8 pages, 6 figures |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/2312.10195 |
Accession Number: | edsarx.2312.10195 |
Database: | arXiv |
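
The description characterizes HeatPose as a 3D heatmap built from Gaussian Mixture Model distributions over a target key point and its kinematically adjacent key points. The sketch below is one possible illustration of that idea, not the paper's implementation: the function names, the isotropic Gaussians, the voxel grid size, `sigma`, and `neighbor_weight` are all assumptions chosen for the example.

```python
import numpy as np


def gaussian_3d(grid, center, sigma):
    """Unnormalized isotropic 3D Gaussian evaluated at every voxel of `grid`."""
    sq_dist = np.sum((grid - center) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def mixture_heatmap(target, neighbors, size=16, sigma=1.5, neighbor_weight=0.3):
    """Hypothetical mixture heatmap: one Gaussian at the target key point plus
    lower-weight Gaussians at its kinematically adjacent key points.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    axes = np.arange(size, dtype=np.float64)
    # grid[i, j, k] holds the voxel coordinate (i, j, k)
    grid = np.stack(np.meshgrid(axes, axes, axes, indexing="ij"), axis=-1)
    heat = gaussian_3d(grid, np.asarray(target, dtype=np.float64), sigma)
    for point in neighbors:
        center = np.asarray(point, dtype=np.float64)
        heat += neighbor_weight * gaussian_3d(grid, center, sigma)
    # Normalize so the voxel values sum to 1
    return heat / heat.sum()


# Example: an elbow voxel with a shoulder and a wrist as adjacent key points
heat = mixture_heatmap(target=(8, 8, 8), neighbors=[(8, 8, 12), (8, 12, 8)])
peak = np.unravel_index(heat.argmax(), heat.shape)
```

In this sketch the adjacent key points contribute smaller secondary modes, so the heatmap still peaks at the target voxel while encoding where its kinematic neighbors lie.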