Platypose: Calibrated Zero-Shot Multi-Hypothesis 3D Human Motion Estimation

Paweł A. Pierzchlewicz, Caio da Silva, R. James Cotton, Fabian H. Sinz
Computer Science, Computer Vision and Pattern Recognition, Computer Vision and Pattern Recognition (cs.CV)
2024-03-10 00:00:00
Single camera 3D pose estimation is an ill-defined problem due to inherent ambiguities from depth, occlusion or keypoint noise. Multi-hypothesis pose estimation accounts for this uncertainty by providing multiple 3D poses consistent with the 2D measurements. Current research has predominantly concentrated on generating multiple hypotheses for single frame static pose estimation. In this study we focus on the new task of multi-hypothesis motion estimation. Motion estimation is not simply pose estimation applied to multiple frames, which would ignore temporal correlation across frames. Instead, it requires distributions which are capable of generating temporally consistent samples, which is significantly more challenging. To this end, we introduce Platypose, a framework that uses a diffusion model pretrained on 3D human motion sequences for zero-shot 3D pose sequence estimation. Platypose outperforms baseline methods on multiple hypotheses for motion estimation. Additionally, Platypose also achieves state-of-the-art calibration and competitive joint error when tested on static poses from Human3.6M, MPI-INF-3DHP and 3DPW. Finally, because it is zero-shot, our method generalizes flexibly to different settings such as multi-camera inference.
PDF: Platypose: Calibrated Zero-Shot Multi-Hypothesis 3D Human Motion Estimation.pdf
Empowered by ChatGPT