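Overview: In recent years, with advances in artificial intelligence and the rise of the mobile internet, low-cost video-based motion capture has begun to make its mark in game production, virtual streaming, AR/VR, and other fields. Low-cost video motion capture greatly widens the audience for motion capture technology and gives ordinary users a new way to produce content, so it has broad prospects for further development. This article introduces the basic principles of video motion capture and the latest technical progress.

01 Background

02 Introduction to single-view video motion capture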