Open source name: ntu-rris/google-mediapipe
Open source URL: https://github.com/ntu-rris/google-mediapipe
Open source language: Python 81.4%
Open source introduction:
Google MediaPipe for Pose Estimation
MediaPipe is a cross-platform framework for building multimodal applied machine learning pipelines including inference models and media processing functions. The main purpose of this repo is to:
Pose Estimation with Input Color Image
Attractiveness of Google MediaPipe as compared to other SOTA (e.g. FrankMocap, CMU OpenPose, DeepPoseKit, DeepLabCut, MinimalHand):
Features
Latest MediaPipe Python API version 0.8.9.1 (Released 14 Dec 2021) features:
Face Detect (2D face detection)
Face Mesh (468/478 3D face landmarks)
Hands (21 3D landmarks and able to support multiple hands, 2 levels of model complexity) (NEW world coordinates)
Body Pose (33 3D landmarks for whole body, 3 levels of model complexity)
Holistic (Face + Hands + Body) (A total of 543/535 landmarks: 468 face + 2 x 21 hands + 33/25 pose)
Objectron (3D object detection and tracking) (4 possible objects: Shoe / Chair / Camera / Cup)
Selfie Segmentation (Segments human for selfie effect/video conferencing)
Note: The above videos were presented at the CVPR 2020 Fourth Workshop on Computer Vision for AR/VR; interested readers can refer to the link for other related works.
Installation
The simplest way to run our implementation is to use anaconda. You can create an anaconda environment called
Demo Overview
Usage
0. Single Image
5 different modes are available and sample images are located in the data/sample/ folder
Note: The sample images of the subject with body markers are adapted from An Asian-centric human movement database capturing activities of daily living, and the image of Mona Lisa is adapted from Wiki
1. Video Input
5 different modes are available and video capture can be done online through webcam or offline from your own .mp4 file
Note: It runs at around 10 to 30 FPS on CPU, depending on the mode selected. The video demonstrating supported mini-squats is adapted from National Stroke Association
2. Gesture Recognition
2 modes are available: Use evaluation mode to perform recognition of 11 gestures and use train mode to log your own training data
Note: A simple but effective K-nearest neighbor (KNN) algorithm is used as the classifier. For the hand gesture recognition demo, since 3D hand joints are available, we can compute flexion joint angles (feature vector) and use them to classify different hand poses. On the other hand, if 3D body joints are not yet reliable, the normalized pairwise distances between predefined lists of joints, as described in MediaPipe Pose Classification, could also be used as the feature vector for KNN.
3. Rock Paper Scissor Game
A simple game of rock paper scissors that requires a pair of hands facing the camera
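The KNN classification described in the gesture recognition note can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the repo's implementation: the function name knn_predict, the layout of one flexion angle per finger, and all training values are hypothetical. Rock, paper and scissors are used as labels only because they are easy to picture.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    # Euclidean distance from the query to every training vector
    distances = [
        (math.dist(feature, query), label)
        for feature, label in zip(train_features, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training data: one flexion angle (degrees) per finger,
# ordered thumb, index, middle, ring, little. All values are made up.
TRAIN_FEATURES = [
    [60, 90, 90, 90, 90], [55, 85, 95, 90, 88],  # rock: all fingers flexed
    [5, 5, 5, 5, 5], [10, 8, 4, 6, 9],           # paper: all fingers extended
    [60, 5, 5, 90, 90], [58, 10, 8, 85, 92],     # scissors: index and middle extended
]
TRAIN_LABELS = ["rock", "rock", "paper", "paper", "scissors", "scissors"]

prediction = knn_predict(TRAIN_FEATURES, TRAIN_LABELS, [57, 7, 6, 88, 90], k=3)
print(prediction)
```

With so few samples per class, k should stay small; a real training log would hold many more samples per gesture, and the choice of k would then be worth tuning.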
For another game of flappy bird refer to this github
4. Measure Hand Range of Motion
2 modes are available: Use evaluation mode to perform hand ROM recognition and use train mode to log your own training data
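Range of motion is ultimately a joint angle, so it may help to see how one could be computed from three 3D landmarks. The sketch below assumes the landmarks are already available as (x, y, z) tuples; the function names joint_angle_deg and flexion_deg are hypothetical and the coordinates are made up. In MediaPipe Hands, the index finger MCP, PIP and DIP joints are landmarks 5, 6 and 7, which would be one natural triplet to pass in.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle at joint b, in degrees, between segments b->a and b->c (3D points)."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0.0:
        raise ValueError("coincident landmarks: angle is undefined")
    # Clamp to [-1, 1] to guard against floating-point drift before acos
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def flexion_deg(a, b, c):
    """Flexion as deviation from a straight joint (0 means fully extended)."""
    return 180.0 - joint_angle_deg(a, b, c)


# Made-up coordinates: a straight finger and one bent at a right angle
straight = flexion_deg((0, 0, 0), (1, 0, 0), (2, 0, 0))
bent = flexion_deg((0, 0, 0), (1, 0, 0), (1, 1, 0))
print(round(straight, 1), round(bent, 1))
```

Because the angle comes from a single-camera 3D estimate, it inherits the depth ambiguity of that estimate, so values should be treated as approximate.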
5. Measure Wrist and Forearm Range of Motion
3 modes are available and the user has to input the side of the hand to be measured
Note: For measuring forearm pronation/supination, the camera has to be placed at the same level as the hand such that the palmar side of the hand is directly facing the camera. For measuring wrist ROM, the camera has to be placed such that the upper body of the subject is visible; refer to the examples of wrist_XXX.png images in the data/sample/ folder. The wrist images are adapted from Goni Wrist Flexion, Extension, Radial & Ulnar Deviation
6. Face Mask
Overlay a 3D face mask on the detected face in image plane
Note: The face image is adapted from MediaPipe 3D Face Transform
7. Triangulate Points
Estimating 3D body pose from a single 2D image is an ill-posed problem and extremely challenging. One way to reconstruct 3D body pose is to make use of a multiview setup and perform triangulation. For offline testing, use the CMU Panoptic Dataset and follow the instructions on PanopticStudio Toolbox to download a sample dataset 171204_pose1_sample into the data/ folder
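To make the idea of triangulation concrete, here is a minimal two-view sketch using the midpoint method: each camera contributes a ray through the detected 2D joint, and the 3D point is taken as the midpoint of the shortest segment between the two rays. This is a simplified stand-in, not necessarily the method the repo uses. It assumes the camera centers and ray directions have already been obtained from calibrated cameras, and the function name triangulate_midpoint is hypothetical.

```python
def _dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel: the point cannot be triangulated")
    # Ray parameters that minimise the distance between the two rays
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [c1[i] + s * d1[i] for i in range(3)]
    p2 = [c2[i] + t * d2[i] for i in range(3)]
    return [(p1[i] + p2[i]) / 2.0 for i in range(3)]


# Made-up setup: two cameras 1 unit apart, both observing the point (0.5, 0.2, 2.0)
target = (0.5, 0.2, 2.0)
cam1, cam2 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
ray1 = [target[i] - cam1[i] for i in range(3)]
ray2 = [target[i] - cam2[i] for i in range(3)]
point = triangulate_midpoint(cam1, ray1, cam2, ray2)
print([round(v, 6) for v in point])
```

With noisy 2D detections the two rays no longer intersect exactly, which is why the midpoint is used; with more than two views, a least-squares formulation over all rays is the usual extension.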
8. 3D Skeleton
3D pose estimation is available in full-body mode and this demo displays the estimated 3D skeleton of the hand and/or body. 3 different modes are available and video capture can be done online through webcam or offline from your own .mp4 file
9. 3D Object Detection
4 different modes are available and a sample image is located in the data/sample/ folder. Currently supports 4 classes: Shoe / Chair / Cup / Camera.
10. Selfie Segmentation
2 modes are available. The landscape model has fewer FLOPs than the general model and may run faster. Selfie segmentation works best for selfie effects and video conferencing, where the person is close (< 2m) to the camera.
Limitations:
Estimating 3D pose from a single 2D image is an ill-posed problem and extremely challenging, thus the measurement of ROM may not be accurate! Please refer to the respective model cards for more details on other types of limitations such as lighting, motion blur, occlusions, image resolution, etc.