Use the MediaPipe Pose Landmark Detection solution to detect and track a human pose in video.
Used Rerun types: `Image`, `Points2D`, `Points3D`, `ClassDescription`, `AnnotationContext`, `SegmentationImage`
Human pose tracking is a computer-vision task that focuses on identifying key body locations, analyzing posture, and categorizing movements. At the heart of this technology is a pre-trained machine-learning model that assesses the visual input and recognizes body landmarks in both image coordinates and 3D world coordinates. Use cases of this technology include, but are not limited to, human-computer interaction, sports analysis, gaming, virtual reality, augmented reality, and health.
In this example, the MediaPipe Pose Landmark Detection solution detects and tracks human pose landmarks and produces segmentation masks for humans. Rerun visualizes the output of the MediaPipe solution over time to make it easy to analyze the behavior.
The visualizations in this example were created with the following Rerun code.
For each processed video frame, all data sent to Rerun is associated with the two timelines `time` and `frame_idx`.

```python
rr.set_time_seconds("time", bgr_frame.time)
rr.set_time_sequence("frame_idx", bgr_frame.idx)
```
The input video is logged as a sequence of `Image` objects to the `video` entity.

```python
rr.log("video/rgb", rr.Image(rgb).compress(jpeg_quality=75))
```
The segmentation result is logged through a combination of two archetypes. The segmentation image itself is logged as a `SegmentationImage` and contains the id for each pixel. The color is determined by the `AnnotationContext`, which is logged with `static=True` as it should apply to the whole sequence.

```python
rr.log(
    "video/mask",
    rr.AnnotationContext(
        [
            rr.AnnotationInfo(id=0, label="Background"),
            rr.AnnotationInfo(id=1, label="Person", color=(0, 0, 0)),
        ]
    ),
    static=True,
)
```

```python
rr.log("video/mask", rr.SegmentationImage(segmentation_mask.astype(np.uint8)))
```
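The cast to `np.uint8` assumes the mask already holds integer class ids. MediaPipe's pose solution reports the segmentation mask as per-pixel person probabilities in `[0, 1]`, so a thresholding step along these lines is typically needed first (a minimal sketch; the helper name and the 0.5 threshold are assumptions, not code from the example):

```python
import numpy as np


def mask_to_class_ids(float_mask, threshold=0.5):
    """Threshold a probability mask into class ids: 0 = background, 1 = person.

    Sketch only: MediaPipe returns per-pixel person probabilities in [0, 1];
    the 0.5 threshold is an assumed default, not taken from the example above.
    """
    return (np.asarray(float_mask) > threshold).astype(np.uint8)
```

The resulting array matches the ids declared in the annotation context above, so the viewer colors person pixels distinctly from the background.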
Logging the body pose landmarks involves specifying connections between the points, extracting the pose landmark positions, and logging them with the Rerun SDK. The 2D points are visualized over the video for a better understanding of the body pose. The 3D points allow the creation of a 3D model of the body posture for a more comprehensive representation of the human pose.
The 2D and 3D points are logged through a combination of two archetypes. First, a static `ClassDescription` is logged, which contains the information that maps keypoint ids to labels and describes how to connect the keypoints. Defining these connections automatically renders lines between them. MediaPipe provides the `POSE_CONNECTIONS` variable, which contains the list of `(from, to)` landmark indices that define the connections. Second, the actual keypoint positions are logged in 2D and 3D as `Points2D` and `Points3D` archetypes, respectively.
```python
rr.log(
    "/",
    rr.AnnotationContext(
        rr.ClassDescription(
            info=rr.AnnotationInfo(id=1, label="Person"),
            keypoint_annotations=[
                rr.AnnotationInfo(id=lm.value, label=lm.name) for lm in mp_pose.PoseLandmark
            ],
            keypoint_connections=mp_pose.POSE_CONNECTIONS,
        )
    ),
    static=True,
)
```
```python
rr.log(
    "video/pose/points",
    rr.Points2D(landmark_positions_2d, class_ids=1, keypoint_ids=mp_pose.PoseLandmark),
)
```
```python
rr.log(
    "person/pose/points",
    rr.Points3D(landmark_positions_3d, class_ids=1, keypoint_ids=mp_pose.PoseLandmark),
)
```
To run this example, make sure you have the Rerun repository checked out and the latest SDK installed:
```bash
# Setup
pip install --upgrade rerun-sdk  # install the latest Rerun SDK
git clone git@github.com:rerun-io/rerun.git  # Clone the repository
cd rerun
git checkout latest  # Check out the commit matching the latest SDK release
```
Install the necessary libraries specified in the requirements file:
```bash
pip install -r examples/python/human_pose_tracking/requirements.txt
```
To experiment with the provided example, simply execute the main Python script:
```bash
python examples/python/human_pose_tracking/main.py  # run the example
```
If you wish to customize it for various videos, adjust the maximum frames, or explore additional features, use the CLI with the `--help` option for guidance:

```bash
python examples/python/human_pose_tracking/main.py --help
```