
Mediapipe into development #53

Merged: 3 commits into jon/development on Jan 11, 2025

Conversation

philipqueen (Collaborator):

Applying new architecture changes to the Mediapipe tracker.

Currently running fine in the webcam demo. For now I'm using the built-in mediapipe drawing utils for animation (a rough sketch of that usage is at the end of this description), as I figured @jonmatthis or @aaroncherian would have opinions on how best to annotate the output.

TODOs:

  • Rewrite tests to verify output
  • Discuss how we want to handle annotation
  • Give recording a try
[Screenshot of the annotated webcam demo output, 2025-01-10]
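
A minimal sketch of how the built-in drawing utils are used for that annotation, assuming a Holistic-style `results` object; this is illustrative, not the exact code in the PR:

```python
import mediapipe as mp

mp_drawing = mp.solutions.drawing_utils
mp_holistic = mp.solutions.holistic
mp_face_mesh = mp.solutions.face_mesh


def annotate(rgb_image, results):
    # `results` stands in for a mediapipe Holistic-style result object;
    # draw_landmarks silently skips any landmark list that is None.
    annotated = rgb_image.copy()
    mp_drawing.draw_landmarks(annotated, results.pose_landmarks, mp_holistic.POSE_CONNECTIONS)
    mp_drawing.draw_landmarks(annotated, results.left_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
    mp_drawing.draw_landmarks(annotated, results.right_hand_landmarks, mp_holistic.HAND_CONNECTIONS)
    mp_drawing.draw_landmarks(annotated, results.face_landmarks, mp_face_mesh.FACEMESH_TESSELATION)
    return annotated
```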

Review comment on the following diff context:

    def detect(self, image: np.ndarray) -> MediapipeObservation:
        rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # assumes we're always getting BGR input - check with Jon to verify

jonmatthis (Member):

I think BGR is a very cv2-specific convention at this point; it's a rather old convention (.bmp images were BGR, for example). RGB is a much more common standard these days.

Is Basler RGB? That is, do images come out of the Basler pipeline as RGB or BGR?

We should have a policy of converting cv2 images to RGB immediately upon read, and assuming they are RGB anywhere else in their lifetime.
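
A minimal sketch of that policy, assuming frames come in through cv2.VideoCapture; the `read_frame` helper below is hypothetical and only illustrates converting once at the read boundary:

```python
from typing import Optional

import cv2
import numpy as np


def read_frame(capture: cv2.VideoCapture) -> Optional[np.ndarray]:
    """Hypothetical read helper: convert BGR -> RGB once at the boundary so
    everything downstream (trackers, recorders, viewers) can assume RGB."""
    success, bgr_frame = capture.read()
    if not success:
        return None
    return cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB)
```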

philipqueen (Author):

Yep, the RGB camera comes out in RGB, the IR in monochrome.

I think immediate conversion sounds like a good idea

Review comment on the following diff context:

        return self.num_body_points + (2 * self.num_single_hand_points) + self.num_face_points

    @property
    def pose_trajectories(self) -> np.ndarray[..., 3]:  # TODO: not sure how to type these array sizes, seems like it doesn't like the `self` reference

jonmatthis (Member), Jan 11, 2025:

Typing is tricky with numpy arrays; I think there are packages that help with it, but nothing particularly venerable.

I think there's a way to set up custom type annotations and validators in pydantic (or dataclasses)? Or just type hint them as 'np.ndarray' and then do a shape validation on initialization or read?
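
For example, a rough pydantic (v2) sketch of the second option; the class and field names below are made up for illustration, and the expected shape is assumed to be (n_frames, n_points, 3):

```python
import numpy as np
from pydantic import BaseModel, ConfigDict, field_validator


class PoseTrajectories(BaseModel):
    # Hypothetical wrapper: hint the field as a plain ndarray and enforce the
    # expected (n_frames, n_points, 3) shape with a validator instead of the type.
    model_config = ConfigDict(arbitrary_types_allowed=True)

    data: np.ndarray

    @field_validator("data")
    @classmethod
    def _check_shape(cls, value: np.ndarray) -> np.ndarray:
        if value.ndim != 3 or value.shape[-1] != 3:
            raise ValueError(f"expected shape (n_frames, n_points, 3), got {value.shape}")
        return value
```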

jonmatthis (Member):

This is great! Works like a charm :)

[screenshot omitted]

(Note: the wizard pose is because I was trying to face the camera while pressing the Pause (space) bar with my elbow 💪 😂)

Gonna merge this into my branch so I can play with it, thanks!

I left some notes on your comments; looking forward to chatting about your experience building this thing.

jonmatthis merged commit 16bca71 into jon/development on Jan 11, 2025

philipqueen (Author):

Great @jonmatthis, your changes look good as well!

Looking at this and the charuco tracker, I think it's worth declaring some of the methods as abstract in the BaseObservation class (like from_detection_results, standardizing the name, and to_serializeable_dict); then we could make to_json_string and to_json_bytes built-in helper methods. A rough sketch of that idea is below.
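
A rough sketch of that structure; the signatures are guesses for illustration, and the real ones would follow the existing mediapipe/charuco observation classes:

```python
import json
from abc import ABC, abstractmethod
from typing import Any


class BaseObservation(ABC):
    # Each concrete tracker observation implements the two abstract methods;
    # the JSON helpers are shared and built on top of them.

    @classmethod
    @abstractmethod
    def from_detection_results(cls, detection_results: Any) -> "BaseObservation":
        """Each tracker constructs its observation from its own detection results."""

    @abstractmethod
    def to_serializeable_dict(self) -> dict:
        """Each tracker decides how its observation flattens to plain types."""

    def to_json_string(self) -> str:
        return json.dumps(self.to_serializeable_dict())

    def to_json_bytes(self) -> bytes:
        return self.to_json_string().encode("utf-8")
```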
