Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Non-Maximum Merging (NMM) to Detections #500

Merged
merged 29 commits into from
May 27, 2024

Conversation

mario-dg
Copy link
Contributor

@mario-dg mario-dg commented Oct 18, 2023

Description

Currently it is only possible to apply Non-Maximum Suppression (NMS) to Detections.
In an attempt to increase the accuracy of object detection and segmentation, especially when using the InferenceSlicer,
I added the ability to apply NMM instead of NMS to the merged Detections merge_detections.

NMM merges bounding boxes and masks, that would have originally been discarded, when applying NMS.
This results in more accurate detections/segmentations.

The implementation of this algorithm closely follows the one from SAHI.

In addition, also inspired by SAHI, the InferenceSlicer can increase the detection accuracy of larger objects, by running inference
on the whole image on top of all slices perform_standard_pred. The whole image detections will be merged with the sliced image detections.

I would be happy to discuss implementation changes and design choices.
This PR should serve as a first implementation and is far from perfect.
Any ideas, critic or comments are more than welcome :)

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How has this change been tested, please provide a testcase or example of how you tested the change?

NMS:
nms

NMM:
nmm

SAHI:
sahi

The code sample that was used to create these two images:

import cv2
import numpy as np
import supervision as sv

from supervision.detection.core import Detections
from supervision.detection.tools.inference_slicer import InferenceSlicer

from PIL import Image
from ultralytics.models.yolo import YOLO


model = YOLO("yolov8s.pt")
def callback(x):
    result = model(x[:, :, ::-1])[0]
    dets = Detections.from_ultralytics(result)
    return dets[dets.confidence > 0.3]

# get image from here: https://ultralytics.com/images/bus.jpg

slicer = InferenceSlicer(callback=callback, 
                         slice_wh=(512, 512), 
                         overlap_ratio_wh=(0.4, 0.4), 
                         iou_threshold=0.1, 
                         merge_detections=True, 
                         perform_standard_pred=True)

ba = sv.BoundingBoxAnnotator()
la = sv.LabelAnnotator()
image = np.array(Image.open("bus.jpg"))

nmm_dets = slicer(image)
nmm_ann = ba.annotate(scene=image.copy(), detections=nmm_dets)
nmm_ann = la.annotate(scene=nmm_ann, detections=nmm_dets)
cv2.imwrite("nmm.jpg", nmm_ann)

slicer.merge_detections = False

nms_dets = slicer(image)
nms_ann = ba.annotate(scene=image.copy(), detections=nms_dets)
nms_ann = la.annotate(scene=nms_ann, detections=nms_dets)
cv2.imwrite("bms.jpg", nms_ann)

Any specific deployment considerations

The two additional parameters that were added to InferenceSlicer are optional and default to False, thus neither breaking, nor changing the results of existing implementations.

Docs

  • Docs updated? What were the changes: I will update the docs in the near future.

@SkalskiP
Copy link
Collaborator

Hi @mario-dg 👋🏻! Sorry for lagging with the review. I will take a look at it on Monday.

@mario-dg
Copy link
Contributor Author

Hi @SkalskiP👋, no worries!
Must have been quite busy with the latest Release! Congrats on that one🚀

@kadirnar
Copy link

kadirnar commented Jan 7, 2024

This is great feature. When will it be merged? @SkalskiP

@SkalskiP
Copy link
Collaborator

SkalskiP commented Jan 7, 2024

@kadirnar, I need to review it first. I didn't have time to work on it yet.

@SkalskiP
Copy link
Collaborator

SkalskiP commented Apr 8, 2024

Hi @mario-dg 👋🏻 We aim to significantly enhance our InferenceSlicer implementation. One of the challenges we've encountered involves limitations associated with applying non_max_suppression in the area of overlapping tiles. Would you still be interested in continuing to assist us?

@mario-dg
Copy link
Contributor Author

mario-dg commented Apr 9, 2024

Hey @SkalskiP, thanks for coming back to this PR!
Sure, I would love to continue🚀

@SkalskiP SkalskiP mentioned this pull request Apr 9, 2024
1 task
@SkalskiP
Copy link
Collaborator

SkalskiP commented Apr 9, 2024

Hi @mario-dg 👋🏻 that's awesome!

We are planning a major wave of updates in InferenceSlicer that includes:

  • Support for segmentation models
  • Support for batch processing (running inference on multiple tiles at once)
  • Improved post-processing - merging instead of removing overlapping detections

This PR looks like the perfect candidate to achieve the last of these goals.

@mario-dg
Copy link
Contributor Author

mario-dg commented Apr 9, 2024

Thats great! Supervision might become a much more suitable lightweight alternative to SAHI, once we've improved its capabilities. Anything specific I could work on?

@SkalskiP
Copy link
Collaborator

SkalskiP commented Apr 9, 2024

@mario-dg 🔥 Please start by solving conflicts. This PR is quite old and we need to make it up to date. Than @LinasKo will review the PR.

@LinasKo
Copy link
Contributor

LinasKo commented Apr 9, 2024

Hi @mario-dg 👋

We spoke with SkalskiP - I'll see if I can review your code today.
We know of a few changes still needed, but I need to ramp-up a bit.

The broader context is that there's another candidate solution #268 and we'd like to compare the two before adding to InferenceSlicer.

@mario-dg
Copy link
Contributor Author

mario-dg commented Apr 9, 2024

Alright @SkalskiP, will get to it this evening.

Copy link
Contributor

@LinasKo LinasKo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @mario-dg,

I've reviewed the PR. Frankly, I think it's a really well structured, well-documented contribution. I especially like how you managed to match the function API and documentation to what we have in with_nms 🙂

I've noted down the changes that we'd need. Some extra tests would be good to do as well.

Let me know if you have any questions.


Hey @SkalskiP, as mentioned in one of the comments - you were right about data missing in __setitem__.

The broader question is: what's contained inside it? Do we have a list of possible dict entries? We would want to unify into a single object, so it's a different 'merge' than what we had before in Detections.merge.

supervision/detection/core.py Outdated Show resolved Hide resolved
supervision/detection/core.py Outdated Show resolved Hide resolved
supervision/detection/core.py Outdated Show resolved Hide resolved
supervision/detection/core.py Outdated Show resolved Hide resolved
supervision/detection/core.py Outdated Show resolved Hide resolved
supervision/detection/utils.py Outdated Show resolved Hide resolved
supervision/detection/utils.py Outdated Show resolved Hide resolved
supervision/detection/utils.py Show resolved Hide resolved
supervision/detection/core.py Outdated Show resolved Hide resolved
supervision/detection/tools/inference_slicer.py Outdated Show resolved Hide resolved
@mario-dg
Copy link
Contributor Author

mario-dg commented Apr 9, 2024

Thank you @LinasKo for this in depth review. I think I got most of what you've commented.
I estimate the changes you've requested to be done this work week.

@SkalskiP
Copy link
Collaborator

Hey @SkalskiP, as mentioned in one of the comments - you were right about data missing in __setitem__.

The broader question is: what's contained inside it? Do we have a list of possible dict entries? We would want to unify into a single object, so it's a different 'merge' than what we had before in Detections.merge.

The whole point of the data field is to give people flexibility. The data can contain anything. It seems to me that the most challenging task will involve linking two or more objects together. At the moment, I'm leaning towards copying the data from the detection with the higher confidence value. @LinasKo, what do you think?

@LinasKo
Copy link
Contributor

LinasKo commented Apr 10, 2024

The whole point of the data field is to give people flexibility. The data can contain anything. It seems to me that the most challenging task will involve linking two or more objects together. At the moment, I'm leaning towards copying the data from the detection with the higher confidence value. @LinasKo, what do you think?

Got it; then there's no ideal solution. In light of that - more confidence is a nice tie-breaker.

@mario-dg, can I request a change? If we're relying on confidence as a tie-breaker, let's make the class_id and tracker_id resolution use the same logic - instead of using min and max, let's pick the IDs from the more confident of the two merged objects.

@mario-dg
Copy link
Contributor Author

@LinasKo yes sure! Was about to start with your first comments, so I will keep that in mind and use the confidences as a base to determine which ids will be picked in the merging process.

@mario-dg
Copy link
Contributor Author

@SkalskiP, @LinasKo I have just pushed my latest changes and would love for you to look over them again!
I'm stilling missing tests though. Will get to them as soon as I can🔥

@LinasKo
Copy link
Contributor

LinasKo commented Apr 11, 2024

Hey @mario-dg,

Good work! I'm reviewing it now.

For testing, you might find something useful in this Colab. If you remove the inference slicer and its callbacks, you should be able to evaluate the runtime of your functions on an image with many realistic detections.

If you do use it, I suggest you remove surplus tests - Ultralytics runs quickest and is easy to install. Also, I don't think you'll need roboflow, inference-gpu, transformers, and timm packages.

@LinasKo
Copy link
Contributor

LinasKo commented Apr 11, 2024

Other than that, I like how this looks.
It would be cool to see how fast it runs :)

@LinasKo
Copy link
Contributor

LinasKo commented May 3, 2024

Upon closer inspection, mypy revealed issues relating to datatypes.

Specifically, many of the variables we have such as class_id and tracker_id may be None. Also, in the output we expect numpy arrays, whereas the code returns floats and a list.

We'd like to merge today, so I'll see if I can fix it.

@mario-dg
Copy link
Contributor Author

mario-dg commented May 3, 2024

Hey @LinasKo, sorry for being absent lately. Due to being pretty busy with work and my masters thesis, I'm currently not able to test or benchmark.

@LinasKo
Copy link
Contributor

LinasKo commented May 3, 2024

It's alright; we've all been through that 😉

Thank you very much for your contribution. I'll make sure it gets verified and merged soon!

@LinasKo
Copy link
Contributor

LinasKo commented May 15, 2024

Hey @SkalskiP, I pushed an update; here's some questions:

  1. Removed non_max_merge altogether - it wasn't used.
  2. There's now tests for merge_object_detection_pair and box_non_max_merge, but not for batch_box_non_max_merge. Would you like to see those too?
  3. The merge_object_detection_pair I kept in detections/core.py. Due to it using Detections, it'd be a circular dependency if moved to utils. Alternatively, It would work if I moved it to utils, but annotate with "Detections" type without importing Detections from core.
  4. with_nmm is between 2x and 10x slower than with_nms.
    • For 1000 iterations & 38 detections, it's 10x slower. (0.42s vs 4.7s total)
    • For 1000 iterations & 758 detections, it's 2x slower. (41s vs 95s)
    • For 1 iteration & 38 detections: 5x slower (1ms vs 5ms)
    • For 1 iteration & 758 detections: 1.5x slower (42ms vs 89ms)
  5. I wanted to check with you again regarding datatypes. Do we want the change to these functions or every detections/utils method? that's 117 changes.
  6. box_non_max_merge tie-breaks preferring later detections. E.g. given equal confidence, and sufficient overlap, we would merge detections 0 and 1 into detection 2. I think it's too small to care about - do we agree on that?

@LinasKo LinasKo force-pushed the add_nmm_to_detections branch from fe11936 to 559ef90 Compare May 15, 2024 09:14
@SkalskiP
Copy link
Collaborator

SkalskiP commented May 15, 2024

Removed non_max_merge altogether - it wasn't used.

Awesome! Less is more!

There's now tests for merge_object_detection_pair and box_non_max_merge, but not for batch_box_non_max_merge. Would you like to see those too?

No. That's okey.

The merge_object_detection_pair I kept in detections/core.py. Due to it using Detections, it'd be a circular dependency if moved to utils. Alternatively, It would work if I moved it to utils, but annotate with "Detections" type without importing Detections from core.

Yup. Let's move it and use "Detections". It's not perfect but we need to keep it simple in detections/core.py.

I wanted to check with you again regarding datatypes.

I'm really sorry but I'm not sure what you are asking. Sorry once again.

@LinasKo
Copy link
Contributor

LinasKo commented May 16, 2024

Thanks, that's helpful :)

  • I've now realized that for the method we planned to move to utils, annotating with "Detections". Is not enough. It needs to know about the Detections class as it calls the constructor.
    • I strongly suggest we keep it in core.py. Otherwise it's either a circular import, or we need to pass the class as an argument which adds complication.
    • Is there a third, non-complex option?
  • Indeed, I was unclear when I asked about datatypes. What I meant was npt.NDArray[np.float64]. We can change the type annotations only in code introduced by this PR, or we can change it in the whole core.py and utils.py files. Alternatively, we can wait for @onuralpszr to push his changes (Issue 1021), and then do a second pass, annotating every np.ndarray that's left (We'll need to do that anyway - some changes will have been made on other branches).

@LinasKo
Copy link
Contributor

LinasKo commented May 17, 2024

Added np.NDArray types for functions affected by this change.

I believe I've addressed the change requests.

supervision/__init__.py Outdated Show resolved Hide resolved
return Detections.merge(result)


def merge_object_detection_pair(det1: Detections, det2: Detections) -> Detections:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename the arguments to detections_1 and detections_2.

Copy link
Collaborator

@SkalskiP SkalskiP May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I think we have a naming problem. Our current marge should be called concatenate, and this should be just merge. But as long as merge_object_detection_pair is not part of public API we don't need to overthink it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made an attempt to improve the naming:

  • box_non_max_merge -> _box_non_max_merge_all (open to name ideas)
  • box_non_max_merge_batch -> box_non_max_merge. Now it's the main method, exactly like * box_non_max_suppression
  • merge_object_detection_pair -> _merge_inner_detection_object_pair
  • new method: _merge_inner_detections_objects

@@ -1066,6 +1068,33 @@ def __setitem__(self, key: str, value: Union[np.ndarray, List]):

self.data[key] = value

def _set_at_index(self, index: int, other: Detections):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't placing this code as part of the setitem method makes more sense? The flow below feels quite natural to me.

detections_1 = sv.Detections(...)
detections_2 = sv.Detections(...)
detections_1[0] = detections_2[0]

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__setitam__

detections_2[0]
detections_2["class_name"]

__getitam__

detections_2[0]
detections_2[1:3]
detections_2[[1, 2, 3]]
detections_2[[False, True, False]]
detections_2["class_name"]

Copy link
Contributor

@LinasKo LinasKo May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_set_at_index was not required. I removed it entirely, and did not add any logic to __setitem__.

supervision/detection/core.py Outdated Show resolved Hide resolved
supervision/detection/core.py Outdated Show resolved Hide resolved
@@ -274,6 +275,81 @@ def box_non_max_suppression(
return keep[sort_index.argsort()]


def box_non_max_merge(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one seems like a helper function that does not need to be exposed in the API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed to _box_non_max_merge_all, removed from init.py.

Note: you'll see box_non_max_merge which is the prior batch function. Now it's the main one.

supervision/detection/core.py Outdated Show resolved Hide resolved
supervision/__init__.py Show resolved Hide resolved
supervision/detection/utils.py Outdated Show resolved Hide resolved
supervision/detection/utils.py Outdated Show resolved Hide resolved
Linas Kondrackis and others added 3 commits May 23, 2024 16:01
@LinasKo
Copy link
Contributor

LinasKo commented May 23, 2024

@SkalskiP, Here's some updates. I've addressed every comment you had, with the major changes being as follows:

  1. Removed _set_at_index, without any logic changes to __setitem__ - I found that we don't need it.
  2. I made an attempt to improve the naming:
    • box_non_max_merge -> _box_non_max_merge_all (open to name ideas)
    • box_non_max_merge_batch -> box_non_max_merge. Now it's the main method, exactly like box_non_max_suppression
    • merge_object_detection_pair -> _merge_inner_detection_object_pair
    • new method: _merge_inner_detections_objects
  3. Simplified the with_nmm function, albeit at the cost of 2 helper methods. Now it's similar to with_nms.
  4. Modified the methods to use List[List[int]] as the core datatype. It does look simpler now.

Unexpected changes:

  1. I found that doing the iou check during the final merging was important. In rare cases, it would prevent unintended merges. Let's chat if you want a deeper explanation.

Same colab as before, verified to work as the original author intended: https://colab.research.google.com/drive/1v0MPlG1tQctX5-koh0l6h1NcB6eWJ_YY#scrollTo=hObeS7Dg8_AP

Let me know what you think.

supervision/detection/utils.py Outdated Show resolved Hide resolved
supervision/detection/core.py Outdated Show resolved Hide resolved
Raises:
ValueError: If one field is None and the other is not, for any of the fields.
"""
attributes = ["mask", "confidence", "class_id", "tracker_id"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we try to get that list automatically?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried it, it's cumbersome, I'll add the code + tests in a separate PR and we can choose whether to keep it.

supervision/detection/core.py Outdated Show resolved Hide resolved
supervision/detection/core.py Outdated Show resolved Hide resolved
supervision/detection/core.py Outdated Show resolved Hide resolved
supervision/detection/core.py Outdated Show resolved Hide resolved

result = []
for merge_group in merge_groups:
unmerged_detections = [self[i] for i in merge_group]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we don't need that list comprehension, just use detections[indexes].

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My explanation was wrong.

We're doing this not to copy the result (that's in another case), but to create a list of single-object detections. [Detections, Detections, Detections, ...].

I believe this is the most concise way.

@LinasKo
Copy link
Contributor

LinasKo commented May 27, 2024

Hi @SkalskiP,

I've tidied this up - I believe it can now be merged.

@SkalskiP SkalskiP merged commit a0d0d45 into roboflow:develop May 27, 2024
9 checks passed
@SkalskiP
Copy link
Collaborator

Thanks a lot, @LinasKo and @mario-dg! 🙏🏻 It's merged!

@mario-dg
Copy link
Contributor Author

That's awesome! Didn't image that this PR would turn out as big as it got. Thanks guys!🚀

@mario-dg mario-dg deleted the add_nmm_to_detections branch May 28, 2024 05:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants