Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Priorityqueue: Yet another queue_depth metric fix #3085

Merged
merged 1 commit into from
Jan 23, 2025

Conversation

alvaroaleman
Copy link
Member

Inside the priorityqueues spin we call the metrics add if an item becomes ready so that the queue_depth metric gets incremented. To avoid doing this multiple times for the same item, we track the key in a map and remove it there when we hand the item out.

If an item gets added without RequeueAfter that is already on the queue but with a RequeueAfter we also call the metrics add - But if we already did that in spin we will count the item twice.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 22, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 22, 2025
@alvaroaleman
Copy link
Member Author

/cherrypick release-0.20

@k8s-infra-cherrypick-robot

@alvaroaleman: once the present PR merges, I will cherry-pick it on top of release-0.20 in a new PR and assign it to you.

In response to this:

/cherrypick release-0.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

pkg/controller/priorityqueue/metrics.go Show resolved Hide resolved
@@ -168,7 +168,7 @@ func (w *priorityqueue[T]) AddWithOpts(o AddOpts, items ...T) {
}

if item.ReadyAt != nil && (readyAt == nil || readyAt.Before(*item.ReadyAt)) {
if readyAt == nil {
if readyAt == nil && !w.becameReady.Has(key) {
w.metrics.add(key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be sure.

Do we also have to add it to becameReady here?

w.becameReady.Insert(item.Key)

I'm not sure if we otherwise still have another edge case where we count twice

Should we maybe simply always add it to becameReady when we call metrics.add?

If I see correctly we always remove it from there when we get the item out of the queue, so it should be fine to always add it?

(might also make sense to rename becameReady to something that expresses that an item is currently counted in the queue depth or something)

Copy link
Member Author

@alvaroaleman alvaroaleman Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason that isn't needed is that in spin we only call add if the item has a non-nil readyAt wheras here we only call it here if we are going to set ReadyAt to nil which is how mutual exclusivity this way round is ensured.

I've extended the two last tests that validate that we do call metrics.add here to do a queue.get at the end which ensures the codepath that potentially calls add again in spin gets execercised so that we validate the case you described.

I do agree the current name isn't great but I couldn't come up with a better one - calledAdd would be confusing because this is only relevant for items that have a RequeueAfter and because we remove from it when we do the metrics.get. If you have ideas for a better name here, I am all ears :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also no better ideas for the name :)

Inside the priorityqueues `spin` we call the metrics `add` if an item
becomes ready so that the `queue_depth` metric gets incremented. To
avoid doing this multiple times for the same item, we track the key in a
map and remove it there when we hand the item out.

If an item gets added without `RequeueAfter` that is already on the
queue but with a `RequeueAfter` we also call the metrics `add` - But if
we already did that in `spin` we will count the item twice.
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 23, 2025
@sbueringer
Copy link
Member

Thank you!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 23, 2025
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 81504124b34419388ffa46f2d7ac946a97c79085

@k8s-ci-robot k8s-ci-robot merged commit 91e102f into kubernetes-sigs:main Jan 23, 2025
10 checks passed
@k8s-infra-cherrypick-robot

@alvaroaleman: #3085 failed to apply on top of branch "release-0.20":

Applying: bug: Priorityqueue: Yet another queue_depth metric fix
Using index info to reconstruct a base tree...
M	pkg/controller/priorityqueue/priorityqueue.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/controller/priorityqueue/priorityqueue.go
CONFLICT (content): Merge conflict in pkg/controller/priorityqueue/priorityqueue.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0001 bug: Priorityqueue: Yet another queue_depth metric fix

In response to this:

/cherrypick release-0.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@sbueringer
Copy link
Member

Let's try again after the other cherry-pick is merged

@sbueringer
Copy link
Member

/cherrypick release-0.20

@k8s-infra-cherrypick-robot

@sbueringer: new pull request created: #3089

In response to this:

/cherrypick release-0.20

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants