-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
⚠ Fix custom defaulter: avoid deleting unknown fields (zero change patch) #2982
⚠ Fix custom defaulter: avoid deleting unknown fields (zero change patch) #2982
Conversation
/cc @alculquicondor |
5e7dd60
to
c30f09c
Compare
For context to c-r maintainers, we have been discussing what alternatives we have for a generic defaulter that doesn't drop unknown fields. Here is the thread for additional context kubernetes-sigs/kueue#3194 (comment) Taking @mimowo's words, we would be happy to first merge a solution in kueue and let it mature there before moving it here. But we would appreciate your input to the approaches. |
@alculquicondor @trasc @mimowo We (Stefan, Vince and I) had some discussion around #2932 vs this and we landed at this here being better but we aren't 100% sure if its compatible in all cases and currently don't have time to spend on that question, which is why we would like an opt-out for what is suggested here (but on by default). Does that sound reasonable to you all? |
Sure, I'll ping you when I have something in that direction. |
Thanks for considering the options! Do you have any particular guideline on how the opt-out mechanism could look like? Should we have a different method in the factory? Or should the method Other than that, I hope the overhead of the approach is not significant. Fortunately, a new encoding is coming to k/k CRDs https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/4222-cbor-serializer#motivation |
I think
it leave room to maybe add an additional option that removes the |
@alvaroaleman please have a look at my Also please let me know if we can do something about the
is a flase positive in my opinion, as you don't need to actually do something to old code in order to use the new version. |
72ac9bd
to
e652815
Compare
4a124c8
to
ab4e8c9
Compare
/lgtm |
LGTM label has been added. Git tree hash: dc8f633f86d5467de1824324a9004cf5c10d3d61
|
b5419bf
to
fb7b6cd
Compare
Rebased |
@trasc: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Thx! /lgtm |
LGTM label has been added. Git tree hash: 7302a762ff601db57984c8fa33072a6e68119c8f
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alvaroaleman, sbueringer, trasc The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@trasc @alculquicondor Hey folks, just an early heads up already. I bumped Cluster API to CR main branch and I'm now getting panics from the webhook: https://storage.googleapis.com/kubernetes-ci-logs/pr-logs/pull/kubernetes-sigs_cluster-api/11633/pull-cluster-api-e2e-blocking-main/1874860029216034816/artifacts/clusters/bootstrap/logs/capd-system/capd-controller-manager/capd-controller-manager-5b59b9b76-4kmz2/manager.log Looks like jsonpatch.Operation is not always hashable
|
Not sure what the most efficient way is to compare jsonpatch.Operations (the straightforward option would be to call the |
I don't think we need to compare the whole thing, just the combination of operation and path. Here is the type, for reference https://github.com/gomodules/jsonpatch/blob/e2d6410c0299d49b797482dc765a3b49ecc8a4b6/v3/jsonpatch.go#L17-L21 Should we revert for now? |
IIUC, the JSON Patch |
I think I've identified the problem. We're comparing all patches, which makes it possible to encounter a non-hashable value. Created a fix #3056. |
I think we're fine with the fix (#3056 should cover it)
Looks perfect, thank you! |
Quick feedback. The panic is fixed 🎉 I'm just hitting another issue in Cluster API. Not sure yet if that is something that we should care about in CR API field:
Through a combination of unexpected behavior, we're trying to set this field to null via one of our CAPI controllers:
Before this change: our defaulting/mutating webhook generated this patch operation:
=> The result was that the field was removed before CR validation After this change: this patch operation is dropped by dropSchemeRemovals
|
It looks like WAI, from the name of the defaulter option It's a bug in the CAPI controller that happened to be masked by the webhook implementation details. |
I agree. The behavior of the defaulter is exactly as expected. CAPI has a controller that has to work with unstructured and in some cases it sets the field (unintentionally) to nil instead of an empty string. I'm only wondering how many folks we break that are in a similar situation as CAPI. CAPI itself is not a problem, I can deal with this there by setting the option. I think I have to set it because we can't make a change like this within an API version in CAPI (there are many tools/products built on top of CAPI and they shouldn't suddenly start failing with issues like this). If I don't set the option the behavior of our API changes. Values that have been (unintentionally) valid before are suddenly invalid. |
We did make the change opt-out. I think it's reasonable to make it opt-in for a few releases, have a deprecation note and eventually remove it. |
Talked to Alvaro. Let's keep it as opt-out. We'll document it as "breaking change" in the release notes, which should increase the chances that folks notice the change (folks that don't read release notes also won't notice if we change the opt-in/out behavior over a few releases, it would just hit them a bit later). In general I would probably like to keep the option permanently. I don't want to force folks to fork the custom defaulter just because they want to keep the behavior of their API entirely stable. |
When creating a patch for the custom defaulter, don't return the removals generated by object decoding with the local scheme.
Otherwise, the patch will include a remove instruction for fields that the go type doesn't contain.
This could happen when building a webhook for a resource that belongs to another project (including k8s types).
This PR stops the defaulter from pruning the fields that are not recognized in the local scheme.
When doing this we are creating two patches one before and one after the custom defaulter call and only send back the remove operations that are not present in both patches.
The user can opt-out from this new behavior by specifying the
DefaulterRemoveUnknownFields
option when callingWithCustomDefaulter
.(This is a rework of #2931)