Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Schema-aware pulumi-level detailed diff calculation in the SDKv2 bridge #2405

Merged
merged 170 commits into from
Oct 1, 2024

Conversation

VenelinMartinov
Copy link
Contributor

@VenelinMartinov VenelinMartinov commented Sep 6, 2024

This PR adds a new algorithm for calculating the detailed diff which acts on the pulumi property values for the SDKv2 bridge, comparing the planned state returned by Diff to the prior state. This is flagged under the existing DiffEqualDecisionOverride feature flag. The results look very promising so far - all the detailed diff integration tests pass and the issues previously reported are almost all fixed by this.

Why

The current detailed diff acts on the InstanceDiff structure which is internal to the plugin-sdk. This has a few shortcomings:

  • TF does not actually refer to this for the detailed diff, so it might point to diffs which are not present in TF.
  • It refers to TF attribute paths, which are tricky to translate back in some cases.
  • It does not compare the planned state with the prior state but compares the news vs olds - this misses properties added by TF planning.

Implementation

The new algorithm is under pkg/tfbridge/detailed_diff.go and used in pkg/tfbridge/provider.go:Diff only for the SDKv2 and only if the DiffEqualDecisionOverride is enabled.
The main entrypoint is makePulumiDetailedDiffV2 - which in turn calls makePropDiff on each property. That branches on the type of the property and we have a separate function responsible for calculating the detailed diff for each property type.
There's a few interesting bits here:

  • We always walk the full tree even when comparing against a nil property and simplify the diff after in simplifyDiff. This is in order to get replaces correct. More on that later.
  • When returning the diff to the engine we only return the simplest possible diff which explains the changes. So instead of prop: Update, prop.subprop: Add, we only return prop.subprop: Add. This seems to work much better in the engine and goes around some weird behaviour in the detailed diff display (see Detailed diff duplicated collection diffs #2234 and Detailed diff shows diff on __defaults when number of elements changes #2400). Moreover, the first can be inferred from the second, so there is no reason for the bridge to return the full tree if only a leaf was changed.
  • We can not correctly identify diffs between nil and empty collections because of how the TF SDKv2 works without additional work. This is studied in TestNilVsEmptyListProperty and TestNilVsEmptyMapProperty in pkg/cross-tests/diff_cross_test.go. This is probably fine for now and a full fix is not possible. We can make a partial fix for non-computed properties by inspecting the pulumi inputs, before the plan.
  • There's a bit of an edge case with Unknowns and Replaces - we might not have enough information to tell the user they'll get a replace because the property which causes the replaces is nested inside an unknown. There's not much to do here, except to choose which side to err on. The algorithm currently does not say there's a replace.

On Replaces

We do not short-circuit detailed diffs when comparing non-nil properties against nil ones. The reason for that is that a replace might be triggered by a ForceNew inside a nested property of a non-ForceNew property. We instead always walk the full tree even when comparing against a nil property. We then later do a simplification step for the detailed diff in simplifyDiff in order to reduce the diff to what the user expects to see.

For example:
This is a list of objects with two properties, one of which is ForceNew

schema = {
  "list_prop": {
     Type: List,
     Elem: {
         "prop": String
         "force_new_prop": StringWithForceNew
     }
  }
}

We are diffing an unspecified list against a list with a single element

olds = {}
news = {
"list_prop": [
   {
    "prop": "val",
    "force_new_prop" : "val"
  }
]

The user expects to see:

+ list_prop

or because of how collections work in TF SDKv2 (see #2233)

+ list_prop[0]

An element was added to the list. When calculating the detailed diff we can short-circuit the diff when comparing the two lists, as we can see they have different lengths. This would identify the correct element to be displayed to the user as the reason for the diff but would fail to identify the diff as a replace, since we never saw the ForceNew on the nested property force_new_prop of the list elements.

That is why, instead of short-circuiting the diff, we walk the full tree down and compare every property against a nil if it is not specified on the other side. We then to a simplification pass over the detailed diff, which respects any replaces triggered by nested properties and bubbles these up.

There is a full case study of the TF behaviour around replaces in pkg/cross-tests/diff_cross_test.go TestAttributeCollectionForceNew, TestBlockCollectionForceNew, TestBlockCollectionElementForceNew.

Testing

Unit tests for the detailed diff algorithm in pkg/tfbridge/detailed_diff_test.go - this tries to cover all meaningful permutations of schemas:

  • TestBasicDetailedDiff tests all the meaningful pairs between nil values, empty values, non-empty values and computed for each TF type.
  • TestDetailedDiffObject, TestDetailedDiffList, TestDetailedDiffMap, TestDetailedDiffSet covers the cases not covered above for object and collection types.
  • TestDetailedDiffTFForceNewPlain, TestDetailedDiffTFForceNewAttributeCollection, TestDetailedDiffTFForceNewBlockCollection, TestDetailedDiffTFForceNewElemBlockCollection, TestDetailedDiffTFForceNewObject cover ForceNew behaviour in all TF types.
  • TestDetailedDiffPulumiSchemaOverride covers pulumi schema overrides

Integration tests in pkg/tests/schema_pulumi_test.go, mostly TestDetailedDiffPlainTypes and TestUnknownBlocks. Note that most of the edge cases and issues previously discovered here are resolved by this PR.

Follow-up Work

Not done but will follow-up in separate PRs:

Related Issues

fixes:

does not fix:

@VenelinMartinov VenelinMartinov requested a review from iwahbe October 1, 2024 14:47
Copy link
Member

@iwahbe iwahbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@VenelinMartinov VenelinMartinov merged commit deb3c39 into master Oct 1, 2024
17 checks passed
@VenelinMartinov VenelinMartinov deleted the vvm/detailed_diff_v2 branch October 1, 2024 17:08
@pulumi-bot
Copy link
Contributor

This PR has been shipped in release v3.92.0.

VenelinMartinov added a commit that referenced this pull request Oct 29, 2024
This PR changes the logic in the detailed diff v2 calculation to
short-circuit on encountering nils and unknowns.

Previously, when we encountered a null or unknown in either news or olds
we would still recurse down and compare the non-nil values versus a nil
value in order to get any replacements which might be happening. We
would then simplify the diff later to trim these extra diffs but would
propagate the replacement to the right level.

We would now instead short-circuit on encountering a null or unknown and
walk the non-null value to find any properties which can trigger a
replacement. This makes it much easier to handle sets in
#2451 as recursing
into sets is not necessary, as they are only compared by hashes.


This change is not meant to change any behaviour and is tested in
#2405

Stacked on #2515
and #2516
VenelinMartinov added a commit that referenced this pull request Oct 30, 2024
This change adds improved TF set handling to the detailed diff v2. The
main challenge here is that pulumi does not have native sets, so set
types are represented as lists.

### Diffing sets using the hash ###

To correctly find the diff of two sets we calculate the hash of each
element ourselves and do the diffs based on that. What makes this
somewhat non-trivial is that due to MaxItemsOne flattening we can't just
hash the pulumi `PropertyValue`s given to us by the engine. Instead we
use `makeSingleTerraformInput` to transform the values using the schema.
We then use the hashes of the elements in the set to calculate the
diffs. This allows us to correctly account for shuffling and duplicates,
matching the terraform behaviour of sets.

When returning the element indices back to the engine, we need to return
them in terms of oldState and newInputs because the engine does not have
access to the plannedState (see
#2281). To do
that we keep the newInputs and match plannedState elements to their
newInputs index by the set hash. Note that this is not possible if the
set element contains any computed properties - these can change during
the planning process, so we do not attempt to match and print a warning
about possibly inaccurate diff results instead.


### Unknowns in sets ###
Note that the terraform planning process does not allow a set to contain
any unknowns, because that breaks the hashing. Because of that plan
should always return an unknown for a set which contains any unknowns.
This accounts for cases where resolving the unknown can result in
duplicate elements.

Unknown elements in sets - the whole set becomes unknown in the plan, so
the algorithm no longer works. Currently we return an update for the
whole set to the engine and it does the diff on its side.

### Testing ###
This PR also includes tests for the set behaviour, both unit tests for
the detailed diff algorithm and integration tests using pulumi programs
for:
- Single element additions, updates and removals
- Shuffling, also with additions, updates and removals
- Multi-element additions, updates and removals
- Unknowns

### Issues ###

Builds on #2405

Stacked on #2515,
#2516,
#2496 and
#2497

fixes #2200
fixes #2300
fixes #1904
fixes #186
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment