Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve inventory target name resolution #3291

Merged
merged 2 commits into from
May 22, 2024

Conversation

seanmil
Copy link
Contributor

@seanmil seanmil commented Apr 2, 2024

Update the target string name resolution behavior with two significant improvements:

  • Do not perform wildcard lookups unless a wildcard character is present in the string, instead only performing exact-name matches.
  • When performing a wildcard lookup, use fnmatch for improved speed and functionality.

@seanmil seanmil requested a review from a team as a code owner April 2, 2024 16:37
@seanmil
Copy link
Contributor Author

seanmil commented Apr 2, 2024

The documentation for the plan functions get_target and get_targets do not mention anything about wildcard support, but the Bolt documentation at https://www.puppet.com/docs/bolt/latest/running_bolt_commands#specify-targets-using-glob-matching mentions at the bottom that "Extended glob matching is only supported when specifying targets from the command-line interface. Plans do not support extended glob matching." - which seems to imply that basic glob matching should be supported.

@donoghuc
Copy link
Contributor

donoghuc commented Apr 2, 2024

I'll have to debug the tests (i'll do this in a separate PR). Assuming all the tests pass I think this is a great improvement. i agree with your assessment and interpretation of the intent for matching. Thanks!

@seanmil seanmil force-pushed the get_target_performance branch from 95742a8 to ebbcf8f Compare April 3, 2024 00:32
@seanmil seanmil changed the title Improve inventory target name resolution performance Improve inventory target name resolution Apr 3, 2024
@seanmil seanmil force-pushed the get_target_performance branch from ebbcf8f to 627c000 Compare April 3, 2024 00:52
@seanmil
Copy link
Contributor Author

seanmil commented Apr 3, 2024

I'll have to debug the tests (i'll do this in a separate PR). Assuming all the tests pass I think this is a great improvement. i agree with your assessment and interpretation of the intent for matching. Thanks!

I made one more improvement to do an exact-match lookup instead of the slower wildcard if there are no wildcard characters.

In my ~3,300 target string name get_targets call it went from around 30s to 7s with the regex-to-fnmatch change and then from 7s to 0.035s with the exact-match logic change.

@donoghuc
Copy link
Contributor

closing and re-opening to trigger tests.

@donoghuc donoghuc closed this Apr 16, 2024
@donoghuc donoghuc reopened this Apr 16, 2024
@donoghuc
Copy link
Contributor

Now that I got tests unblocked, you can see a couple cases we will need to address with this work.

Copy link
Contributor

@donoghuc donoghuc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Failing tests:

Failures:

  1) Bolt::Target when parsing userinfo strips brackets from ipv6 addresses in a uri
     Failure/Error: raise(WildcardError, target) if targets.empty?

     Bolt::Inventory::Inventory::WildcardError:
       Found 0 targets matching wildcard pattern ssh://[::1]:22
     # ./lib/bolt/inventory/inventory.rb:160:in `resolve_name'
     # ./lib/bolt/inventory/inventory.rb:180:in `block in expand_targets'
     # ./lib/bolt/inventory/inventory.rb:179:in `map'
     # ./lib/bolt/inventory/inventory.rb:179:in `expand_targets'
     # ./lib/bolt/inventory/inventory.rb:99:in `get_targets'
     # ./lib/bolt/inventory/inventory.rb:108:in `get_target'
     # ./spec/unit/target_spec.rb:177:in `block (3 levels) in <top (required)>'

  2) Bolt::Target when parsing userinfo returns set object with feature_set
     Failure/Error: raise(WildcardError, target) if targets.empty?

     Bolt::Inventory::Inventory::WildcardError:
       Found 0 targets matching wildcard pattern ssh://[::1]:22
     # ./lib/bolt/inventory/inventory.rb:160:in `resolve_name'
     # ./lib/bolt/inventory/inventory.rb:180:in `block in expand_targets'
     # ./lib/bolt/inventory/inventory.rb:179:in `map'
     # ./lib/bolt/inventory/inventory.rb:179:in `expand_targets'
     # ./lib/bolt/inventory/inventory.rb:99:in `get_targets'
     # ./lib/bolt/inventory/inventory.rb:108:in `get_target'
     # ./spec/unit/target_spec.rb:192:in `block (3 levels) in <top (required)>'

Finished in 18.04 seconds (files took 1.62 seconds to load)
[153](https://github.com/puppetlabs/bolt/actions/runs/8713085630/job/23900793508?pr=3291#step:7:154)6 examples, 2 failures

Failed examples:

rspec ./spec/unit/target_spec.rb:176 # Bolt::Target when parsing userinfo strips brackets from ipv6 addresses in a uri
rspec ./spec/unit/target_spec.rb:191 # Bolt::Target when parsing userinfo returns set object with feature_set

@seanmil seanmil force-pushed the get_target_performance branch 2 times, most recently from d6c5158 to 5a1dfa7 Compare April 17, 2024 16:50
@seanmil
Copy link
Contributor Author

seanmil commented Apr 17, 2024

Failed examples:

rspec ./spec/unit/target_spec.rb:176 # Bolt::Target when parsing userinfo strips brackets from ipv6 addresses in a uri
rspec ./spec/unit/target_spec.rb:191 # Bolt::Target when parsing userinfo returns set object with feature_set

@donoghuc I've fixed these with what I believe to be a correct fix (if it looks like a URI, don't try wildcard matching on it) but one seemingly unrelated test failed. Maybe it would work if the CI run was re-started.

@donoghuc
Copy link
Contributor

triggering a CI run by opening and then closing this PR.

@donoghuc donoghuc closed this May 20, 2024
@donoghuc donoghuc reopened this May 20, 2024
@donoghuc
Copy link
Contributor

Ah, so it looks like this actually exposed a small issue in our tests #3318

I think the fact that this now results in raising a WildcardError instead of letting #{target} through is an improvement. Thanks!

Copy link
Contributor

@mcdonaldseanp mcdonaldseanp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seanmil I just pushed the test fix from Cas to main, if you rebase this on that and we see a green run we'll merge.

seanmil added 2 commits May 21, 2024 19:21
When resolving a large number of targets from a plan the
match_wildcard? regex is compiled on the fly possibly
hundreds or thousands (for a large inventory) of times
per target name resolution.

Switching to File.fnmatch provides a significant speed
increase and seems to more accurately reflect the
functionality stated in the Bolt documentation with
regards to target matching.

!feature

* **Enable basic glob matching of target strings**

  Previously, only the '*' wildcard character would match
  when targets were resolved from areas other than the CLI
  (such as from within a Plan).  With this change wildcard
  matching is switched to use Ruby's fnmatch with basic
  glob matching enabled.  In addition to providing more
  wildcard matching options, it also provides a significant
  performance improvement over the prior implementation.
Performing wildcard target name resolution is fairly costly,
having to search through all group names, target names, and
target aliases, calling File.fnmatch on each one in order to
find matches.

In the case of the targets being non-wildcard strings much
faster exact matching can be performed resulting in a
significant performance improvement.

!feature

* **Optimize get_targets performance for exact-match cases**

  Attempt exact target name matches for strings passed to
  get_targets and only attempt the slower wildcard match if
  the string contains a valid glob wildcard character.
@seanmil seanmil force-pushed the get_target_performance branch from 5a1dfa7 to 4d6e563 Compare May 21, 2024 23:29
@seanmil
Copy link
Contributor Author

seanmil commented May 21, 2024

@seanmil I just pushed the test fix from Cas to main, if you rebase this on that and we see a green run we'll merge.

All green (finally!), thanks!

@donoghuc donoghuc merged commit b42739f into puppetlabs:main May 22, 2024
44 checks passed
@donoghuc
Copy link
Contributor

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants