CA DRA: support DRA AdminAccess #7685

Open
towca opened this issue Jan 9, 2025 · 0 comments
Labels
area/cluster-autoscaler area/core-autoscaler Denotes an issue that is related to the core autoscaler and is not specific to any provider. kind/feature Categorizes issue or PR as related to a new feature. wg/device-management Categorizes an issue or PR as relevant to WG Device Management.

Comments

Collaborator

towca commented Jan 9, 2025

Which component are you using?:

/area cluster-autoscaler
/area core-autoscaler
/wg device-management

Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:

DRA has an AdminAccess feature, guarded by a separate feature gate (KEP). A ResourceClaim can request admin access to one or more Devices, meaning that the claim is only used to monitor or otherwise manage the devices, not to actually consume them. Such claims can be allocated devices that are already allocated to another ResourceClaim.

This has implications for some of the DRA logic in CA (e.g. such claims shouldn't be counted when computing utilization), but the current MVP implementation doesn't handle it.
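
To make the allocation semantics above concrete, here is a minimal sketch in Go. The `deviceID` and `allocation` types and the `canAllocate` helper are hypothetical simplifications made up for illustration; they are not the real `resource.k8s.io` API types or CA code. The sketch also assumes that a device held only by admin-access claims is still free for regular use, which is how the description above reads.

```go
package main

// deviceID identifies a device. Hypothetical simplified type for this sketch.
type deviceID struct {
	Driver, Pool, Device string
}

// allocation is a simplified stand-in for one allocated device in a ResourceClaim.
type allocation struct {
	Device      deviceID
	AdminAccess bool // true if the claim only monitors/manages the device
}

// canAllocate reports whether a new request for dev could be satisfied,
// given the allocations that already exist.
func canAllocate(dev deviceID, adminAccess bool, existing []allocation) bool {
	// Admin-access requests don't reserve the device, so they never conflict.
	if adminAccess {
		return true
	}
	// A regular request conflicts only with another regular allocation.
	for _, a := range existing {
		if a.Device == dev && !a.AdminAccess {
			return false
		}
	}
	return true
}
```

With one regular allocation on a device, `canAllocate` would accept a further admin-access request for it and reject a further regular request.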

Describe the solution you'd like.:

  • Identify all parts of the DRA support implementation in CA that need to account for the new assumption that some ResourceClaims don't actually reserve their allocated Devices. At a minimum, this includes:
    • Calculating Node utilization for scale-down.
    • Sanitizing ResourceClaims. We could potentially simplify sanitization for claims with AdminAccess, as sanitizing their allocations shouldn't matter for scheduling.
  • Take the new assumption into account in all of the identified places.
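
As a rough illustration of the utilization point in the list above, the sketch below counts a device as used only when at least one non-admin allocation references it. The `claimAllocation` type and `deviceUtilization` function are hypothetical names for this example and don't correspond to the actual CA code; the real calculation may well be structured differently (e.g. per driver and pool).

```go
package main

// claimAllocation is a simplified, hypothetical view of one allocated device.
type claimAllocation struct {
	Device      string // device name, assumed unique within the Node
	AdminAccess bool   // true if the allocation doesn't reserve the device
}

// deviceUtilization returns the fraction of nodeDevices that are reserved
// by at least one non-admin allocation.
func deviceUtilization(nodeDevices []string, allocs []claimAllocation) float64 {
	if len(nodeDevices) == 0 {
		return 0
	}
	// Collect devices that are actually reserved, skipping admin access.
	reserved := make(map[string]bool)
	for _, a := range allocs {
		if !a.AdminAccess {
			reserved[a.Device] = true
		}
	}
	used := 0
	for _, d := range nodeDevices {
		if reserved[d] {
			used++
		}
	}
	return float64(used) / float64(len(nodeDevices))
}
```

Under this sketch, a Node whose devices are referenced only by admin-access claims would report zero device utilization and so wouldn't be kept from scale-down on that basis alone.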

Additional context.:

This is a part of Dynamic Resource Allocation (DRA) support in Cluster Autoscaler. An MVP of the support was implemented in #7530 (with the whole implementation tracked in kubernetes/kubernetes#118612). There are a number of post-MVP follow-ups to be addressed before DRA autoscaling is ready for production use - this is one of them.

@towca towca added the kind/feature Categorizes issue or PR as related to a new feature. label Jan 9, 2025
@k8s-ci-robot k8s-ci-robot added area/cluster-autoscaler area/core-autoscaler Denotes an issue that is related to the core autoscaler and is not specific to any provider. wg/device-management Categorizes an issue or PR as relevant to WG Device Management. labels Jan 9, 2025