feat: scale disruption cost by the node utilization #2028
Fixes #N/A
Description
Scales the disruption cost of nodes by their utilization of pod resources. A node with 1 pod and 99% utilization should have a higher disruption cost than a node with 1 pod and 10% utilization.
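The intent can be sketched as below. This is a hypothetical simplification, not the PR's actual diff: `reschedulingCost` and `disruptionCost` are illustrative helper names, and the `1 + utilization` multiplier is just one plausible way to make a fuller node more expensive to disrupt.

```go
package main

import "fmt"

// reschedulingCost approximates the existing per-pod cost: roughly one
// unit per pod (in Karpenter it is also scaled by pod priority class).
func reschedulingCost(podCount int) float64 {
	return float64(podCount)
}

// disruptionCost scales the base rescheduling cost by the node's
// resource utilization (0.0–1.0), so a nearly full node costs more to
// disrupt than an underutilized node with the same number of pods.
func disruptionCost(podCount int, utilization float64) float64 {
	return reschedulingCost(podCount) * (1.0 + utilization)
}

func main() {
	fmt.Printf("1 pod, 99%% utilized: %.2f\n", disruptionCost(1, 0.99))
	fmt.Printf("1 pod, 10%% utilized: %.2f\n", disruptionCost(1, 0.10))
}
```

With this scaling, the 99%-utilized node gets roughly twice the disruption cost of the 10%-utilized one, so consolidation prefers to reclaim the emptier node first.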
How was this change tested?
We've been looking at improving resource utilization in our clusters, and noticed that Karpenter tends to prefer consolidating underutilized nodes with fewer pods rather than nodes with the most wasted resources. This can happen because `disruptionutils.ReschedulingCost(ctx, pods)` produces a value that is roughly equivalent to the pod count (scaled by pod priority class). As a result, nodes with fewer pods and higher utilization become candidates for consolidation, while underutilized nodes with many smaller pods do not.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
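The mis-ordering described above can be shown with a small sketch. The `node` struct and helper here are hypothetical simplifications, not Karpenter's real types; `reschedulingCost` stands in for the roughly-per-pod behavior of `disruptionutils.ReschedulingCost`.

```go
package main

import "fmt"

// node is a simplified view of a consolidation candidate.
type node struct {
	name        string
	podCount    int
	utilization float64 // fraction of allocatable resources requested
}

// reschedulingCost mimics a cost that is roughly the pod count,
// ignoring how full the node actually is.
func reschedulingCost(n node) float64 {
	return float64(n.podCount)
}

func main() {
	busy := node{"busy", 1, 0.99} // one large pod, nearly full
	idle := node{"idle", 10, 0.10} // many small pods, mostly wasted

	// With a pure pod-count cost, the nearly full node looks like the
	// cheaper consolidation candidate, even though consolidating the
	// idle node would reclaim far more wasted capacity.
	fmt.Println(reschedulingCost(busy) < reschedulingCost(idle)) // true
}
```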