Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Option to ignore existing pods' preferred inter-pod affinities if the incoming pod has no preferred inter-pod affinities #114393

Merged
merged 1 commit into from
Jan 14, 2023

Conversation

danielvegamyhre
Copy link
Member

@danielvegamyhre danielvegamyhre commented Dec 9, 2022

What type of PR is this?
/kind api-change
/sig scheduling
/sig api-machinery
/priority backlog

What this PR does / why we need it:
The purpose of these changes are to add an option for the scheduler to ignore existing pods` preferred inter-pod affinities if the incoming pod has no preferred inter-pod affinities.

Preferred inter-pod affinity/anti-affinity rules tend to be made in both directions (i.e. if pod type A has a preferred affinity for pod type B, then pod type B will usually have a preferred affinity for pod type A). If an incoming pod that needs to be scheduled has no preferred inter-pod affinity rules, it is most likely the case that no existing pods have no preferred inter-pod affinity rules relating to it.

Therefore, the linear scan through existing pods checking their preferred inter-pod affinities is usually a waste of time that the user can choose to optimize away by enabling this option in the InterPodAffinity plugin config args (at the cost of an occasional pod being scheduled non-optimally/violating existing pods affinity rules).

Specifically, this option will cause PreScore to exit early if this option is enabled and the incoming pod has no existing inter-pod affinity rules.

Which issue(s) this PR fixes:
Fixes #114267

Does this PR introduce a user-facing change?

Added new option to the InterPodAffinity scheduler plugin to ignore existing pods` preferred inter-pod affinities if the incoming pod has no preferred inter-pod affinities. This option can be used as an optimization for higher scheduling throughput (at the cost of an occasional pod being scheduled non-optimally/violating existing pods' preferred inter-pod affinities). To enable this scheduler option, set the InterPodAffinity scheduler plugin arg "ignorePreferredTermsOfExistingPods: true".

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Dec 9, 2022
@k8s-ci-robot
Copy link
Contributor

Welcome @danielvegamyhre!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 9, 2022
@k8s-ci-robot
Copy link
Contributor

Hi @danielvegamyhre. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Dec 9, 2022
@k8s-ci-robot k8s-ci-robot added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. and removed do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Dec 9, 2022
@k8s-triage-robot
Copy link

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

@ahg-g
Copy link
Member

ahg-g commented Dec 10, 2022

/assign
/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 10, 2022
@danielvegamyhre
Copy link
Member Author

danielvegamyhre commented Dec 28, 2022

@ahg-g any chance you could take a quick look at this when you have a moment? I would really appreciate it.

@danielvegamyhre
Copy link
Member Author

Thanks for the review @ahg-g I have addressed all your comments, this is ready for another look. The only outstanding issue is in the test case I added to options_test.go. For some reason, despite the test case YAML settingignorePreferredTermsOfExistingPods: true, the resulting scheduler config has it set as false. Any idea how this might be possible?

I have tried both bool and *bool data types (based on other tests I saw using `*bool), as well as read through the YAML -> KubeSchedulerConfiguration code looking for where this might be occurring, but no luck so far.

@ahg-g
Copy link
Member

ahg-g commented Jan 3, 2023

You need to update the auto-generated code that does the translation between the external and internal types. Run ./hack/update-codegen.sh to do that.

@k8s-ci-robot k8s-ci-robot added area/code-generation sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Jan 3, 2023
@@ -1186,6 +1186,7 @@ profiles:
- args:
apiVersion: kubescheduler.config.k8s.io/v1beta2
hardPodAffinityWeight: 5
ignorePreferredTermsOfExistingPods: false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worth adding a test to the scheme tests that sets it to true.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I added test cases for both scheme encoding/decoding tests when ignorePreferredTermsOfExistingPods is true.

cycleState.Write(preScoreStateKey, &preScoreState{
topologyScore: make(map[string]map[string]int64),
})
return nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alculquicondor @Huang-Wei are we going to allow returning Skip for PreScore extensions to indicate that Score shouldn't run for the same plugin?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that sounds like a better contract, compared to plumbing the placeholder and checking if it's empty in Score.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we should have support for Skip in Score as well, but the work didn't start yet.

@ahg-g
Copy link
Member

ahg-g commented Jan 3, 2023

/retitle "Option to ignore existing pods' inter-pod affinities if the incoming pod has no inter-pod affinities"

@k8s-ci-robot k8s-ci-robot added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jan 5, 2023
@liggitt liggitt added this to Backlog in API Reviews Jan 6, 2023
@cici37
Copy link
Contributor

cici37 commented Jan 11, 2023

/cc
Curious about the community requirement on adding field directly into stable API :)
Is that allowed because scheduler API is internal?

@ahg-g
Copy link
Member

ahg-g commented Jan 11, 2023

Curious about the community requirement on adding field directly into stable API :) Is that allowed because scheduler API is internal?

Adding fields is allowed, removing them requires a new version.

@alculquicondor
Copy link
Member

Maybe the question is why no KEP? Since the component config API is not stored, we don't need feature gates or other things that a KEP imposes.

btw

/remove-sig api-machinery

this has nothing to do.

@k8s-ci-robot k8s-ci-robot removed the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Jan 12, 2023
@liggitt liggitt self-assigned this Jan 12, 2023
@liggitt liggitt moved this from Backlog to Assigned in API Reviews Jan 12, 2023

// IgnorePreferredTermsOfExistingPods configures the scheduler to ignore existing pods' preferred affinity
// rules when scoring candidate nodes, unless the incoming pod has inter-pod affinities.
IgnorePreferredTermsOfExistingPods bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the "unless incoming pod..." aspect is not captured in the name... would we eventually want to also allow unconditionally ignoring preferred terms of existing pods? if so, it would be worth thinking about how we would name this field and that eventual field in ways that wouldn't be confusing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my understanding, unconditionally ignoring preferred terms of existing pods is not something we'll want to do in the future. However, @ahg-g has more knowledge about future plans in this area so I will defer to his recommendation here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I don't think we will want to do that

@liggitt
Copy link
Member

liggitt commented Jan 12, 2023

one question on the field name and future plans around this incoming/existing preferred term behavior

@liggitt liggitt moved this from Assigned to API review completed, 1.27 in API Reviews Jan 13, 2023
@liggitt
Copy link
Member

liggitt commented Jan 13, 2023

/approve
for API bit

scheduling reviewer has lgtm

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 13, 2023
@ahg-g
Copy link
Member

ahg-g commented Jan 13, 2023

Please squash the commits to one

@k8s-ci-robot k8s-ci-robot added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Jan 13, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danielvegamyhre, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ahg-g
Copy link
Member

ahg-g commented Jan 14, 2023

/lgtm

Thanks @danielvegamyhre

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 14, 2023
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 307acffe5710111b780cc246c25bce08a81b89b4

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jan 14, 2023
@k8s-ci-robot k8s-ci-robot merged commit 3a8e2e3 into kubernetes:master Jan 14, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.27 milestone Jan 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-review Categorizes an issue or PR as actively needing an API review. approved Indicates a PR has been approved by an approver from all required OWNERS files. area/code-generation cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/backlog Higher priority than priority/awaiting-more-evidence. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: API review completed, 1.27
Development

Successfully merging this pull request may close these issues.

An option to disable Preferred Inter-pod affinity of existing pods
8 participants