Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not create the launcher job if the job starts suspended #670

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

GonzaloSaez
Copy link
Contributor

@GonzaloSaez GonzaloSaez commented Oct 31, 2024

When the MPIJob starts suspended, we were creating the launcher job no matter the initial suspended state. This causes issues with kueue, since it will suspend the MPIJob but it will create a job with the wrong NodeSelector coming from the kueue flavour. I think avoiding creating the launcher in this scenario is the right thing to do but I'm not sure if others have different thoughts.

Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign alculquicondor for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Comment on lines +661 to +663
// If the job is suspended, the list of worker pods will be incorrect. We also do
// not want to start the launcher job if the MPIJob starts suspended.
if launcher == nil && !isMPIJobSuspended(mpiJob) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This opens the question on what should be done when unsuspending the launcher job in case kueue has decided to change the NodeSelector? Should we instead recreate the job since NodeSelector is immutable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this question does not have a straigtforward answer, I can see at least two approaches possible:

  1. as in JobSet- update the NodeSelector field in pod template when resuming the Job
  2. Recreate the launcher/worker Jobs , this can be probably achieved easily by deleting the jobs on suspending the MPIJob

I'm ok with any of those that is simpler to implement. Any optinion @tenzen-y ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case I think it is safe to decouple the fixes.

@mimowo
Copy link
Contributor

mimowo commented Nov 4, 2024

@GonzaloSaez please fix the DCO

@GonzaloSaez GonzaloSaez force-pushed the g/fix_job_launch_wait_for_pods_ready branch from 802bd49 to 804cfd8 Compare November 5, 2024 06:52
@mimowo
Copy link
Contributor

mimowo commented Nov 8, 2024

For context, linking it back to the related Kueue issue: kubernetes-sigs/kueue#3400 and the slack discussion https://kubernetes.slack.com/archives/C032ZE66A2X/p1730369507818399

@GonzaloSaez GonzaloSaez force-pushed the g/fix_job_launch_wait_for_pods_ready branch from 804cfd8 to c1ea13d Compare January 10, 2025 22:58
@GonzaloSaez GonzaloSaez force-pushed the g/fix_job_launch_wait_for_pods_ready branch from c1ea13d to 50abbdf Compare January 10, 2025 22:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants