[BUG] The calculation method of daemonset UpdatedNumberScheduled value is inaccurate #1854

Open
zcz-1020 opened this issue Dec 10, 2024 · 0 comments
Labels: kind/bug (Something isn't working)


What happened:
updatedNumberScheduled is computed by sorting the daemon pods on each node and checking only the pod at index 0. If a node has two daemon pods (one created before the update and one after, both already scheduled), the pre-update pod is selected and the updated pod on that node is not counted. In this case, updatedNumberScheduled does not reflect the actual number of updated pods.

```go
func (dsc *ReconcileDaemonSet) updateDaemonSetStatus(ctx context.Context, ds *appsv1alpha1.DaemonSet, nodeList []*corev1.Node, hash string, updateObservedGen bool) error {
	now := dsc.failedPodsBackoff.Clock.Now()
	for _, node := range nodeList {
		// ...
		if shouldRun {
			desiredNumberScheduled++
			if scheduled {
				currentNumberScheduled++
				// Sort the daemon pods by creation time, so that the oldest is first.
				daemonPods := nodeToDaemonPods[node.Name]
				sort.Sort(podByCreationTimestampAndPhase(daemonPods))
				pod := daemonPods[0]

				generation, err := GetTemplateGeneration(ds)
				if err != nil {
					generation = nil
				}
				if util.IsPodUpdated(pod, hash, generation) {
					updatedNumberScheduled++
				}
			}
		} else {
			if scheduled {
				numberMisscheduled++
			}
		}
	}
	// ...
	return nil
}
```

What you expected to happen:
When calculating updatedNumberScheduled, iterate over all daemon pods on the node and count the node once if any of them is updated:

`for _, p := range daemonPods { if util.IsPodUpdated(p, hash, generation) { updatedNumberScheduled++; break } }`

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Kruise version: 1.7.2
  • Kubernetes version (use kubectl version):
  • Install details (e.g. helm install args):
  • Others:
zcz-1020 added the kind/bug label on Dec 10, 2024