Open
Labels
kind/feature: Categorizes issue or PR as related to a new feature.
Description
What would you like to be added:
The possibility for GroupIndex not to be an integer, for deployments that don't need integer indices. Something like this:
NAME READY STATUS RESTARTS AGE
vllm-8hc4x 1/1 Running 0 2s
vllm-8hc4x-1 1/1 Running 0 2s
vllm-mk9w6 1/1 Running 0 2s
vllm-mk9w6-1 1/1 Running 0 2s
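To make the two naming schemes concrete, here is a minimal sketch of how the pod names could be derived. It is not LWS code: the function names are hypothetical, and the suffix alphabet is an assumption modeled on the kind of random suffix Kubernetes appends for generateName.

```python
import random

# Assumption: a consonant/digit alphabet like the one used for generated
# Kubernetes name suffixes; the actual choice would be up to LWS.
ALPHABET = "bcdfghjklmnpqrstvwxz2456789"


def group_pod_names(lws_name, group_id, size):
    """Leader is <name>-<group_id>; workers are <name>-<group_id>-<worker_index>."""
    leader = f"{lws_name}-{group_id}"
    return [leader] + [f"{leader}-{i}" for i in range(1, size)]


def integer_names(lws_name, replicas, size):
    """Current behavior as shown in this issue: the group id is the replica index."""
    names = []
    for group_index in range(replicas):
        names += group_pod_names(lws_name, str(group_index), size)
    return names


def random_names(lws_name, replicas, size, rng, length=5):
    """Proposed behavior (hypothetical): the group id is a unique random suffix."""
    ids = []
    while len(ids) < replicas:
        candidate = "".join(rng.choice(ALPHABET) for _ in range(length))
        if candidate not in ids:  # retry on the rare collision
            ids.append(candidate)
    names = []
    for group_id in ids:
        names += group_pod_names(lws_name, group_id, size)
    return names


if __name__ == "__main__":
    print(integer_names("vllm", replicas=2, size=2))
    print(random_names("vllm", replicas=2, size=2, rng=random.Random()))
```

With random ids, no group depends on holding a particular index, so a controller would be free to add and remove groups in any order.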
Why is this needed:
NAME READY STATUS RESTARTS AGE
vllm-0 1/1 Running 0 2s
vllm-0-1 1/1 Running 0 2s
vllm-1 1/1 Running 0 2s
vllm-1-1 1/1 Running 0 2s
The group indices are the first number after the name, i.e. the 0 in vllm-0-1 and the 1 in vllm-1-1 (it's obvious, but just in case: GroupIndex == ReplicaIndex).
The issue is that during a rollout, LWS always tries to keep the group indices consecutive, since it reconciles StatefulSets and those StatefulSets have consistent naming. So if I redeploy with maxSurge set to 2, we first get this:
NAME READY STATUS RESTARTS AGE
vllm-0 1/1 Running 0 20h
vllm-0-1 1/1 Running 0 20h
vllm-1 1/1 Running 0 20h
vllm-1-1 1/1 Running 0 20h
vllm-2 0/1 Pending 0 2s
vllm-2-1 0/1 Pending 0 2s
vllm-3 0/1 Pending 0 2s
vllm-3-1 0/1 Pending 0 2s
Then this:
NAME READY STATUS RESTARTS AGE
vllm-0 0/1 Pending 0 2s
vllm-0-1 0/1 Pending 0 2s
vllm-1 0/1 Pending 0 2s
vllm-1-1 0/1 Pending 0 2s
vllm-2 1/1 Running 0 2m
vllm-2-1 1/1 Running 0 2m
vllm-3 1/1 Running 0 2m
vllm-3-1 1/1 Running 0 2m
And finally this again:
NAME READY STATUS RESTARTS AGE
vllm-0 1/1 Running 0 20s
vllm-0-1 1/1 Running 0 20s
vllm-1 1/1 Running 0 20s
vllm-1-1 1/1 Running 0 20s
This slows down the rollout a lot. Do you have any thoughts on this @kerthcet? (cc @synthe102)
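The cost of the three stages above can be illustrated with a toy model. This is only a sketch of the behavior described in this issue, not the LWS controller's algorithm; it assumes maxSurge equals the replica count (as in the example), and both function names are hypothetical.

```python
def indexed_rollout(replicas, max_surge):
    """Stages seen with integer group indices (simplified, max_surge == replicas).

    Each stage maps a group index to a (template, status) pair.
    """
    assert max_surge == replicas, "toy model only covers the example in this issue"
    surge_ids = range(replicas, replicas + max_surge)
    # Stage 1: surge groups start with the new template at the next indices.
    stage1 = {i: ("old", "Running") for i in range(replicas)}
    stage1.update({i: ("new", "Pending") for i in surge_ids})
    # Stage 2: surge groups serve while the original indices are recreated.
    stage2 = {i: ("new", "Pending") for i in range(replicas)}
    stage2.update({i: ("new", "Running") for i in surge_ids})
    # Stage 3: surge groups are removed so indices are consecutive from 0 again.
    stage3 = {i: ("new", "Running") for i in range(replicas)}
    return [stage1, stage2, stage3]


def unordered_rollout(replicas, max_surge):
    """Hypothetical stages if group ids carried no ordering (the proposal)."""
    assert max_surge == replicas, "toy model only covers the example in this issue"
    # Stage 1: new groups start next to the old ones.
    stage1 = {f"old-{i}": ("old", "Running") for i in range(replicas)}
    stage1.update({f"new-{i}": ("new", "Pending") for i in range(max_surge)})
    # Stage 2: once the new groups are ready, the old ones are simply deleted.
    stage2 = {f"new-{i}": ("new", "Running") for i in range(max_surge)}
    return [stage1, stage2]


def group_startups(stages):
    """Count how many groups had to go through Pending during the rollout."""
    return sum(
        1
        for stage in stages
        for _template, status in stage.values()
        if status == "Pending"
    )


if __name__ == "__main__":
    print("indexed:", group_startups(indexed_rollout(2, 2)))
    print("unordered:", group_startups(unordered_rollout(2, 2)))
```

Under these assumptions, the indexed rollout starts every group twice (once as a surge group, once at its original index), while the unordered one starts each new group only once.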