# general
Sorry for vagueness here, but I am having an issue with pulumi and k8s, where replicasets fail to update. It seems like maybe pulumi is waiting for the "old" replicaset of a deployment to have the expected number of replicas, which of course never happens. Anything around this a known issue?
@millions-judge-24978 that should not be true. why do you think that?
in other words, what makes you think it’s waiting on old replicas
I wish I had more detail and something reproducible here, but occasionally I get into a state where
pulumi up
will fail, timing out because of an error like this:
```
error: Preview failed: 2 errors occurred:

    * Resource 'ops-view-kube-ops-view' was created but failed to initialize
    * Minimum number of Pods to consider the application live was not attained
```
when I do a
kubectl get replicasets
I will see that two of them exist for the given deployment: the new one, which is fine and has the expected number of replicas, and the old one with zero replicas.
The only way I've been able to fix it is to manually delete the old replicaset from the cluster, perform manual state surgery on pulumi to remove the deployment, and then do
pulumi refresh
@millions-judge-24978 we never check the old replica count, we are basically waiting for the deployment status to be updated with a condition that says the new replicaset is marked available.
if that does not happen, we fail.
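To make the check concrete, here's a minimal sketch of that kind of availability check. This is not the actual pulumi-kubernetes await code, just an illustration of reading a Deployment's `status.conditions` for `type: Available` (the field names are standard Kubernetes; the helper name is made up):

```python
# Hypothetical sketch of the readiness check described above: a Deployment
# counts as live once its status carries an Available=True condition.
# NOT the real pulumi-kubernetes implementation.

def deployment_is_available(deployment: dict) -> bool:
    """Return True if the Deployment reports an Available=True condition."""
    conditions = deployment.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Available" and c.get("status") == "True"
        for c in conditions
    )

# Shaped like `kubectl get deployment -o yaml` output (illustrative values).
healthy = {
    "status": {
        "conditions": [
            {"type": "Available", "status": "True",
             "reason": "MinimumReplicasAvailable"},
        ]
    }
}
# If the controller never writes the condition, the await never succeeds.
stuck = {"status": {"conditions": []}}

print(deployment_is_available(healthy))  # True
print(deployment_is_available(stuck))    # False
```

Note the check only inspects the Deployment's own status, not the old ReplicaSet's replica count.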
next time it happens
kubectl get -o yaml
on it and let’s take a look at the status field.
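For reference, a healthy Deployment's `status` field normally looks something like this (field names are standard Kubernetes; the replica counts and messages here are illustrative):

```yaml
# Illustrative status block from `kubectl get deployment <name> -o yaml`;
# the await logic described above is waiting on the Available condition.
status:
  replicas: 3
  readyReplicas: 3
  availableReplicas: 3
  conditions:
    - type: Available
      status: "True"
      reason: MinimumReplicasAvailable
      message: Deployment has minimum availability.
    - type: Progressing
      status: "True"
      reason: NewReplicaSetAvailable
```

If the `Available` condition is missing or stuck at `"False"` while the new ReplicaSet looks fine, that would point at the deployment controller not populating the status.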
Gotcha, hmm yeah I have no idea what is really happening. I'll grab that next time it occurs.
please do and send it directly to me
if you can, get both replicasets too
we have seen some issues where the deployment controller is not populating that status condition, which might be a bug upstream.
if so, we’ll try to drive a fix.
Yeah maybe a k8s bug actually
Maybe. 🙂 Maybe not.
Frustrating in any case. Especially since state surgery is not very friendly yet, and my json file is growing large haha
this is not good, we’ll get to the bottom of it for sure though.
Thanks! Will update at some point
the deployment stuff has been stable for a reasonably long time though, and these issues are only very recently popping up