This message was deleted Pulumi Community #kubernetes

# kubernetes

sparse-intern-71089

03/29/2021, 9:29 PM

This message was deleted.

billowy-army-68599

03/29/2021, 9:38 PM

Could you open an issue for this? Ideally with the code you’re using to repro.

stocky-student-96739

03/29/2021, 9:38 PM

Sure.

stocky-student-96739

03/29/2021, 9:50 PM

https://github.com/pulumi/pulumi-kubernetes/issues/1512 @billowy-army-68599 🙏

stocky-student-96739

03/29/2021, 9:50 PM

Of course I butchered the formatting.

stocky-student-96739

03/29/2021, 9:51 PM

There we go.

gorgeous-egg-16927

03/29/2021, 10:56 PM

I took a look at the info you provided and nothing is jumping out at me. I did confirm that I was able to roll forward a Deployment on a local v1.19.6 cluster, so it may have something to do with the particular configuration you’re using. If you can, it would be helpful to see the actual Deployment spec that is getting sent to the cluster. https://www.pulumi.com/docs/reference/pkg/kubernetes/apps/v1/deployment/#deployment documents the conditions we’re checking to determine readiness, so you might check whether any of those could be the issue.
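For readers following along, here is a minimal sketch of how the readiness conditions referred to above could be checked by hand against a Deployment object, for example one parsed from `kubectl get deployment <name> -o json`. It is a simplified approximation, not Pulumi's actual await logic: the function and helper names are hypothetical, and the `observedGeneration` comparison is an added assumption rather than something stated in this thread.

```python
from typing import Any, Dict, List


def _condition(conditions: List[Dict[str, Any]], cond_type: str) -> Dict[str, Any]:
    """Return the first status condition of the given type, or an empty dict."""
    for cond in conditions:
        if cond.get("type") == cond_type:
            return cond
    return {}


def deployment_looks_ready(deployment: Dict[str, Any]) -> bool:
    """Approximate readiness check on a Deployment represented as a dict.

    Hypothetical helper: it mirrors the general shape of the documented
    conditions, not the provider's real implementation.
    """
    meta = deployment.get("metadata", {})
    status = deployment.get("status", {})
    conditions = status.get("conditions", [])
    generation = meta.get("generation", 0)

    # Assumption: the controller must have observed the latest spec.
    if status.get("observedGeneration", 0) < generation:
        return False

    # An "Available" condition with status "True" is required.
    if _condition(conditions, "Available").get("status") != "True":
        return False

    # After the first generation, a rollout must have completed.
    if generation > 1:
        progressing = _condition(conditions, "Progressing")
        if progressing.get("status") != "True":
            return False
        if progressing.get("reason") != "NewReplicaSetAvailable":
            return False
    return True
```

Running this against the live object during a rollout shows which condition, if any, is still unmet at the moment the update times out.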

stocky-student-96739

03/30/2021, 2:19 AM

Here ya go @gorgeous-egg-16927. Thanks for taking a look, let me know if you need more data.

👍 1

stocky-student-96739

03/30/2021, 3:02 PM

Looks like all the conditions mentioned in the document are fulfilled on the Deployment object during the rollout.

stocky-student-96739

03/30/2021, 3:38 PM

It’s worth noting that:
• This is a multi-tenant cluster
• Other processes leverage the exact same code via CI to deploy to different namespaces
• This namespace is definitely set up the same as the others
• Other processes deploy fine

stocky-student-96739

03/30/2021, 3:40 PM

Checked Kube API server logs in CloudWatch logging groups, found this:

I0330 15:01:42.143571       1 deployment_controller.go:490] "Error syncing deployment" deployment="my-application/web-rov2l78t" err="Operation cannot be fulfilled on deployments.apps \"web-rov2l78t\": the object has been modified; please apply your changes to the latest version and try again"
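The "object has been modified" message above is what Kubernetes reports when a write carries a stale `resourceVersion` (optimistic concurrency). It is often harmless on its own, because the writer normally re-reads the object and tries again. The sketch below illustrates that pattern with a hypothetical in-memory store; `FakeStore`, `ConflictError`, and `update_with_retry` are invented for illustration and are not Kubernetes or Pulumi APIs.

```python
from typing import Callable, Dict


class ConflictError(Exception):
    """Raised when an update carries a stale resourceVersion."""


class FakeStore:
    """Hypothetical in-memory stand-in for the API server's storage."""

    def __init__(self) -> None:
        self.obj: Dict[str, int] = {"resourceVersion": 1, "replicas": 1}

    def get(self) -> Dict[str, int]:
        # Return a copy so callers cannot mutate the stored object directly.
        return dict(self.obj)

    def update(self, new_obj: Dict[str, int]) -> None:
        # Reject writes based on an outdated read.
        if new_obj["resourceVersion"] != self.obj["resourceVersion"]:
            raise ConflictError("the object has been modified")
        stored = dict(new_obj)
        stored["resourceVersion"] = self.obj["resourceVersion"] + 1
        self.obj = stored


def update_with_retry(
    store: FakeStore,
    mutate: Callable[[Dict[str, int]], None],
    attempts: int = 3,
) -> int:
    """Re-read and re-apply `mutate` until the write succeeds.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, attempts + 1):
        current = store.get()   # always start from the latest version
        mutate(current)
        try:
            store.update(current)
            return attempt
        except ConflictError:
            continue            # someone else wrote first; try again
    raise ConflictError("gave up after %d attempts" % attempts)
```

If a log line like this repeats without the rollout ever settling, that would suggest two writers are repeatedly overwriting each other, which is worth checking in a multi-tenant cluster.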

stocky-student-96739

03/30/2021, 3:52 PM

This happened after the pods in the RS were rolled over

stocky-student-96739

03/31/2021, 1:48 PM

@gorgeous-egg-16927 Any updates on this? It’s blocking us from retiring some aging infrastructure.

gorgeous-egg-16927

03/31/2021, 3:33 PM

@billowy-army-68599 was troubleshooting this yesterday, and I believe the problem was related to the `kubectl.kubernetes.io/restartedAt` annotation. I haven’t tested a workaround, but I expect some combination of removing that annotation and using the `skipAwait` annotation (https://www.pulumi.com/blog/improving-kubernetes-management-with-pulumis-await-logic/#new-annotations-to-customize-kubernetes-await-logic) would get you unblocked for now.
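As a rough illustration of the suggested workaround, the sketch below builds the metadata annotations described in the linked blog post: `pulumi.com/skipAwait` and `pulumi.com/timeoutSeconds`. The helper name `await_annotations` and the resource name and namespace are placeholders, and the block deliberately avoids importing the Pulumi SDK so it stays self-contained.

```python
from typing import Dict, Optional

# Annotation keys described in the Pulumi await-logic blog post.
SKIP_AWAIT = "pulumi.com/skipAwait"
TIMEOUT_SECONDS = "pulumi.com/timeoutSeconds"


def await_annotations(
    skip_await: bool = False,
    timeout_seconds: Optional[int] = None,
) -> Dict[str, str]:
    """Build annotations that adjust the provider's await behavior.

    Hypothetical helper; annotation values must be strings.
    """
    annotations: Dict[str, str] = {}
    if skip_await:
        annotations[SKIP_AWAIT] = "true"
    if timeout_seconds is not None:
        annotations[TIMEOUT_SECONDS] = str(timeout_seconds)
    return annotations


# Placeholder Deployment metadata; name and namespace are illustrative only.
metadata = {
    "name": "web",
    "namespace": "my-application",
    "annotations": await_annotations(skip_await=True),
}
```

In a Pulumi program, a dict like `metadata` would be supplied as the Deployment's `metadata` argument. Skipping the await means the update returns without confirming the rollout, so a longer `timeoutSeconds` may be the gentler option when the rollout is merely slow.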

stocky-student-96739

03/31/2021, 4:32 PM

Thanks @gorgeous-egg-16927, will report status later today.

faint-table-42725

04/01/2021, 12:01 AM

Were you able to get unblocked w/ the suggestion above @stocky-student-96739?

stocky-student-96739

04/01/2021, 12:59 AM

@billowy-army-68599 @gorgeous-egg-16927 So I actually couldn’t find an annotation matching `kubectl.kubernetes.io/restartedAt` on the Deployment, ReplicaSet, or Pods. Adding the `skipAwait` annotation got us past the part where we were timing out, but I don’t feel it’s a proper solution, since it just fires and forgets and isn’t concerned with the ultimate state of the deployment.

stocky-student-96739

04/01/2021, 12:59 AM

@faint-table-42725 ^^

able-afternoon-73080

05/12/2021, 2:08 PM

Hi all, I am seeing the same issue, also with EKS clusters. It occurs consistently across 4 different clusters (so it’s not isolated to a particular cluster). I commented on the GitHub issue before coming here, but was wondering if any progress has been made on this?

stocky-student-96739

07/01/2021, 3:44 PM

I saw there were a couple of PRs to try to fix this: https://github.com/pulumi/pulumi-kubernetes/pull/1596 https://github.com/pulumi/pulumi-kubernetes/issues/1628 I’ve tried 3.5.0 and 3.4.1 and both exhibit the same behavior as before.

sparse-park-68967

07/07/2021, 10:59 PM

@stocky-student-96739 sorry, late to this. Could you add any additional specifics to https://github.com/pulumi/pulumi-kubernetes/issues/1628? I have tried reproing this, but after https://github.com/pulumi/pulumi-kubernetes/pull/1596 it’s not so easy for me to do so. If you are able to repro with 3.5.0, a dump of debug logs (e.g. `pulumi --logflow --verbose=9 --debug --logtostderr up --yes >& /tmp/logs`) would be very useful. Happy to set up time with you if you are concerned about sharing the detailed logs.

stocky-student-96739

07/07/2021, 11:01 PM

@sparse-park-68967 Thanks, will do that and post results here.

🙏 1

sparse-park-68967

07/08/2021, 8:20 PM

@stocky-student-96739 just checking in to see if you had a chance to repro with additional logs?

stocky-student-96739

07/08/2021, 8:25 PM

I haven’t yet, I’ll try to get that done this week.

stocky-student-96739

07/08/2021, 8:25 PM

Thanks for following up

sparse-park-68967

07/08/2021, 8:44 PM

sure thing!
