# aws
lately (as in the last week or so) pulumi has been having issues deleting subnets and internet gateways (see image for an example). This has been observed across several pulumi projects (which used to work with pulumi up / destroy without any issue, and some of which haven’t changed in the meantime), and for multiple colleagues. Is anyone else having a similar issue? Any action steps to suggest?
In the example I showed, it’s also interesting that pulumi was able to destroy one of the subnets (in the “1c” availability zone), but not the others 🤔
this looks like a bug, can you open an issue with the code you're using?
sure 👍 . I’ll drop a link here when it’s done
Before wasting too many Pulumi developer hours, tho, is there anything you’d suggest checking on the AWS side to ensure this isn’t user error? (FYI I am an admin on the AWS account in question, so it shouldn’t be a permissions issue)
it could be a dependency like a security group or something in the subnet, if you try to delete the subnet manually from the console, does it tell you of any issues?
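(A quick diagnostic sketch along those lines, assuming you know the subnet ID — the ID below is a placeholder. Network interfaces left in the subnet are the usual blockers, and their `Description` field often reveals what created them, e.g. an ELB:)

```shell
# List network interfaces still attached to a subnet (placeholder subnet ID).
# The Description column usually shows the owning service, e.g. "ELB a1b2c3...".
aws ec2 describe-network-interfaces \
  --filters Name=subnet-id,Values=subnet-0123456789abcdef0 \
  --query 'NetworkInterfaces[].{Id:NetworkInterfaceId,Desc:Description,Status:Status}' \
  --output table
```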
@agreeable-ram-97887 are you on a M1 Macbook by chance? If so, can you try the env var shown at ?
@billowy-army-68599, I’ve tried to delete one of the remaining subnets and it seems that I cannot because a NetworkInterface is still attached. I then tried to delete the NetworkInterface, but it seems something else is blocking that one as well (the AWS Console wasn’t too helpful, tho 😕 ). As far as I’m aware, no resources have been manually attached to these; so it has to be something defined in the code for building up an EKS cluster via pulumi (or a derivative resource). I’ll try to dig a bit deeper after looking into @gentle-diamond-70147’s suggestion
@gentle-diamond-70147, yes I am on a Mac M1. Although a colleague of mine who uses a Windows machine has experienced the same issue. I’ve tried destroying with this command
GODEBUG=asyncpreemptoff=1 pulumi destroy
and the process failed for the same reason
@agreeable-ram-97887 it's probably a load balancer created by the Kubernetes cluster
i've seen this before. What happens is the EKS cluster gets deleted before the resources that created the load balancer; pulumi sees the cluster doesn't exist, marks the kubernetes resources as "deleted" in the state, and moves on
but then it can't delete the IGW because the load balancer still exists
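(A sketch of how to confirm that, assuming classic ELBs, which is what in-tree Kubernetes Service controllers typically create — any load balancer still listed in the stack's VPC after the cluster is gone is a leftover:)

```shell
# List classic load balancers and the VPC they live in; compare the VPC ID
# against the one pulumi is failing to tear down.
aws elb describe-load-balancers \
  --query 'LoadBalancerDescriptions[].{Name:LoadBalancerName,VPC:VPCId}' \
  --output table
```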
hey @billowy-army-68599, yeah, looks like that was it 👍 . Do you know of a way to avoid this, then? I already have the NodeGroup set with depends_on on the EKS Cluster, which I thought would ensure one is completely deleted before the other. Also, has something recently changed on the Pulumi side that may have brought this problem up? (because
pulumi destroy
didn’t get stuck like this a month or so ago)
It’s a race condition, I think you can fix it by making the k8s resources depend on your cluster
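(A minimal sketch of that fix with Pulumi's Python SDK — the resource names and the example Service are placeholders, not from the thread. The key idea is to route every Kubernetes resource through a provider that itself has `depends_on=[cluster]`, so on `pulumi destroy` those resources, and the AWS load balancers they create, are deleted before the cluster:)

```python
import pulumi
import pulumi_eks as eks
import pulumi_kubernetes as k8s

# Placeholder EKS cluster (pulumi_eks wraps the cluster + node group setup).
cluster = eks.Cluster("my-cluster")

# A Kubernetes provider bound to the cluster's kubeconfig, with an explicit
# dependency on the cluster, so destroy order is: k8s resources, then cluster.
k8s_provider = k8s.Provider(
    "k8s-provider",
    kubeconfig=cluster.kubeconfig,
    opts=pulumi.ResourceOptions(depends_on=[cluster]),
)

# Example Service of type LoadBalancer; because it goes through k8s_provider,
# pulumi deletes it (and the ELB it created) before tearing down the cluster,
# which in turn unblocks the subnet and internet gateway deletions.
svc = k8s.core.v1.Service(
    "web",
    spec=k8s.core.v1.ServiceSpecArgs(
        type="LoadBalancer",
        ports=[k8s.core.v1.ServicePortArgs(port=80)],
        selector={"app": "web"},
    ),
    opts=pulumi.ResourceOptions(provider=k8s_provider),
)
```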