# general
a
Am I doing something wrong? We have a pulumi stack (on Google Cloud) that includes an SSL certificate, a load balancer using that certificate, and a bunch of other stuff. I ran a
pulumi up
in which we changed the domain name, so the SSL certificate needed to be replaced. A new certificate was provisioned and the old resource was marked for deletion. All deletions are normally queued until the last stages of a pulumi up (which is great). But before pulumi made it to the step that switches the load balancer to the new certificate, some other (unrelated) resource failed to deploy. We fixed that issue and ran another
pulumi up
. This second attempt starts by deleting the resources that were marked for deletion in the previous attempt. That means it starts with the deletion of the old certificate, but it hasn't yet run the steps that switch the load balancer to the new cert. The deletion therefore fails, since you can't delete a certificate that is still in use. Is there a way to tell pulumi to also postpone the deletes from previous updates until all creates/updates have completed?
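A quick way to see what the failed run left pending is to inspect the exported state. This is a sketch; the field name is taken from the workaround later in this thread and may differ across CLI versions:
# Show operations still queued from the interrupted update.
pulumi stack export | jq '.deployment.pending_operations'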
t
I'm also interested in this question. In the past I've just manually resolved the issue via the AWS UI and then run the following (the first step is optional, only needed if there are pending operations):
pulumi stack export | jq '.deployment.pending_operations = []' | pulumi stack import --force
Then:
pulumi refresh
And then:
pulumi up
(or merge steps 2/3 and run
pulumi up -r
instead). I don't think this is the best solution, though, particularly the first optional step that clears pending_operations (it's risky if, for example, the script fails partway through).
l
I think removing pending operations and running
pulumi refresh
is a good option in this case. The lowest-risk option would be to run changes in much smaller batches (creating new resources, then changing existing ones, then deleting old ones), but orchestrating all of that in a single run is more or less Pulumi's raison d'être, and with it comes a higher risk when a run fails partway through.
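If you do want to batch by hand, one sketch uses pulumi's --target flag to limit an update to specific resources; the URN below is illustrative, and --target-dependents pulls in resources that depend on the target:
# Step 1: create the new cert and repoint the load balancer, nothing else.
pulumi up --target 'urn:pulumi:prod::mystack::gcp:compute/managedSslCertificate:ManagedSslCertificate::cert' --target-dependents
# Step 2: run the full update; the old cert is no longer referenced,
# so its queued deletion can go through.
pulumi up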
a
This example was easily fixable by going into the Google Cloud console, manually switching the load balancer to the new certificate, and then running another
pulumi up
This time the deletion of the old certificate succeeds and pulumi runs just fine. I'm just wondering about the reasoning behind the decision to start a pulumi up with the deletion of resources from the previous attempt. Wouldn't it make more sense to also postpone those deletes until everything else has completed, just as with the deletes collected during the current run? Adding explicit dependencies between resources seems to have no effect on this behaviour. It would be nice to have a flag that runs the deletes from a previous attempt at the end of the next one, not at the start.
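For anyone hitting the same thing, the console step above can also be done from the CLI. A sketch, with illustrative proxy and certificate names:
# Point the HTTPS proxy at the new certificate so the old one is unused.
gcloud compute target-https-proxies update my-https-proxy --ssl-certificates=new-cert
# Now the queued deletion of the old cert can succeed.
pulumi up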