# azure
b
I am using Pulumi to destroy a database every night and restore it from backup. It is an Azure PostgreSQL flexible server. Since the developers access it every day, I would like to keep the hostname stable to avoid daily reconfiguration. Azure does not allow two resources with the same name/hostname, so I need to use `deleteBeforeReplace`. The problem is that the Azure API reports that the resource has been deleted and Pulumi tries to recreate it, but sometimes the resource takes longer to delete and the subsequent create operation fails. It looks like the Azure API is somewhat inconsistent...
m
While you could overcome this in your Pulumi program, I've seen Azure PostgreSQL server names take up to 2 days to be freed up behind the scenes.
b
Thanks @microscopic-arm-69377, I looked at that option, but that is a timeout. The problem is that the Azure API returns a failure, so the only way to proceed is with a retry.
a
But why do you need to destroy the whole server to restore the database from a backup instead of just running a nightly `pg_dump`/`pg_restore`?
And what is the exact error you are getting from the Azure API?
b
Because: 1. We try to do everything from Pulumi / TypeScript. 2. We wanted to take advantage of this approach to try restoring a backup, and Azure backups only allow you to restore to a new server. The error is "The server name is already in use", because, as explained in the link you shared, Azure does not delete the server immediately; it marks it for deletion, and the deletion happens over the following minutes.
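For illustration, the retry idea could be sketched in TypeScript roughly as below. This is only a minimal sketch under assumptions: `retryWhileNameInUse` is a hypothetical helper (not a Pulumi or Azure API), the `operation` callback stands in for whatever actually runs the deployment, and the attempt count and delay are arbitrary placeholders. It retries only when the failure message contains the error text quoted above and fails fast on anything else.

```typescript
// Error text quoted in this thread; matching on it is an assumption about
// how the failure surfaces to the caller.
const NAME_IN_USE = "The server name is already in use";

// Hypothetical helper: re-runs `operation` while Azure still reports the
// server name as taken, waiting `delayMs` between attempts.
async function retryWhileNameInUse<T>(
  operation: () => Promise<T>,
  maxAttempts = 5,    // placeholder value
  delayMs = 60_000,   // placeholder value: one minute between attempts
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      const retryable = message.includes(NAME_IN_USE);
      // Give up on unrelated errors or once the attempt budget is used up.
      if (!retryable || attempt >= maxAttempts) throw err;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Given the report above that names can sometimes stay reserved for much longer than a few minutes, a bounded number of attempts seems safer than retrying indefinitely.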
a
Does your server have vnet integration or is it configured for public access?
b
at the moment public access
a
There's a chance you could use Pulumi's random suffix autonaming feature (by just defining `resource_name` and not `server_name`) and then create a CNAME alias in a public DNS zone pointing to the server. Then your developers would use the same endpoint, and your Pulumi program would just update the CNAME record. If SSL validation works for your DB client, that could be an option. Another option would be to use a private endpoint with a custom DNS prefix that would be reused across the deleted and new server. Not sure if you'll hit similar limitations when releasing the private endpoint before the new server can use it.
b
thanks, I tried the first option as you described, but SSL failed, so I would need to configure SSL settings on my clients anyway to verify against the original Azure hostname
a
What database client are you using? Have you tried playing with the SSL mode parameter? https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-networking-ssl-tls#configure-ssl-on-the-client
b
So, I'm being pushed into the corner of just using the k8s cronjob retry feature until Pulumi implements a "retry" option for a specific resource.
Regarding SSL mode, I could decrease the security to `verify-ca` instead of `verify-full`, but I think I prefer to just let the Pulumi deployment fail and retry N times via the k8s cronjob.
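For context on that trade-off: in libpq-style clients, `verify-ca` checks that the server certificate chains to a trusted CA, while `verify-full` additionally checks that the hostname you connect to matches the certificate. That is presumably why a CNAME alias fails under `verify-full`. A minimal sketch of where the setting lives in a connection URI follows; `buildPostgresUri`, the host `db.example.com`, and the database and user names are hypothetical placeholders, not values from this thread.

```typescript
// The sslmode values accepted by libpq-style PostgreSQL clients.
type SslMode = "disable" | "allow" | "prefer" | "require" | "verify-ca" | "verify-full";

// Hypothetical helper: builds a connection URI with an explicit sslmode.
// The password is left out on purpose; clients can take it from elsewhere.
function buildPostgresUri(host: string, database: string, user: string, sslmode: SslMode): string {
  const params = new URLSearchParams({ sslmode });
  return `postgresql://${encodeURIComponent(user)}@${host}:5432/${encodeURIComponent(database)}?${params.toString()}`;
}

// Connecting through an alias: only the CA chain is verified.
const viaAlias = buildPostgresUri("db.example.com", "appdb", "dev", "verify-ca");

// Connecting through the original hostname: CA chain and hostname are verified.
const direct = buildPostgresUri("db.example.com", "appdb", "dev", "verify-full");
```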
a
Yet another option would be to have your program update a Key Vault secret with the connection endpoint which the client applications would use. Good luck overcoming this!
b
or just use a bash script 😂
and push for this, which I think is a must given the Azure API's eventual consistency: https://github.com/pulumi/pulumi/issues/7932
a
Yep, it's a shame how some Azure product teams do not design their APIs to be used declaratively and idempotently by IaC frameworks. Often they implement some custom polling logic in the Azure Portal or double updates on the same resource. I've run into more problems like that with the `dbforpostgresql` provider than with many other services.