# kubernetes
l
How can you deal with a namespace that gets a finalizer on it? Without a manual operation to remove the finalizer, I mean. It's causing `pulumi down` to fail because it first deletes the controller (ArgoCD, in this case), and then when it goes to delete the namespace, the controller isn't there to perform the finalizer operation.
b
I suspect the answer is "You can't, you need a manual step". What's adding the finaliser? Is it something that you set when creating the namespace in Pulumi, or does Argo add it to every namespace automatically as some kind of hook, so Pulumi has no visibility of it?
Finalisers can be really awkward, and I don't believe the behaviour of the thing they call is standardised anywhere. I've come across some that fail as you've described, and others where, when the controller they rely on is deleted, that controller runs all the finalisers as if the resources carrying them had been deleted. IIRC this happened with the `aws_load_balancer_controller`, where I wanted to uninstall it and then install a newer version (at the time, the upgrade path wasn't clean/possible), without the existing K8s or AWS ingress/loadbalancer resources being touched. As long as nothing changed between the uninstall and the reinstall, it would be fine. But uninstalling the controller caused it to run all its finalisers and delete all our AWS ALBs even though the K8s resources still existed, which was very much not what I wanted as it took everything down. Thank God I was testing it in our `dev` env first. I had to remove the finalisers manually (the controller had added them based on annotations which told it to manage the ingress, so IaC didn't know about them), then re-add them once I was done.
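For reference, the manual step described above can be sketched as building a JSON merge-patch body that drops one finalizer and keeps the others. This is a minimal sketch, not what was actually run: the manifest is a plain dict, and the finalizer names (`example.com/lb-cleanup`, `example.com/other`) and the helper `finalizer_removal_patch` are hypothetical.

```python
import json


def finalizer_removal_patch(manifest: dict, finalizer: str) -> dict:
    """Build a JSON merge-patch body that drops one finalizer and keeps the rest."""
    current = manifest.get("metadata", {}).get("finalizers") or []
    remaining = [name for name in current if name != finalizer]
    # A merge patch replaces the whole list, so send everything that should stay.
    return {"metadata": {"finalizers": remaining}}


# Hypothetical ingress carrying a controller-added finalizer (names are made up).
ingress = {
    "kind": "Ingress",
    "metadata": {
        "name": "web",
        "finalizers": ["example.com/lb-cleanup", "example.com/other"],
    },
}

patch = finalizer_removal_patch(ingress, "example.com/lb-cleanup")
print(json.dumps(patch))
```

The printed body is the kind of thing that could be passed to `kubectl patch ingress web --type=merge -p '<body>'`; re-adding the finalizer afterwards would be the same patch with the original list.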
IMO what should happen when a resource like that (a controller/finaliser target) is deleted is that all the finalisers which reference it should be removed from their resources without being triggered first, but presumably there's no standard which says that must happen, so it's up to each finaliser provider to decide what they'll do and how. Maybe there are some instances where that's not what someone would want, so it's explicitly left open-ended as an 'exercise for the reader'.
l
This ended up being caused by something different from what I originally thought: I misread the error and was only skimming the namespace manifest. The only finalizer on the namespace was "kubernetes", which was not the issue. Instead, it was due to having misnamed one of the ArgoCD AppProject resources by calling it "apps", so that when it synced what was supposed to be the same AppProject (there called "applications"), it created a whole new AppProject which was not being tracked by Pulumi. AppProjects all get a finalizer on them (by an ArgoCD controller, I guess) to ensure that other ArgoCD resources get cleaned up. So when `pulumi down` tried to delete the namespace (which was created by Pulumi), the namespace deletion tried to delete the untracked AppProject, which triggered the finalizer, but the controller had been deleted by then, so it got stuck and failed.
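A check that would have caught this before the teardown can be sketched as comparing what is live in the namespace against what the IaC tool tracks. This is only an illustration under assumptions: the resources are plain dicts shaped like manifests (for example parsed from `kubectl get ... -o json`), the finalizer name `example.com/project-cleanup` is a placeholder rather than the real ArgoCD one, and `untracked_with_finalizers` is a hypothetical helper.

```python
def untracked_with_finalizers(live_resources, tracked_names):
    """Return names of live resources that carry finalizers but are not tracked."""
    return sorted(
        res["metadata"]["name"]
        for res in live_resources
        if res["metadata"].get("finalizers")
        and res["metadata"]["name"] not in tracked_names
    )


# Two AppProjects exist in the cluster, but only "apps" was declared in Pulumi.
live = [
    {"kind": "AppProject",
     "metadata": {"name": "apps", "finalizers": ["example.com/project-cleanup"]}},
    {"kind": "AppProject",
     "metadata": {"name": "applications", "finalizers": ["example.com/project-cleanup"]}},
]
tracked = {"apps"}

print(untracked_with_finalizers(live, tracked))
```

Anything this returns would block the namespace deletion once the controller is gone, so it needs cleaning up (or importing into the stack) while the controller is still running.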
But I agree that this should probably work a bit differently. It would be better if Kubernetes could self-heal here and realize that, with the controller missing, it needs to just delete the resource. Or maybe what would be even better is a method of describing finalizer actions in a resource itself, so that the controller is completely unnecessary.
But then again, I guess there could be finalizer actions needed outside of Kubernetes, that would be impossible to capture in a CRD-style configuration.
Maybe it should work the other way around: Kubernetes should refuse to tear down the controller as long as there are resources with a finalizer which the controller handles.
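Kubernetes has no built-in guard like this as far as this thread establishes, but the idea can be approximated as a pre-teardown check. The sketch below assumes you already know which finalizer names a controller handles; `blockers`, `can_delete_controller`, and the finalizer name `example.com/project-cleanup` are all hypothetical.

```python
def blockers(resources, handled_finalizers):
    """Return (kind, name) pairs for resources still waiting on the controller."""
    handled = set(handled_finalizers)
    return [
        (res["kind"], res["metadata"]["name"])
        for res in resources
        if handled & set(res["metadata"].get("finalizers") or [])
    ]


def can_delete_controller(resources, handled_finalizers):
    """Allow the teardown only when nothing depends on the controller's finalizers."""
    return not blockers(resources, handled_finalizers)


cluster = [
    {"kind": "AppProject",
     "metadata": {"name": "applications", "finalizers": ["example.com/project-cleanup"]}},
    {"kind": "Namespace",
     "metadata": {"name": "argocd", "finalizers": ["kubernetes"]}},
]

print(blockers(cluster, ["example.com/project-cleanup"]))
print(can_delete_controller(cluster, ["example.com/project-cleanup"]))
```

A teardown script could run a check like this first and stop with the list of blockers instead of deleting the controller and getting stuck later.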
b
Huh, what other finaliser actions are there? Every use-case I've seen has been one where some external-to-K8s resource is created and kept in sync with some properties of a K8s resource by a controller, and the finalisers are used to ensure the external resources are correctly cleaned up when the K8s resource is deleted. Cloud loadbalancers linking to ingresses/services, DNS entries from external-dns, etc.