# kubernetes
I've reordered resources and taken dependencies on Helm charts rather than just CRDs (in separate config groups), and destroy got much further, but it hung for ages until finally failing with:
```
Diagnostics:
  pulumi:pulumi:Stack (k8s-dev):
    could not get token: RequestError: send request failed
    caused by: Post https://sts.amazonaws.com/: dial tcp: lookup sts.amazonaws.com on 192.168.65.5:53: no such host
    could not get token: RequestError: send request failed
    caused by: Post https://sts.amazonaws.com/: dial tcp: lookup sts.amazonaws.com on 192.168.65.5:53: no such host

    error: update failed

  kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition (appmesh-system/appmesh-controller-selfsigned-issuer):
    error: Delete "https://[REDACTED].gr7.us-west-2.eks.amazonaws.com/apis/cert-manager.io/v1alpha2/namespaces/appmesh-system/issuers/appmesh-controller-selfsigned-issuer": getting credentials: exec: executable
aws-iam-authenticator failed with exit code 1
    error: post-step event returned an error: failed to save snapshot: performing HTTP request: Patch "https://api.pulumi.com/api/stacks/pharos/k8s/dev/update/27f96f11-ba00-4a38-b812-1a87d668ec8b/checkpoint": dial tcp: lookup
api.pulumi.com on 192.168.65.5:53: no such host

Resources:
    - 199 deleted

Duration: 2h44m7s
```
During the hang I checked resources, and the AWS App Mesh controller and cert-manager controllers had already been deleted. I didn't think it would matter, since I didn't see any finalizers or owner refs associated with the cert-manager.io/v1/Issuer resource being deleted, but maybe I'm missing something? Looks like this is due to https://github.com/pulumi/pulumi-kubernetes/issues/861.
This is what the resources look like in order of declaration, with any explicit dependencies (a rough code sketch of the wiring follows the listing):
```
environmentNs Namespace
environment-config ConfigMap; DependsOn = environmentNs
environment-sysctl ConfigGroup; DependsOn = environmentNs
kubePrometheusStackCrds ConfigGroup
fluent-bit Chart; DependsOn = kubePrometheusStackCrds
certManagerCrds ConfigGroup
certManagerNs Namespace
certManagerChart Chart; DependsOn = { certManagerCrds, kubePrometheusStackCrds }
certManagerTest ConfigGroup
awsLbcCrds ConfigGroup
awsLbcChart Chart; DependsOn = { awsLbcCrds, certManagerTest }
internal-gateway ConfigGroup; DependsOn = { awsLbcChart /* finalizers */, environmentNs }
internet-gateway ConfigGroup; DependsOn = { awsLbcChart /* finalizers */, environmentNs }
monitoringNs Namespace
kubePrometheusStackChart Chart; DependsOn = { awsLbcChart /* finalizers */, certManagerTest, kubePrometheusStackCrds }
prometheus-adapter Chart; DependsOn = { certManagerTest, kubePrometheusStackChart }
cluster-autoscaler Chart; DependsOn = kubePrometheusStackCrds
external-dns Chart; DependsOn = kubePrometheusStackCrds
appMeshCrds ConfigGroup
appMeshNs Namespace
appMeshChart Chart; DependsOn = { appMeshCrds, certManagerTest }
appmesh ConfigGroup; DependsOn = appMeshChart /* finalizers */
ackIamNs Namespace
ack-iam-controller Chart
```
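For context, the wiring above looks roughly like this in code (TypeScript sketch, trimmed to the cert-manager pieces; the file URL, repo, and version are placeholders):

```typescript
import * as k8s from "@pulumi/kubernetes";

// CRDs applied as their own ConfigGroup, separate from the chart.
const certManagerCrds = new k8s.yaml.ConfigGroup("certManagerCrds", {
    // placeholder URL/version
    files: ["https://github.com/jetstack/cert-manager/releases/download/v1.1.0/cert-manager.crds.yaml"],
});

const certManagerNs = new k8s.core.v1.Namespace("certManagerNs", {
    metadata: { name: "cert-manager" },
});

// The chart only depends on the CRD ConfigGroup and the namespace;
// it does not install the CRDs itself.
const certManagerChart = new k8s.helm.v3.Chart("cert-manager", {
    chart: "cert-manager",
    version: "v1.1.0",                                  // placeholder
    fetchOpts: { repo: "https://charts.jetstack.io" },
    namespace: certManagerNs.metadata.name,
    values: { installCRDs: false },
}, { dependsOn: [certManagerCrds, certManagerNs] });
```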
It's proving to be essential to have a way to depend on a config group or Helm chart and all of its children. Failing that, I need a way to get hold of one of the child resources so I can explicitly depend on it. certManagerTest is my current attempt to wait until cert-manager is ready to issue certs, but it's not working so far.
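One option for grabbing a single child might be getResource, which pulls one resource out of a Chart or ConfigGroup as an output that can be passed to dependsOn. Continuing the sketch above (the group/version/kind, namespace, and names are illustrative):

```typescript
// Pull the webhook Deployment out of the cert-manager chart so other
// resources can depend on it explicitly.
const certManagerWebhook = certManagerChart.getResource(
    "apps/v1/Deployment", "cert-manager", "cert-manager-webhook");

// e.g. gate the AWS LBC chart on the webhook Deployment rather than the
// whole chart (in the real program it also depends on awsLbcCrds / certManagerTest).
const awsLbcChart = new k8s.helm.v3.Chart("aws-load-balancer-controller", {
    chart: "aws-load-balancer-controller",
    fetchOpts: { repo: "https://aws.github.io/eks-charts" },  // placeholder repo
    namespace: "kube-system",
}, { dependsOn: [certManagerWebhook] });
```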
I posted my workaround for cert-manager, and after applying similar tricks elsewhere I can create and destroy my k8s stack without any issues!
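For anyone curious, the rough shape of the certManagerTest idea (apply a throwaway self-signed Issuer/Certificate after the chart and have downstream resources depend on that group) would be something like this, continuing the same sketch. The apiVersion and whether Pulumi actually waits for the Certificate to become Ready depend on the cert-manager version and the provider's await logic for custom resources:

```typescript
// Throwaway Issuer + Certificate applied only after the chart is installed;
// anything that needs working cert issuance takes a dependsOn on this group.
const certManagerTest = new k8s.yaml.ConfigGroup("certManagerTest", {
    yaml: `
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: test-selfsigned
  namespace: cert-manager
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: test-cert
  namespace: cert-manager
spec:
  secretName: test-cert-tls
  issuerRef:
    name: test-selfsigned
  dnsNames:
    - test.invalid
`,
}, { dependsOn: [certManagerChart] });
```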