# general
e
I think I have narrowed this down after hours and hours and hours... I decided to simply create a cluster, then run pulumi refresh. When I do this, it wants to completely replace the cluster due to some metadata that is added by the cloud provider after the fact. That seems incorrect...
w
Can you open an issue with a repro? Is this GKE?
e
yes I am composing a large issue
but I am still experimenting
I found a weird bug, can you look into it while I do this... if you pass a provider into a NodePool, it throws an exception.
```
panic: fatal: An assertion has failed: Kubernetes GVK is: "urn:pulumi:dev::pptr::gcp:container/nodePool:NodePool::worker-node"

goroutine 71 [running]:
github.com/pulumi/pulumi-kubernetes/vendor/github.com/pulumi/pulumi/pkg/util/contract.failfast(...)
	/home/travis/gopath/src/github.com/pulumi/pulumi-kubernetes/vendor/github.com/pulumi/pulumi/pkg/util/contract/failfast.go:23
github.com/pulumi/pulumi-kubernetes/vendor/github.com/pulumi/pulumi/pkg/util/contract.Assertf(0xc000510800, 0x16acc90, 0x15, 0xc000611578, 0x1, 0x1)
	/home/travis/gopath/src/github.com/pulumi/pulumi-kubernetes/vendor/github.com/pulumi/pulumi/pkg/util/contract/assert.go:33 +0x198
```
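The repro is roughly this shape (resource names and the kubeconfig wiring here are placeholders, not my exact program):

```typescript
import * as pulumi from "@pulumi/pulumi";
import * as gcp from "@pulumi/gcp";
import * as k8s from "@pulumi/kubernetes";

const cluster = new gcp.container.Cluster("worker-cluster", {
    initialNodeCount: 1,
});

// Standard GKE pattern: assemble a kubeconfig from the cluster outputs.
const kubeconfig = pulumi
    .all([cluster.name, cluster.endpoint, cluster.masterAuth])
    .apply(([name, endpoint, auth]) => `apiVersion: v1
kind: Config
clusters:
- name: ${name}
  cluster:
    certificate-authority-data: ${auth.clusterCaCertificate}
    server: https://${endpoint}
contexts:
- name: ${name}
  context: {cluster: ${name}, user: ${name}}
current-context: ${name}
users:
- name: ${name}
  user:
    auth-provider: {name: gcp}
`);

const k8sProvider = new k8s.Provider("worker-k8s", { kubeconfig });

// Passing the Kubernetes provider in the NodePool's resource options is what
// triggers the panic above. NodePool is a GCP resource, so a clean error
// (rather than an assertion failure) would be the expected behavior here.
const pool = new gcp.container.NodePool("worker-node", {
    cluster: cluster.name,
    nodeCount: 1,
}, { provider: k8sProvider });
```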
I'll add it to the issue
w
Also - are you on latest versions? The upstream Terraform GKE resource has had a history of many spurious replacements. I believe all of those that we were tracking have been fixed upstream now. See https://github.com/pulumi/pulumi-gcp/issues/88#issuecomment-483930091 for some history.
e
I have no idea how to manually upgrade to the latest kubernetes module, tbh
I'm using the latest version of pulumi
0.17.27
There are several issues with how these behaviors are handled. I would suggest someone take an extended look and create a Kubernetes cluster on GCP with the scenarios listed below; rough sketches of the setups follow the list. First and foremost, replacing a cluster or a node pool should be the very last resort: it essentially replaces your entire stack. Additionally, the parent/child logic between a cluster and its node pools seems incorrect. For example, if a cluster is replaced, the node pools that are part of it get deleted along with it, but the current logic attempts to delete the node pool separately, not understanding the nesting relationship.

Scenario 1: Create a cluster without a nodeConfig, using a single entry in nodePools[] instead. Expected result: it should behave the same as using nodeConfig (at least for the first node). Run pulumi refresh, and it will suggest replacing the cluster without you having made any changes at all.

Scenario 2: Create a cluster with a nodeConfig and set initialNodeCount: 1. Run pulumi update. Run pulumi refresh; instanceUrls should be merged into the stack. Change initialNodeCount from 1 to 2. Run pulumi update and notice how it attempts a full replacement of the cluster. This is not necessary and would completely destroy your entire stack; it should simply update the node count of the cluster.

Scenario 3: Create a cluster with a nodeConfig to define your default node pool. Create a secondary node pool using new gcp.container.NodePool, setting its cluster to the cluster you created, etc. If you pass a k8s.Provider into the provider options, it throws an exception.

Scenario 4: Create a cluster with a StatefulSet, say for a Redis cluster, using volumeClaimTemplates. Run a pulumi update; it will succeed with no errors and everything will look OK. Without changing anything, run a pulumi refresh, then a pulumi update. It will now want to completely replace the StatefulSet due to some changes the cloud provider makes to certain metadata specs. This is a pain, as the last thing you would EVER want is for your Redis cluster to be replaced and to have to set it all up again.

Scenario 5: Repeat Scenario 4, but this time, to prevent your Redis cluster from being replaced, set the protect: true property so you can manually control deletion of this resource. Now ALL pulumi updates fail, because it refuses to preview, diff, or do much of anything because of the protected resource, so it completely breaks the stack.
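Rough sketches of the setups above (resource names and machine types are placeholders, not the exact programs from the ticket):

```typescript
import * as gcp from "@pulumi/gcp";

// Scenario 1: no nodeConfig; a single entry in nodePools instead.
// `pulumi refresh` with no code changes then proposes replacing this cluster.
const clusterWithPool = new gcp.container.Cluster("scenario1-cluster", {
    nodePools: [{
        name: "default-pool",
        initialNodeCount: 1,
        nodeConfig: { machineType: "n1-standard-1" },
    }],
});

// Scenario 2: nodeConfig plus initialNodeCount.
// Changing initialNodeCount from 1 to 2 should resize the pool, not replace the cluster.
const clusterWithConfig = new gcp.container.Cluster("scenario2-cluster", {
    initialNodeCount: 1,
    nodeConfig: { machineType: "n1-standard-1" },
});

// Scenario 3: a secondary node pool attached to the cluster; passing a
// k8s.Provider in its resource options panics (see the stack trace earlier).
const extraPool = new gcp.container.NodePool("scenario3-pool", {
    cluster: clusterWithConfig.name,
    nodeCount: 1,
    nodeConfig: { machineType: "n1-standard-1" },
});
```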
That's the ticket I am going to create; let me know if those have been fixed.
I think I can get around the StatefulSet issue if I could figure out how ignoreChanges works
but I can't seem to get it to work
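This is roughly what I have been trying (the StatefulSet itself is a placeholder shape, and the ignoreChanges entry is my guess at the right property):

```typescript
import * as k8s from "@pulumi/kubernetes";

const redis = new k8s.apps.v1.StatefulSet("redis", {
    spec: {
        serviceName: "redis",
        replicas: 3,
        selector: { matchLabels: { app: "redis" } },
        template: {
            metadata: { labels: { app: "redis" } },
            spec: {
                containers: [{
                    name: "redis",
                    image: "redis:5",
                    ports: [{ containerPort: 6379 }],
                    volumeMounts: [{ name: "data", mountPath: "/data" }],
                }],
            },
        },
        volumeClaimTemplates: [{
            metadata: { name: "data" },
            spec: {
                accessModes: ["ReadWriteOnce"],
                resources: { requests: { storage: "10Gi" } },
            },
        }],
    },
}, {
    // Scenario 5: protect guards against replacement, but then every update fails.
    protect: true,
    // My guess at how to skip the server-populated metadata that triggers the
    // replacement after a refresh; not sure this is the path it expects.
    ignoreChanges: ["metadata"],
});
```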