I upgraded the eks module from 0.37.1 to 0.41.0, and this resulted in updating the AMI that the default node pool runs on.
Updating that will roll all the nodes.
I think that because it's a change to the CloudFormation template (i.e. the ASG), k8s won't have a chance to ensure that all the stateful sets remain quorate etc. whilst that roll happens.
To avoid this I could make a second node pool and then only update each node pool in isolation after having moved sufficient members of the stateful sets over to the safe pool.
I think this means I need the node pool in a separate stack so I can update the module in each stack independently.
Is this right? Is there some other way to achieve this? Am I wrong about updates to the CloudFormation template rolling the nodes without paying attention to the workloads? Is there some setting I can make on the ASG to make it k8s-aware?
So I guess I can use 1 stack with 2 node pools and, when the update happens, ignoreChanges on one pool, update the other, and then stop ignoring them and update the second.
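A rough sketch of that last idea, in TypeScript. As far as I know, `ignoreChanges` set directly on a component resource does not reach its children, so this builds a Pulumi-style transformation instead. Everything here is an assumption to check before relying on it: the `TransformArgs` shape is a local stand-in for the fields Pulumi passes to a transformation, `freezePool` and `pinnedProps` are hypothetical names, and the type tokens and property names need confirming against what `pulumi preview` reports for the node pool.

```typescript
// Local stand-ins for the shapes Pulumi passes to a resource transformation.
// Assumption: they mirror the `type`, `name`, `props` and `opts` fields of the real API.
interface ResourceOpts {
    ignoreChanges?: string[];
    [key: string]: unknown;
}

interface TransformArgs {
    type: string;
    name: string;
    props: Record<string, unknown>;
    opts: ResourceOpts;
}

interface TransformResult {
    props: Record<string, unknown>;
    opts: ResourceOpts;
}

type Transform = (args: TransformArgs) => TransformResult | undefined;

// Build a transformation that, while `frozen` is true, adds ignoreChanges
// entries to child resources whose type token appears in `pinned`.
function freezePool(frozen: boolean, pinned: Record<string, string[]>): Transform {
    return (args) => {
        const extra = pinned[args.type];
        if (!frozen || extra === undefined) {
            return undefined; // leave this resource untouched
        }
        const existing = args.opts.ignoreChanges ?? [];
        // Append only the entries that are not already ignored.
        const merged = existing.concat(extra.filter((p) => existing.indexOf(p) === -1));
        return { props: args.props, opts: { ...args.opts, ignoreChanges: merged } };
    };
}

// Hypothetical choice of what to pin; verify the tokens and property names
// against the resources your node pool actually creates.
const pinnedProps: Record<string, string[]> = {
    "aws:ec2/launchConfiguration:LaunchConfiguration": ["imageId"],
    "aws:cloudformation/stack:Stack": ["templateBody"],
};
```

The intent would be to pass `freezePool(true, pinnedProps)` in the `transformations` resource option of the pool being held back, run the update so only the other pool rolls, then flip it to `false` and update again. Whether pinning these particular properties is enough to stop the roll is something I have not verified.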
07/06/2022, 4:12 PM
what are you ultimately trying to achieve?
07/07/2022, 5:00 PM
A resilient cluster which can survive updates to the CloudFormation stack, including things like changing the AMI or the node types.
i.e. something where I can update the pulumi/eks module for different node pools at different times, so I can make such updates without downtime for my essential services.