01/25/2023, 9:15 PM
Not sure if this should be in here or in #kubernetes, but my specific issue is AWS/EC2 related so I chose here. I have deployed an EKS cluster (or two) with Pulumi using the eks module. Each cluster has 3 node pools, and my issue occurs when I come to update them, either because a new AMI has been released or because I wish to change some aspect of the node pool, like the min/max or the node type. The AMI changing causes the biggest issue, as it changes all the node pools and then the nodes 'all roll at once'. By 'all roll at once' I need to explain further that I have some stateful services running on these clusters, e.g. Percona's MySQL, Loki, Redis and potentially Elasticsearch. These store state on EBS volumes, using the EBS CSI plugin/driver to mount the volumes to the required node. Experimentation has taught me that an EBS volume takes at least 8 minutes to detach from a terminating node and then reattach to a new node, which is what allows the pod it belongs to to be scheduled. Of course the ASG rolls the nodes based on their EC2 health check, rolling the next node as soon as its replacement is healthy, which means for a cluster of 10-15 nodes they all roll in a couple of minutes, much faster than the 8 min delay required to get a stateful pod back up and healthy. I have looked at the checkpointing available for the instance refresh command, but that would seem to only work when using instance refresh and not when updating the instance profile of the ASG via Pulumi. Also there are no options in the eks.createNodePool function which allow setting checkpoints.
I guess my ideal solution would involve the autoscaler informing the ASG of which nodes have pods on them that cannot be evicted right now, and the ASG pausing the cluster roll until quorum is achieved again in the stateful service and the next node can be rolled, but I can't see an easy way of doing that. I guess something could be done with lifecycle events on the node pool's ASG and toggling the instance protection on nodes which need to be preserved, but it all feels kludgey. Second best would be a setting on the ASG which slows down any roll of the nodes to have 10 min pauses between each one, but there doesn't seem to be one. I guess I could look at setting the default instance warmup to 10 mins, but I'm not sure that will stop the old instances being replaced. This seems like it should be a common issue, as updating node pools with stateful services is something everyone needs to do, right? How do you manage this?
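(For what it's worth, the lifecycle-event idea above can at least be sketched in Pulumi. This is a minimal, untested sketch assuming the node pool's ASG name is known — `statefulNodeAsgName` is a placeholder, and in a real program you'd thread the ASG name out of the node pool resource. A termination lifecycle hook only holds each individual instance in `Terminating:Wait`; it doesn't by itself pause the whole roll.)

```typescript
import * as aws from "@pulumi/aws";

// Assumption: the name of the ASG backing the stateful node pool.
const statefulNodeAsgName = "my-cluster-stateful-nodes";

// Hold each terminating node in Terminating:Wait for up to 10 minutes,
// giving the EBS CSI driver time to detach volumes before the instance
// is actually terminated. If nothing completes the hook earlier, the
// defaultResult of CONTINUE lets termination proceed after the timeout.
const drainHook = new aws.autoscaling.LifecycleHook("stateful-drain-hook", {
    autoscalingGroupName: statefulNodeAsgName,
    lifecycleTransition: "autoscaling:EC2_INSTANCE_TERMINATING",
    heartbeatTimeout: 600,     // seconds before defaultResult fires
    defaultResult: "CONTINUE",
});
```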


01/25/2023, 9:22 PM
This seems like a general EKS question rather than a Pulumi question.. however, some questions:
• do you have a pod disruption budget set on your stateful workloads?
• have you tried an operator like: ?
• have you tried setting the max unavailable settings in the update config?
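(For the PDB suggestion, a minimal sketch in Pulumi TypeScript — the namespace, labels, and replica count here are hypothetical and would need to match the actual StatefulSet. Note that the `updateConfig`/max-unavailable setting mentioned above belongs to EKS managed node groups (`aws.eks.NodeGroup`), which respect PDBs when rolling; self-managed ASG pools from `eks.createNodePool` don't consult PDBs on their own.)

```typescript
import * as k8s from "@pulumi/kubernetes";

// Keep quorum for a hypothetical 3-replica Percona MySQL cluster:
// never allow voluntary disruptions to drop below 2 ready pods.
const mysqlPdb = new k8s.policy.v1.PodDisruptionBudget("mysql-pdb", {
    metadata: { namespace: "databases" },
    spec: {
        minAvailable: 2,
        selector: {
            matchLabels: { "app.kubernetes.io/name": "percona-xtradb" },
        },
    },
});
```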


01/25/2023, 9:55 PM
I guess it is more of an EKS question than a Pulumi one. Is there an EKS Slack out there? I guess I should go and find out. I do have pod disruption budgets set, but updating the node pool seems to ignore them. I have not tried that operator; I will go have a gander, thanks. I haven't explicitly set max unavailable, but I don't think that would help: the node pool always has enough available nodes, it's just that the PVC can't be satisfied because the EBS volume is attached to a dying node and won't be free for 8 minutes.