# aws
s
Hi there! I activated EKS Auto Mode using Pulumi EKS v3 (crosswalk). Is there a formal method to disable it? After I removed the

autoMode: {
    enabled: true,
},

block, pulumi up failed with the following errors:
Diagnostics:
  pulumi:pulumi:Stack (brainfish-universe-eks-au):
    error: eks:index:Cluster resource 'brainfish-au' has a problem: grpc: the client connection is closing

  aws:eks:Cluster (brainfish-au-eksCluster):
    error:   sdk-v2/provider2.go:515: sdk.helper_schema: compute_config.enabled, kubernetes_networking_config.elastic_load_balancing.enabled, and storage_config.block_storage.enabled must all be set to either true or false: provider=aws@6.66.1
    error: diffing urn:pulumi:au::brainfish-universe-eks::eks:index:Cluster$aws:eks/cluster:Cluster::brainfish-au-eksCluster: 1 error occurred:
        * compute_config.enabled, kubernetes_networking_config.elastic_load_balancing.enabled, and storage_config.block_storage.enabled must all be set to either true or false
Additionally, I'd love to learn more about the design thinking behind the @pulumi/eks NodeGroup. It utilizes an autoscaling group in the background and requires minimum and maximum node counts. I'm curious about when it actually scales up, as I haven't noticed any changes in the machine nodes within the autoscaling group created by pulumi/eks.
q
Hey @stale-tomato-37875, removing the autoMode block should do the trick. Alternatively, you could try setting enabled to false. I opened an issue for this and will start looking into it: https://github.com/pulumi/pulumi-eks/issues/1585
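For reference, a rough sketch of the second option (assuming a TypeScript program using @pulumi/eks v3; the cluster name and any other args are placeholders):

import * as eks from "@pulumi/eks";

// Instead of removing the autoMode block entirely, explicitly set enabled to false.
const cluster = new eks.Cluster("brainfish-au", {
    // ...your existing cluster configuration...
    autoMode: {
        enabled: false,
    },
});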
Regarding your question about the ASGs of the node groups: generally, those are not scaled automatically in AWS. You can control the scaling behavior using the aws.autoscaling.Policy resource. But what's way better is hooking the ASGs directly into the Kubernetes lifecycle and driving scaling decisions based on the resource requests in your cluster.
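As a rough sketch of the first option (the resource name, ASG name, and target value are placeholders; I'm assuming a target-tracking policy on CPU here):

import * as aws from "@pulumi/aws";

// Placeholder: the name of the autoscaling group that pulumi/eks created for the node group.
const nodeGroupAsgName = "my-node-group-asg";

// Target-tracking policy that scales the ASG to keep average CPU utilization around 60%.
const scalingPolicy = new aws.autoscaling.Policy("node-group-cpu-tracking", {
    autoscalingGroupName: nodeGroupAsgName,
    policyType: "TargetTrackingScaling",
    targetTrackingConfiguration: {
        predefinedMetricSpecification: {
            predefinedMetricType: "ASGAverageCPUUtilization",
        },
        targetValue: 60,
    },
});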
s
Thank you for providing these references! They really help clear things up.
q
Another option that I really like is using Karpenter. It's way more dynamic than using Node Groups and has some very nifty capacity rebalancing features that can save you a ton of money. E.g. Slack was able to achieve 12% compute cost savings with it: https://aws.amazon.com/blogs/containers/how-slack-adopted-karpenter-to-increase-operational-and-cost-efficiency/
s
Yeah, Karpenter looks promising. I feel EKS Auto Mode is essentially managed Karpenter. I'd like to give it a further spin. I'm working on a migration path from the existing Pulumi crosswalk self-managed node group to auto-managed nodes. It feels a bit challenging since I need to minimise production disruption 😅 Meanwhile, if EKS Auto becomes mainstream, I see less need for the @pulumi/eks package 😵‍💫
q
Yeah, it's basically managed Karpenter + managed networking addons + managed Load Balancer integration. So operationally speaking it really takes away a lot of ops burden! At the same time you pay for that (quite literally 😄) with higher instance costs. I spot checked a few and generally saw a ~10% surcharge for instances managed with auto mode. It's all tradeoffs in the end 🙂
My general recommendation now would be to start out with Auto Mode and then move to either Managed Node Groups or Karpenter if there are feature gaps or the additional cost is too high at scale.
s
Thanks for sharing. I remember Managed Node Groups are less feature-rich than NodeGroupV2, right? For example, I can define extraSecurityGroup with NodeGroupV2 but not with a Managed Node Group?
q
You can do that with managed node groups, but you'll have to set it in the Launch Template and pass that in, so it's a bit more involved. The design challenge with ManagedNodeGroup is deciding which underlying options to expose directly without making the API too convoluted. If there are certain gaps that currently stop you from using it, please open a feature request on GitHub! This helps us prioritize and make decisions 🙂
But generally yes, there are certain advanced settings that AWS EKS just does not expose for the managed node groups.
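Roughly, the launch template approach looks like this (just a sketch: the security group ID and sizes are placeholders, and I'm leaving out other required bits like the node IAM role):

import * as aws from "@pulumi/aws";
import * as eks from "@pulumi/eks";

const cluster = new eks.Cluster("example"); // placeholder cluster

// Put the extra security group(s) on a launch template. You'll typically also need
// to include the cluster's node security group so nodes can still reach the control plane.
const launchTemplate = new aws.ec2.LaunchTemplate("managed-ng-lt", {
    vpcSecurityGroupIds: ["sg-0123456789abcdef0"], // placeholder: your extra security group ID(s)
});

// Pass the launch template to the managed node group.
const managedNodeGroup = new eks.ManagedNodeGroup("managed-ng", {
    cluster: cluster,
    launchTemplate: {
        id: launchTemplate.id,
        version: launchTemplate.latestVersion.apply(v => `${v}`),
    },
    scalingConfig: {
        minSize: 1,
        desiredSize: 2,
        maxSize: 4,
    },
});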
s
I saw some relevant issues around diskSize, and I can understand the implications. That’s a hard tech decision to make. I’ll think twice and raise an issue if needed!
q
Thanks for bringing this issue up. This is actually resolved already; our automation just failed to close it. I've closed it now. There are diskSize and gpu inputs now!
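So something like this should work now (just a sketch with placeholder values):

import * as eks from "@pulumi/eks";

const cluster = new eks.Cluster("example"); // placeholder cluster

const gpuNodeGroup = new eks.ManagedNodeGroup("gpu-ng", {
    cluster: cluster,
    diskSize: 100,                 // root volume size in GiB
    gpu: true,                     // the new gpu input mentioned above
    instanceTypes: ["g5.xlarge"],  // placeholder instance type
    scalingConfig: { minSize: 1, desiredSize: 1, maxSize: 2 },
});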
b
One thing to think about regarding EKS Auto Mode is that it does not (yet) support prefix delegation via the VPC CNI for allocating prefixes of IP addresses from the subnets for pods. Thus the number of pods that can run on an instance is restricted to the number of ENIs and the number of IP addresses per ENI. We were considering converting some clusters to Auto Mode, but since it does not support prefixes, that was a non-starter.
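To put rough numbers on that limit (illustration only, using commonly cited values for an m5.large: 3 ENIs with 10 IPv4 addresses each):

// Rough illustration of the per-instance pod limit without prefix delegation.
const enis = 3;       // example: m5.large
const ipsPerEni = 10; // example: m5.large

// One address per ENI is used by the ENI itself; the +2 in the usual EKS formula
// accounts for the host-networking pods (aws-node and kube-proxy).
const maxPods = enis * (ipsPerEni - 1) + 2; // 29 pods

console.log(`max pods without prefix delegation: ${maxPods}`);
// With prefix delegation each address slot can hold a /28 prefix (16 addresses),
// which is why the practical cap jumps to 110 pods on most instance sizes.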