# aws
Charlie:
Hi team, I'm bumping the `@pulumi/eks` version from v2.2.1 to v2.8.1 and I keep receiving errors like the one below:
```
kubernetes:core/v1:ConfigMap (brainfish-prod-nodeAccess):
    error: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: the server has asked for the client to provide credentials
```
I found a similar issue from five years ago in the Pulumi GitHub issues, so I suspect the root cause here is different. I'm struggling to debug this because the error isn't very indicative. The AWS EKS cluster is currently in ConfigMap-only auth mode; would you suggest I switch the auth mode to API before upgrading? Any hints are appreciated.
Florian:
Hey Charlie, upgrading to the API auth mode shouldn't be necessary to get this working, but I generally recommend it because it's easier to manage. As for this error, it hints at the current IAM principal not being able to access the cluster. Did you change the IAM role/user you use for deployments? I'd recommend checking whether this works without Pulumi, i.e. use the same role/user and run `aws eks update-kubeconfig ...` to configure the kubeconfig, followed by something like `kubectl get nodes` to confirm you can access the cluster. If that works, you could compare the kubeconfig the AWS CLI generates with the one the provider generates (it should be an output on the cluster component).
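As a minimal sketch of that comparison workflow (the `debug-cluster` definition and output name are stand-ins, not the real program):

```typescript
import * as eks from "@pulumi/eks";
import * as pulumi from "@pulumi/pulumi";

// Stand-in cluster definition so the snippet is self-contained; in practice
// this is the existing eks.Cluster in the stack.
const cluster = new eks.Cluster("debug-cluster");

// Export the kubeconfig the component generates (kept secret, since it can
// embed auth details).
export const generatedKubeconfig = pulumi.secret(cluster.kubeconfig);

// Outside of Pulumi:
//   pulumi stack output generatedKubeconfig --show-secrets > kubeconfig.json
//   KUBECONFIG=./kubeconfig.json kubectl get nodes
// and diff kubeconfig.json against the file written by `aws eks update-kubeconfig`.
```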
Charlie:
Thanks Florian. I checked with plain `aws eks` / `kubectl` commands and can confirm that the relevant information can be fetched. I also compared the kubeconfig generated by the AWS CLI with the one from the Pulumi provider; the only difference is the absence of the region on the Pulumi side. What confuses me is that this auth issue only occurs when I upgrade the pulumi/eks version. After I revert the version, the Pulumi script can access the EKS cluster again 😵‍💫
Florian:
I just remembered that there was a change to not include the profile name in the kubeconfig the provider auto-generates. It's not that issue, is it? https://github.com/pulumi/pulumi-eks/commit/fc90dafde9ae03627b15e7ff93768513789ba47e I assume you're using the `provider` output? Or are you creating a provider manually by using the `kubeconfig` output?
If it turns out that this is indeed caused by the profile, I'd recommend trying whether you can use the `getKubeconfig` method instead to set the `profileName` to use.
Charlie:
Aha moment! We do need the profile name in the kubeconfig, no wonder I found the profile name to be removed in the pulumi preview.
However, I'm updating the EKS cluster with the default provider:
```typescript
import * as aws from "@pulumi/aws";
import * as eks from "@pulumi/eks";
import * as pulumi from "@pulumi/pulumi";

import { ORG } from "./constants";
import {
  stack,
  defaultVpcSubnetsIds,
  defaultSecurityGroupId,
  defaultVpcId,
} from "./stackRef";
import * as config from "./config";

const securityGroup = aws.ec2.SecurityGroup.get(
  "default",
  defaultSecurityGroupId
);

const awsAccountId = aws
  .getCallerIdentity()
  .then((current) => current.accountId);

const eksMasterRole = new aws.iam.Role(`${ORG}-eks-${stack}-master-role`, {
  assumeRolePolicy: pulumi.interpolate`{
      "Version": "2012-10-17",
      "Statement":[
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::${awsAccountId}:root"
          },
          "Action": "sts:AssumeRole"
        }
      ]
     }
  `,
});

// TODO: Remove this line when deploying to AWS
if (stack == "dev") {
  process.env.AWS_PROFILE = "brainfish-dev"; // Make it compatible with local deployment
}
if (stack == "prod" || stack == "eu" || stack == "au") {
  process.env.AWS_PROFILE = "brainfish-prod"; // Make it compatible with local deployment
}

const cluster = new eks.Cluster(`${ORG}-${stack}`, {
  vpcId: defaultVpcId,
  subnetIds: defaultVpcSubnetsIds,
  deployDashboard: false,
  nodeGroupOptions: {
    minSize: config.EKS_MIN_WORKER_NODE_NUMBER,
    maxSize: config.EKS_MAX_WORKER_NODE_NUMBER,
    desiredCapacity: config.EKS_DESIRED_WORKER_NODE_NUMBER,
    nodeRootVolumeEncrypted: true,
    amiId: config.EKS_NODE_AMI_ID, // pin Amazon EKS-optimized Amazon Linux 2 AMI to avoid accidental nodes destruction
    nodeRootVolumeSize: 100, // 100GB, reasonable default
    extraNodeSecurityGroups: [securityGroup],
    instanceType: config.EKS_NODE_INSTANCE_TYPE,
    nodeRootVolumeType: config.EKS_NODE_ROOT_VOLUME_TYPE as
      | "standard"
      | "gp2"
      | "gp3"
      | "st1"
      | "sc1"
      | "io1",
  },
  roleMappings: [
    // Provides full administrator cluster access to the k8s cluster
    {
      groups: ["system:masters"],
      roleArn: eksMasterRole.arn,
      username: "pulumi:master-role-user", // not used but required
    },
  ],
});

if (stack == "prod" || stack == "dev" || stack == "eu" || stack == "au") {
  process.env.AWS_PROFILE = ""; // Reset the AWS_PROFILE
}

export const clusterKubeconfigOrigin = pulumi.secret(cluster.kubeconfig);

export const clusterKubeconfig = pulumi.secret(
  cluster.getKubeconfig({
    roleArn: eksMasterRole.arn,
  })
);

// This is the security group created by @pulumi/eks (Pulumi AWS Crosswalk) for the self-managed node groups; this is also added to the AWS EKS cluster's additional security groups field.
export const clusterNodeSecurityGroupId = cluster.nodeSecurityGroup.id;

// This is the security group that is created by AWS by default for all new EKS clusters. This is an EKS created security group that applied to ENI that is attached to EKS Control Plane master nodes, as well as any managed workloads.
export const clusterSecurityGroupId = cluster.clusterSecurityGroup.id;
```
I encounter this issue with the Pulumi script above.
Can you expand a bit around "trying whether you can use the `getKubeconfig` method instead to set the `profileName` to use"?
Florian:
The cluster class exposes a method called `getKubeconfig` that you can use to generate a kubeconfig instead of using the one from the output. It allows you to set an AWS profile name to be included. Using that, you should be able to generate a kubeconfig that looks like the old v2.2.1 one.
Ah wait, I see that you're already using that one.
Charlie:
That way only affects downstream components that rely upon the cluster generated by the current Pulumi stack.
Florian:
In that case it should be good enough to explicitly add the `profileName` argument, like so I think:
```typescript
cluster.getKubeconfig({
    roleArn: eksMasterRole.arn,
    profileName: "..."
})
```
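A sketch of how that kubeconfig could then be wired into downstream resources, reusing `cluster` and `eksMasterRole` from the program above (the profile name and resource names are illustrative):

```typescript
import * as k8s from "@pulumi/kubernetes";

// Generate a kubeconfig that pins the AWS profile (and assumes the master
// role), then hand it to an explicit Kubernetes provider.
const kubeconfigWithProfile = cluster.getKubeconfig({
    roleArn: eksMasterRole.arn,
    profileName: "brainfish-prod", // illustrative; use the profile the deployment relies on
});

const k8sProvider = new k8s.Provider("eks-with-profile", {
    kubeconfig: kubeconfigWithProfile,
});

// Downstream resources then opt into this provider explicitly.
const appsNamespace = new k8s.core.v1.Namespace(
    "apps",
    {},
    { provider: k8sProvider }
);
```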
Charlie:
What actually puzzles me is the chicken-and-egg problem in the current Pulumi EKS stack. The Pulumi EKS cluster module generates an AWS EKS cluster that includes a ConfigMap granting the current AWS IAM role sufficient access. If I'm changing the ConfigMap auth, the script should use the kubeconfig from history; however, it generates a new kubeconfig on the fly, assuming the kubeconfig is the previous one.
Not sure if I expressed my confusion clearly 😂
Florian:
Can you go into more detail about that last bit? I don't think I'm fully following 😅
Charlie:
My understanding of how Pulumi EKS Crosswalk v2(.2.1) works:
1. `@pulumi/eks` generates an AWS EKS cluster and an aws-auth ConfigMap by default (historically the mode preferred by AWS).
2. That ConfigMap includes the IAM role used to create the cluster and grants that role system:masters permissions (sketched below).
3. Therefore, follow-up updates against that cluster succeed because of this.
4. If there is any change to the ConfigMap auth (in the current case, the upgraded package's behaviour removes fields like the profile name), the change will fail.
5. This is because the ConfigMap only allows a very specific IAM role combination, and the change to the default provider caused by upgrading the eks package fails authentication.
6. This becomes the deadlock of the upgrade.
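For context, the aws-auth ConfigMap described in step 2 looks roughly like the sketch below (illustrative only, not the provider's actual implementation; it reuses `eksMasterRole` from the program above):

```typescript
import * as k8s from "@pulumi/kubernetes";
import * as pulumi from "@pulumi/pulumi";

// Rough shape of kube-system/aws-auth: each roleMappings entry becomes an item
// under mapRoles, which is what grants the mapped IAM role cluster access.
const awsAuthSketch = new k8s.core.v1.ConfigMap("aws-auth-sketch", {
    metadata: { name: "aws-auth", namespace: "kube-system" },
    data: {
        mapRoles: pulumi.interpolate`- rolearn: ${eksMasterRole.arn}
  username: pulumi:master-role-user
  groups:
    - system:masters
`,
    },
});
```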
Florian:
EKS grants the IAM principal used during cluster creation admin access by default (otherwise you couldn't even create the aws-auth ConfigMap, for example). The provider actually uses two different kubeconfigs to get around the profile issue you mentioned in 4). The one it uses internally actually still includes the profile name, so this should still continue working.
What part fails exactly for you? Using the generated kubeconfig, or the cluster updating the aws-auth ConfigMap internally?
Charlie:
Wow, I didn't know there were two different kubeconfigs. "Cluster updating the aws-auth ConfigMap internally" is what's failing at the moment, afaik.
I saw this error diagnosis info.
This is the preview:
[screenshot: CleanShot 2025-02-14 at 19.49.29@2x.png]
This is the error:
Florian:
Yeah, this is all very convoluted, sadly. Luckily the API access mode improves this situation a lot. That diff is expected (that's the profile change), but this is the "user facing" k8s provider. You should see another k8s provider in your preview called `brainfish-prod-eks-k8s`; that's the one used for the auth ConfigMap. Do you see any changes for that one?
Charlie:
Now I'm re-reading your first comment about comparing the kubeconfig diff between the aws eks command and the Pulumi k8s provider. I said the only diff was the AWS region field. I just ran one last experiment to expose the missing AWS region env var explicitly, and tada, it works. Oh my god, how surprising!
I suspect the pulumi/eks package changed its behaviour in terms of fetching the AWS region field between v2.2.1 and v2.8.1.
Thanks for letting me rubber-duck you, your hints are invaluable!
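One way to make that explicit in code, mirroring the AWS_PROFILE handling in the program above (the region value is a placeholder and `stack` comes from the existing stack reference):

```typescript
// Expose the cluster's AWS region to the credential helper invoked by the
// generated kubeconfig; replace the placeholder with the cluster's region.
if (stack == "prod" || stack == "eu" || stack == "au") {
    process.env.AWS_REGION = "<cluster-region>";
}
```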
Florian:
Mhm, I don't see us setting the region at all on v2 (either of the versions you mentioned). What happens if you try to change the aws-auth ConfigMap on v2.2.1 (e.g. add an additional IAM role)? Maybe this would've also failed there, it just never surfaced. Btw, on v3 we're setting the region now, so it should "just work" there 🤞
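For reference, a sketch of that experiment: a second, hypothetical admin role added to `roleMappings`, which forces the provider to update the aws-auth ConfigMap through its internal kubeconfig (reusing `aws`, `pulumi`, `awsAccountId`, and `eksMasterRole` from the program above):

```typescript
// Hypothetical second role, created only to trigger an aws-auth ConfigMap update.
const auditRole = new aws.iam.Role("eks-audit-role", {
    assumeRolePolicy: pulumi.interpolate`{
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Principal": { "AWS": "arn:aws:iam::${awsAccountId}:root" },
        "Action": "sts:AssumeRole"
      }]
    }`,
});

// In the existing eks.Cluster args, roleMappings would then become:
//   roleMappings: [
//     { groups: ["system:masters"], roleArn: eksMasterRole.arn, username: "pulumi:master-role-user" },
//     { groups: ["system:masters"], roleArn: auditRole.arn, username: "pulumi:audit-role-user" },
//   ],
```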