# aws
Charlie:
Hi team, I'm bumping the `@pulumi/eks` version from v2.2.1 to v2.8.1 and I keep receiving errors like the one below:
```
kubernetes:core/v1:ConfigMap (brainfish-prod-nodeAccess):
    error: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: the server has asked for the client to provide credentials
```
I found a similar issue from five years ago in the Pulumi GitHub issues, so I suspect the root cause here is different. I'm struggling to debug this because the error isn't very indicative. The AWS EKS cluster is currently in ConfigMap-only auth mode; would you suggest I switch the auth mode to API before upgrading? Any hints are appreciated.
Florian:
Hey Charlie, upgrading to the API auth mode shouldn't be necessary to get this working, but I generally recommend it because it's easier to manage. As for this error, it hints at the current IAM principal not being able to access the cluster. Did you change the IAM role/user you use for deployments? I'd recommend checking whether this works without Pulumi, i.e. use the same role/user and run `aws eks update-kubeconfig ...` to configure the kubeconfig, followed by something like `kubectl get nodes` to confirm you can access the cluster. If that works, you could compare the kubeconfig the AWS CLI generates with the one the provider generates (it should be an output on the cluster component).
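As a minimal sketch of that comparison workflow (the `debug-cluster` definition and output name are stand-ins, not the real program):

```typescript
import * as eks from "@pulumi/eks";
import * as pulumi from "@pulumi/pulumi";

// Stand-in cluster definition so the snippet is self-contained; in practice
// this is the existing eks.Cluster in the stack.
const cluster = new eks.Cluster("debug-cluster");

// Export the kubeconfig the component generates (kept secret, since it can
// embed auth details).
export const generatedKubeconfig = pulumi.secret(cluster.kubeconfig);

// Outside of Pulumi:
//   pulumi stack output generatedKubeconfig --show-secrets > kubeconfig.json
//   KUBECONFIG=./kubeconfig.json kubectl get nodes
// and diff kubeconfig.json against the file written by `aws eks update-kubeconfig`.
```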
Charlie:
Thanks Florian. I checked with plain `aws eks` / `kubectl` commands and can confirm that the relevant information can be fetched. I also compared the kubeconfig generated by the AWS CLI with the one from the Pulumi provider; the only difference is the absence of the region on the Pulumi side. What confuses me is that this auth issue only occurs when I upgrade the pulumi/eks version. After I revert the version, the Pulumi script can access the EKS cluster again 😵‍💫
Florian:
I just remembered that there was a change to not include the profile name in the kubeconfig the provider auto-generates. It's not that issue, is it? https://github.com/pulumi/pulumi-eks/commit/fc90dafde9ae03627b15e7ff93768513789ba47e I assume you're using the `provider` output? Or are you creating a provider manually by using the `kubeconfig` output?
If it turns out that this is indeed caused by the profile, I'd recommend trying whether you can use the `getKubeconfig` method instead to set the `profileName` to use.
Charlie:
Aha moment! We do need the profile name in the kubeconfig, no wonder I found the profile name to be removed in the pulumi preview.
However, I'm updating the EKS cluster with the default provider:
```typescript
import * as aws from "@pulumi/aws";
import * as eks from "@pulumi/eks";
import * as pulumi from "@pulumi/pulumi";

import { ORG } from "./constants";
import {
  stack,
  defaultVpcSubnetsIds,
  defaultSecurityGroupId,
  defaultVpcId,
} from "./stackRef";
import * as config from "./config";

const securityGroup = aws.ec2.SecurityGroup.get(
  "default",
  defaultSecurityGroupId
);

const awsAccountId = aws
  .getCallerIdentity()
  .then((current) => current.accountId);

const eksMasterRole = new aws.iam.Role(`${ORG}-eks-${stack}-master-role`, {
  assumeRolePolicy: pulumi.interpolate`{
      "Version": "2012-10-17",
      "Statement":[
        {
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::${awsAccountId}:root"
          },
          "Action": "sts:AssumeRole"
        }
      ]
     }
  `,
});

// TODO: Remove this line when deploying to AWS
if (stack == "dev") {
  process.env.AWS_PROFILE = "brainfish-dev"; // Make it compatible with local deployment
}
if (stack == "prod" || stack == "eu" || stack == "au") {
  process.env.AWS_PROFILE = "brainfish-prod"; // Make it compatible with local deployment
}

const cluster = new eks.Cluster(`${ORG}-${stack}`, {
  vpcId: defaultVpcId,
  subnetIds: defaultVpcSubnetsIds,
  deployDashboard: false,
  nodeGroupOptions: {
    minSize: config.EKS_MIN_WORKER_NODE_NUMBER,
    maxSize: config.EKS_MAX_WORKER_NODE_NUMBER,
    desiredCapacity: config.EKS_DESIRED_WORKER_NODE_NUMBER,
    nodeRootVolumeEncrypted: true,
    amiId: config.EKS_NODE_AMI_ID, // pin Amazon EKS-optimized Amazon Linux 2 AMI to avoid accidental nodes destruction
    nodeRootVolumeSize: 100, // 100GB, reasonable default
    extraNodeSecurityGroups: [securityGroup],
    instanceType: config.EKS_NODE_INSTANCE_TYPE,
    nodeRootVolumeType: config.EKS_NODE_ROOT_VOLUME_TYPE as
      | "standard"
      | "gp2"
      | "gp3"
      | "st1"
      | "sc1"
      | "io1",
  },
  roleMappings: [
    // Provides full administrator cluster access to the k8s cluster
    {
      groups: ["system:masters"],
      roleArn: eksMasterRole.arn,
      username: "pulumi:master-role-user", // not used but required
    },
  ],
});

if (stack == "prod" || stack == "dev" || stack == "eu" || stack == "au") {
  process.env.AWS_PROFILE = ""; // Reset the AWS_PROFILE
}

export const clusterKubeconfigOrigin = pulumi.secret(cluster.kubeconfig);

export const clusterKubeconfig = pulumi.secret(
  cluster.getKubeconfig({
    roleArn: eksMasterRole.arn,
  })
);

// This is the security group created by @pulumi/eks (Pulumi AWS Crosswalk) for the self-managed node groups; this is also added to the AWS EKS cluster's additional security groups field.
export const clusterNodeSecurityGroupId = cluster.nodeSecurityGroup.id;

// This is the security group that is created by AWS by default for all new EKS clusters. This is an EKS created security group that applied to ENI that is attached to EKS Control Plane master nodes, as well as any managed workloads.
export const clusterSecurityGroupId = cluster.clusterSecurityGroup.id;
```
I encounter this issue with the Pulumi script above.
Can you expand a bit around "trying whether you can use the `getKubeconfig` method instead to set the `profileName` to use"?
Florian:
The cluster class exposes a method called `getKubeconfig` that you can use to generate a kubeconfig instead of using the one from the output. It allows you to set an AWS profile name to be included. Using that, you should be able to generate a kubeconfig that looks like the old v2.2.1 one.
Ah wait, I see that you're already using that one.
Charlie:
That way only affects downstream components that rely upon the cluster generated by the current Pulumi stack.
Florian:
In that case it should be good enough to explicitly add the `profileName` argument, like so I think:
```typescript
cluster.getKubeconfig({
    roleArn: eksMasterRole.arn,
    profileName: "..."
})
```
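A sketch of how that kubeconfig could then be wired into downstream resources, reusing `cluster` and `eksMasterRole` from the program above (the profile name and resource names are illustrative):

```typescript
import * as k8s from "@pulumi/kubernetes";

// Generate a kubeconfig that pins the AWS profile (and assumes the master
// role), then hand it to an explicit Kubernetes provider.
const kubeconfigWithProfile = cluster.getKubeconfig({
    roleArn: eksMasterRole.arn,
    profileName: "brainfish-prod", // illustrative; use the profile the deployment relies on
});

const k8sProvider = new k8s.Provider("eks-with-profile", {
    kubeconfig: kubeconfigWithProfile,
});

// Downstream resources then opt into this provider explicitly.
const appsNamespace = new k8s.core.v1.Namespace(
    "apps",
    {},
    { provider: k8sProvider }
);
```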
Charlie:
What actually puzzles me is the chicken-and-egg problem in the current Pulumi EKS stack. The Pulumi EKS cluster module generates an AWS EKS cluster that includes a ConfigMap granting the current AWS IAM role sufficient access. If I'm changing the ConfigMap auth, the script should use the kubeconfig from history; however, it generates a new kubeconfig on the fly, assuming the kubeconfig is the previous one.
Not sure if I expressed my confusion clearly 😂
Florian:
Can you go into more detail about that last bit? I don't think I'm fully following 😅
Charlie:
My understanding of how Pulumi EKS Crosswalk v2(.2.1) works:
1. `@pulumi/eks` generates an AWS EKS cluster and an aws-auth ConfigMap by default (historically the mode preferred by AWS).
2. That ConfigMap includes the IAM role used to create the cluster and grants that role system:masters permissions (sketched below).
3. Therefore, follow-up updates against that cluster succeed because of this.
4. If there is any change to the ConfigMap auth (in the current case, the upgraded package's behaviour removes fields like the profile name), the change will fail.
5. This is because the ConfigMap only allows a very specific IAM role combination, and the change to the default provider caused by upgrading the eks package fails authentication.
6. This becomes the deadlock of the upgrade.
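For context, the aws-auth ConfigMap described in step 2 looks roughly like the sketch below (illustrative only, not the provider's actual implementation; it reuses `eksMasterRole` from the program above):

```typescript
import * as k8s from "@pulumi/kubernetes";
import * as pulumi from "@pulumi/pulumi";

// Rough shape of kube-system/aws-auth: each roleMappings entry becomes an item
// under mapRoles, which is what grants the mapped IAM role cluster access.
const awsAuthSketch = new k8s.core.v1.ConfigMap("aws-auth-sketch", {
    metadata: { name: "aws-auth", namespace: "kube-system" },
    data: {
        mapRoles: pulumi.interpolate`- rolearn: ${eksMasterRole.arn}
  username: pulumi:master-role-user
  groups:
    - system:masters
`,
    },
});
```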
Florian:
EKS grants the IAM principal used during cluster creation admin access by default (otherwise you couldn't even create the aws-auth ConfigMap, for example). The provider actually uses two different kubeconfigs to get around the profile issue you mentioned in 4). The one it uses internally actually still includes the profile name, so this should still continue working.
What part fails exactly for you? Using the generated kubeconfig, or the cluster updating the aws-auth ConfigMap internally?
Charlie:
Wow, I didn't know there were two different kubeconfigs. "Cluster updating the aws-auth ConfigMap internally" is what's failing at the moment, afaik.
I saw this error diagnosis info.
This is the preview:
[screenshot: CleanShot 2025-02-14 at 19.49.29@2x.png]
This is the error:
Florian:
Yeah, this is all very convoluted, sadly. Luckily the API access mode improves this situation a lot. That diff is expected (that's the profile change), but this is the "user facing" k8s provider. You should see another k8s provider in your preview called `brainfish-prod-eks-k8s`; that's the one used for the auth ConfigMap. Do you see any changes for that one?
Charlie:
Now I'm re-reading your first comment about comparing the kubeconfig diff between the aws eks command and the Pulumi k8s provider. I said the only diff was the AWS region field. I just ran one last experiment to expose the missing AWS region env var explicitly, and tada, it works. Oh my god, how surprising!
I suspect the pulumi/eks package changed its behaviour in terms of fetching the AWS region field between v2.2.1 and v2.8.1.
Thanks for letting me rubber-duck you, your hints are invaluable!
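One way to make that explicit in code, mirroring the AWS_PROFILE handling in the program above (the region value is a placeholder and `stack` comes from the existing stack reference):

```typescript
// Expose the cluster's AWS region to the credential helper invoked by the
// generated kubeconfig; replace the placeholder with the cluster's region.
if (stack == "prod" || stack == "eu" || stack == "au") {
    process.env.AWS_REGION = "<cluster-region>";
}
```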
Florian:
Mhm, I don't see us setting the region at all on v2 (either of the versions you mentioned). What happens if you try to change the aws-auth ConfigMap on v2.2.1 (e.g. add an additional IAM role)? Maybe this would've also failed there, it just never surfaced. Btw, on v3 we're setting the region now, so it should "just work" there 🤞
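For reference, a sketch of that experiment: a second, hypothetical admin role added to `roleMappings`, which forces the provider to update the aws-auth ConfigMap through its internal kubeconfig (reusing `aws`, `pulumi`, `awsAccountId`, and `eksMasterRole` from the program above):

```typescript
// Hypothetical second role, created only to trigger an aws-auth ConfigMap update.
const auditRole = new aws.iam.Role("eks-audit-role", {
    assumeRolePolicy: pulumi.interpolate`{
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Principal": { "AWS": "arn:aws:iam::${awsAccountId}:root" },
        "Action": "sts:AssumeRole"
      }]
    }`,
});

// In the existing eks.Cluster args, roleMappings would then become:
//   roleMappings: [
//     { groups: ["system:masters"], roleArn: eksMasterRole.arn, username: "pulumi:master-role-user" },
//     { groups: ["system:masters"], roleArn: auditRole.arn, username: "pulumi:audit-role-user" },
//   ],
```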