# kubernetes
b
I am trying to create a Pulumi project that automatically populates the aws-auth ConfigMap with user mappings using .NET. The EKS Crosswalk appears to do this, but as that's not available for dotnet yet, I was trying to do it myself.
Copy code
var configMap = new ConfigMap("aws-auth", new ConfigMapArgs()
{
    Metadata = new ObjectMetaArgs()
    {
        Namespace = "kube-system",
        Name = "aws-auth"
    },
    Data = new InputMap<string>()
    {
        ["mapRoles"] = workerNodeRoleArn.Apply(arn =>
            new[] {
                //recreate default aws node role map
                new
                {
                    groups = new[]
                    {
                        "system:bootstrappers",
                        "system:nodes"
                    },
                    rolearn = arn,
                    username = "system:node:{{EC2PrivateDNSName}}"
                }
            }.ToYaml()
        )
    }
});
The issue is that, as is, Pulumi complains that the resource already exists. I don't want to import it (EKS Crosswalk doesn't appear to import it either) as I want this to work without manual intervention on brand new EKS clusters. How does Crosswalk do it, and how can I get Pulumi to take control of this ConfigMap without importing it? At this point, it would be fine if I could just delete that ConfigMap and re-create it, but I don't think Pulumi supports that either.
b
Hey, I just spun up a brand new EKS cluster, and that configmap doesn't exist. I don't see a provider being set on your configmap resource, are you sure this pulumi config isn't being applied to the cluster in your $KUBECONFIG?
how did you create your cluster @bumpy-motorcycle-53357?
b
I created the cluster using Pulumi.Aws.Eks.Cluster. But that ConfigMap gets created in AWS using Terraform or even the AWS Console. It's how AWS maps IAM users to K8s
b
yeah I understand the need for it to exist, but it seems in your case, it already exists and was created by something out of band? in your original post you said "I want this to work without manual intervention on brand new EKS clusters", and it should work without issue. I guess I'm trying to determine what created the configmap in this case. if you're happy to delete it, can you just delete it manually and go from there?
b
AWS creates the config map, but only the creating user gets access to the cluster. You then have to manually add mappings per user (IAM groups aren't supported). It's a very manual and very tedious process. Hence the desire to control it using Pulumi
b
ah, it just showed up in my brand new cluster, so it seems to be async somehow, how strange
b
EKS Crosswalk does it somehow, and I'm trying to figure out how they do it since Crosswalk isn't available for dotnet yet. https://www.pulumi.com/docs/guides/crosswalk/aws/eks/#managing-eks-cluster-authentication-with-iam
b
this is a little curious to me, I'll try to dig a little deeper later today, I'm doing something similar with the Go SDK at the moment
b
correct. and that's what I tried. But since AWS already creates that ConfigMap, Pulumi complains that the resource already exists and fails. Hence the confusion.
I don't get how CrossWalk gets around that issue
b
me neither at this point
w
I'm in the same boat
b
So after several spin ups/downs, I discovered that the aws-auth ConfigMap is only created after the first Node Group is created (not during the cluster creation itself). So if I create the ConfigMap, and make the Node Group dependent on it (to ensure the ConfigMap gets created first), then Pulumi gets the chance to create and own it. Of course, you have to create a valid ConfigMap from scratch. Here's the C# helper method for creating the config map in case it's helpful.
Copy code
private ConfigMap AwsAuthConfigMap(Output<string> workerNodeRoleArn)
{
    static object CreateUserMap(GetUserResult user, params string[] groups)
    {
        return new
        {
            userarn = user.Arn,
            username = user.UserName,
            groups
        };
    }

    var adminUsernames = _config.GetObject<List<string>>("eks-admins") ?? Enumerable.Empty<string>();
    var devUsernames = _config.GetObject<List<string>>("eks-devs") ?? Enumerable.Empty<string>();

    var admins = adminUsernames.Select(x => GetUser.InvokeAsync(new GetUserArgs() { UserName = x }).Result).ToArray();
    var devs = devUsernames.Select(x => GetUser.InvokeAsync(new GetUserArgs() { UserName = x }).Result).ToArray();

    var mapUsers = admins.Select(x => CreateUserMap(x, "system:masters"))
        .Concat(devs.Select(x => CreateUserMap(x, "system:basic-user")))
        .ToArray()
        .ToYaml();

    var configMap = new ConfigMap("aws-auth", new ConfigMapArgs()
    {
        Metadata = new ObjectMetaArgs()
        {
            Namespace = "kube-system",
            Name = "aws-auth"
        },
        Data = new InputMap<string>()
        {
            ["mapRoles"] = workerNodeRoleArn.Apply(arn =>
                new[] {
                        //recreate default aws node role map
                        new
                        {
                            groups = new[]
                            {
                                "system:bootstrappers",
                                "system:nodes"
                            },
                            rolearn = arn,
                            username = "system:node:{{EC2PrivateDNSName}}"
                        }
                }.ToYaml()
            ),
            ["mapUsers"] = mapUsers
        }
    });

    return configMap;
}
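The other half of this approach, making the Node Group depend on the ConfigMap, might look roughly like the sketch below. This is an illustration only, not the poster's actual code: the helper name `CreateNodeGroupAfterAuthMap`, the resource name "workers", the subnet input, and the scaling numbers are all hypothetical placeholders.

```csharp
using Pulumi;
using Pulumi.Aws.Eks;
using Pulumi.Aws.Eks.Inputs;
using Pulumi.Kubernetes.Core.V1;

static class NodeGroupOrdering
{
    // Hypothetical helper: creates a managed node group that Pulumi will only
    // start creating after the aws-auth ConfigMap has been created.
    public static NodeGroup CreateNodeGroupAfterAuthMap(
        Cluster cluster,
        Output<string> workerNodeRoleArn,
        InputList<string> subnetIds,
        ConfigMap awsAuthConfigMap)
    {
        return new NodeGroup("workers", new NodeGroupArgs
        {
            ClusterName = cluster.Name,
            NodeRoleArn = workerNodeRoleArn,
            SubnetIds = subnetIds,
            // Placeholder sizes; adjust for the real workload.
            ScalingConfig = new NodeGroupScalingConfigArgs
            {
                DesiredSize = 2,
                MinSize = 1,
                MaxSize = 3
            }
        },
        new CustomResourceOptions
        {
            // Explicit dependency: the ConfigMap is created first, so Pulumi
            // owns it before EKS gets a chance to create its own copy.
            DependsOn = { awsAuthConfigMap }
        });
    }
}
```

Without the explicit DependsOn there is nothing linking the two resources, so Pulumi is free to create them in parallel and the "already exists" error can come back.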
b
@bumpy-motorcycle-53357 awesome find! nice work!
w
@bumpy-motorcycle-53357 good to know. Thanks!
I've been testing creating a cluster from scratch and this always fails with a timeout:
Copy code
Diagnostics:
  kubernetes:core:ConfigMap (alpha-eks-auth):
    error: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get "https://<endpoint>.gr7.us-west-2.eks.amazonaws.com/openapi/v2?timeout=32s": dial tcp <ip>:443: i/o timeout
I've tried configuring a create timeout to no avail:
Copy code
var k8sProvider = new Provider($"{name}-k8s", new ProviderArgs { KubeConfig = KubeConfig });

// aws-auth config map
var authConfigMap = new ConfigMap($"{name}-auth", new ConfigMapArgs
{
    Metadata = new ObjectMetaArgs
    {
        Namespace = "kube-system",
        Name = "aws-auth"
    },
    Data =
    {
        ["mapRoles"] = IamHelpers.GetRoleMappings(nodeRole, awsAccountId),
        ["mapUsers"] = IamHelpers.GetUserMappings()
    }
},
new CustomResourceOptions { CustomTimeouts = new CustomTimeouts { Create = TimeSpan.FromMinutes(1) }, Provider = k8sProvider });
I also tried adding a dependency on the cluster resource but this also made no difference; from what I can see this is already an implicit dependency of the k8s provider.
There seems to be a delay between the cluster being determined as created and actually being usable
Is there a way to configure the timeout=32s query parameter above?
@gorgeous-egg-16927 any ideas?
b
weird. I noticed the short delay in availability, but it never timed out on me. It just waited the few seconds and then started working.
g
I think @breezy-hamburger-69619 would know
w
This shows that it took ~2.5m before the EKS cluster was actually responsive 🤔
I ran a quick and dirty test with the following:
Copy code
date && pulumi up --skip-preview --suppress-outputs
date && mkdir -p /home/user/.kube && pulumi stack output --show-secrets KubeConfig > /home/user/.kube/alpha
while true; do date && kubectl --kubeconfig=/home/user/.kube/alpha version; done
With the following output:
Copy code
user@e6d07659f7d8:/workspaces/gemini-pulumi/eks-infra$ date && pulumi up --skip-preview --suppress-outputs
Wed Jul 22 02:16:41 UTC 2020
Updating (pharos/alpha):
     Type                                Name                          Status      
     pulumi:pulumi:Stack                 eks-infra-alpha                           
 +   ├─ aws:iam:Role                     alpha-eks-cluster-role        created
 +   │  └─ aws:iam:RolePolicyAttachment  alpha-eks-cluster-rp-cluster  created
 +   ├─ aws:iam:Role                     alpha-eks-node-role           created
 +   │  ├─ aws:iam:RolePolicyAttachment  alpha-eks-node-rp-cni         created
 +   │  ├─ aws:iam:RolePolicyAttachment  alpha-eks-node-rp-node        created
 +   │  ├─ aws:iam:RolePolicy            alpha-eks-node-rp-alb         created
 +   │  └─ aws:iam:RolePolicyAttachment  alpha-eks-node-rp-ecr         created
 +   └─ aws:eks:Cluster                  alpha-eks-cluster             created
 
Resources:
    + 8 created
    1 unchanged

Duration: 10m20s

Permalink: https://app.pulumi.com/pharos/eks-infra/alpha/updates/79

user@e6d07659f7d8:/workspaces/gemini-pulumi/eks-infra$ date && mkdir -p /home/user/.kube && pulumi stack output --show-secrets KubeConfig > /home/user/.kube/alpha
Wed Jul 22 02:27:05 UTC 2020
user@e6d07659f7d8:/workspaces/gemini-pulumi/eks-infra$ while true; do date && kubectl --kubeconfig=/home/user/.kube/alpha version; done
Wed Jul 22 02:27:07 UTC 2020
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.4", GitCommit:"c96aede7b5205121079932896c4ad89bb93260af", GitTreeState:"clean", BuildDate:"2020-06-17T11:41:22Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Unable to connect to the server: dial tcp 35.155.125.57:443: i/o timeout
Wed Jul 22 02:27:37 UTC 2020
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.4", GitCommit:"c96aede7b5205121079932896c4ad89bb93260af", GitTreeState:"clean", BuildDate:"2020-06-17T11:41:22Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Unable to connect to the server: dial tcp 35.155.125.57:443: i/o timeout
Wed Jul 22 02:28:07 UTC 2020
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.4", GitCommit:"c96aede7b5205121079932896c4ad89bb93260af", GitTreeState:"clean", BuildDate:"2020-06-17T11:41:22Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Unable to connect to the server: dial tcp 44.227.187.110:443: i/o timeout
Wed Jul 22 02:28:37 UTC 2020
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.4", GitCommit:"c96aede7b5205121079932896c4ad89bb93260af", GitTreeState:"clean", BuildDate:"2020-06-17T11:41:22Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Unable to connect to the server: dial tcp 44.227.187.110:443: i/o timeout
Wed Jul 22 02:29:07 UTC 2020
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.4", GitCommit:"c96aede7b5205121079932896c4ad89bb93260af", GitTreeState:"clean", BuildDate:"2020-06-17T11:41:22Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Unable to connect to the server: dial tcp 35.155.125.57:443: i/o timeout
Wed Jul 22 02:29:37 UTC 2020
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.4", GitCommit:"c96aede7b5205121079932896c4ad89bb93260af", GitTreeState:"clean", BuildDate:"2020-06-17T11:41:22Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17+", GitVersion:"v1.17.6-eks-4e7f64", GitCommit:"4e7f642f9f4cbb3c39a4fc6ee84fe341a8ade94c", GitTreeState:"clean", BuildDate:"2020-06-11T13:55:35Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Wed Jul 22 02:29:38 UTC 2020
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.4", GitCommit:"c96aede7b5205121079932896c4ad89bb93260af", GitTreeState:"clean", BuildDate:"2020-06-17T11:41:22Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17+", GitVersion:"v1.17.6-eks-4e7f64", GitCommit:"4e7f642f9f4cbb3c39a4fc6ee84fe341a8ade94c", GitTreeState:"clean", BuildDate:"2020-06-11T13:55:35Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
...
b
IME these delays from the EKS service can occur, and it can also be some minutes before nodes are marked READY.
In most cases delays are eventual consistency, but 2+ min is more indicative of EKS just having an odd week
w
So how can I synchronize with it, waiting for it to be ready, so the aws-auth config map and node group can subsequently be created?
b
This is how it's done for p/eks in TS - we poll for the API server to be ready: https://github.com/pulumi/pulumi-eks/blob/master/nodejs/eks/cluster.ts#L468 Multi lang EKS will help with this in the future.
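For .NET, a rough equivalent of that polling idea could be sketched as below. It is a minimal sketch under stated assumptions, not a port of the linked TypeScript: the class and method names are hypothetical, the `/healthz` path and the retry numbers are assumptions, and certificate validation is skipped only because the cluster CA is not in the local trust store at this point.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

static class ApiServerReadiness
{
    // Hypothetical helper: polls {endpoint}/healthz until it answers with a
    // success status, or throws once maxAttempts is exhausted.
    public static async Task WaitForApiServerAsync(
        string endpoint, int maxAttempts = 60, TimeSpan? interval = null)
    {
        var delay = interval ?? TimeSpan.FromSeconds(10);

        // Assumption: skip TLS validation for this probe only, since the
        // endpoint certificate is signed by the cluster's own CA.
        using var handler = new HttpClientHandler
        {
            ServerCertificateCustomValidationCallback =
                HttpClientHandler.DangerousAcceptAnyServerCertificateValidator
        };
        using var client = new HttpClient(handler) { Timeout = TimeSpan.FromSeconds(10) };

        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                using var response = await client.GetAsync($"{endpoint.TrimEnd('/')}/healthz");
                if (response.IsSuccessStatusCode)
                {
                    return; // API server is answering; safe to proceed.
                }
            }
            catch (HttpRequestException) { /* not reachable yet, retry */ }
            catch (TaskCanceledException) { /* request timed out, retry */ }

            if (attempt < maxAttempts)
            {
                await Task.Delay(delay);
            }
        }

        throw new TimeoutException(
            $"API server at {endpoint} was not ready after {maxAttempts} attempts.");
    }
}
```

One way to hook this in would be to derive the kubeconfig from something like `cluster.Endpoint.Apply(async e => { await ApiServerReadiness.WaitForApiServerAsync(e); return e; })`, so the Kubernetes provider, and with it the aws-auth ConfigMap and the node group, only resolves once the endpoint responds.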