# aws
b
can you share your code?
p
We make this call several times with a few different sets of settings. There's more code, but I'm not sure how much I can share.
```python
eks.ManagedNodeGroup(
    ng_name,
    node_role_arn=ec2_role.arn,
    cluster=cluster.core,
    ami_type='CUSTOM' if ami_id else None,
    # Instance type will be specified in the launch template
    # instance_types=[node_type],
    scaling_config=aws.eks.NodeGroupScalingConfigArgs(
        desired_size=max(node['min_node_count'], 1),
        min_size=node['min_node_count'],
        max_size=node['max_node_count']),
    subnet_ids=ng_subnet_ids,
    # The node version determines the AMI id; if AMI id is already specified there is no need for node version
    version=node_version if not ami_id else None,
    tags=ng_tags,
    labels=node_labels,
    launch_template=aws.eks.NodeGroupLaunchTemplateArgs(
        id=template.id,
        version=template.latest_version,
    ),
    capacity_type=node.get('capacity_type', 'ON_DEMAND'),
    taints=[
        aws.eks.NodeGroupTaintArgs(effect=taint.get('effect'), key=taint.get('key'),
                                   value=taint.get('value'))
        for taint in node.get('taints', [])
    ],
    opts=ResourceOptions(ignore_changes=["scalingConfig.desiredSize"]),
)
```
b
these two lines look suspicious:
```python
capacity_type=node.get('capacity_type', 'ON_DEMAND'),
taints=[
    aws.eks.NodeGroupTaintArgs(effect=taint.get('effect'), key=taint.get('key'),
                               value=taint.get('value'))
    for taint in node.get('taints', [])
],
```
have they ever worked? if you comment them out does it work?
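(For reference, one way to do the bisection b suggests without repeatedly editing the call is to build the two suspect arguments as a separate kwargs dict behind a flag. This is only a sketch: `INCLUDE_SUSPECT_ARGS` and the sample `node` dict are hypothetical, and the rest mirrors the names in the snippet above.)
```python
import pulumi_aws as aws

# Hypothetical toggle for bisecting the failure; `node` mirrors the dict
# used in the original snippet.
INCLUDE_SUSPECT_ARGS = True
node = {'capacity_type': 'SPOT',
        'taints': [{'key': 'dedicated', 'value': 'gpu', 'effect': 'NO_SCHEDULE'}]}

suspect_kwargs = {}
if INCLUDE_SUSPECT_ARGS:
    suspect_kwargs['capacity_type'] = node.get('capacity_type', 'ON_DEMAND')
    suspect_kwargs['taints'] = [
        aws.eks.NodeGroupTaintArgs(effect=t.get('effect'), key=t.get('key'),
                                   value=t.get('value'))
        for t in node.get('taints', [])
    ]

# Then pass them through as
#   eks.ManagedNodeGroup(ng_name, ..., **suspect_kwargs)
# and flip the toggle to see whether the gRPC error goes away.
```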
p
Those lines have worked in the past. Is there some way to debug this?
b
not super easily, something in your code is sending incorrect gRPC messages to the engine
there’s a property that’s undefined, basically. you’ll have to comment out or print properties, or use a debugger
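(A minimal sketch of the print-style debugging mentioned here, using only the standard Pulumi Python SDK; `debug_output` is a made-up helper name, not an SDK function.)
```python
import pulumi

def debug_output(name, value):
    # Logs the resolved value of an Output (or plain value) while the
    # program runs, so you can see what the engine will actually receive.
    pulumi.Output.from_input(value).apply(
        lambda v: pulumi.log.info(f"{name} = {v!r}")
    )

# e.g. next to the ManagedNodeGroup call above:
# debug_output("ng_tags", ng_tags)
# debug_output("labels", node_labels)
```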
p
Is there a way to find the underlying proto to see what "map" is?
b
what do you mean?
p
Is it possible to figure out which field might be set incorrectly here? As for debug logging, I'm not sure how to enable it using the Python SDK.
b
we are having an internal discussion about ways to improve this, but the underlying component is https://github.com/pulumi/pulumi-eks/blob/master/nodejs/eks/nodegroup.ts. anywhere there’s a call to `map` may be the issue, so maybe look at the `tags` input? https://github.com/pulumi/pulumi-eks/blob/master/nodejs/eks/nodegroup.ts#L1313 so likely the value of `ng_tags`
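(If the `tags` input is the suspect, one generic way to see the value the engine receives is to export it as a stack output. A sketch, assuming the `ng_tags` variable from the code below; the output name is arbitrary.)
```python
import pulumi

# Surfaces the resolved tags dict in `pulumi up` / `pulumi stack output`,
# which helps spot an unexpected missing or undefined entry.
pulumi.export("debug_ng_tags", ng_tags)
```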
p
I can share a bit more code. I don't think tags are the issue; several other instance groups get set up without a problem.
```python
ng_zones_list = list(map(lambda z: [z], node.get('node_zones', [])))
ng_multizones_lists = node.get('multizone_nodegroup_zones', [])
is_multizone = bool(ng_multizones_lists)
# if specified, use multizone_nodegroup_zones, else use node_zones (with 1 zone per nodegroup)
for node_zone_list in (ng_multizones_lists if is_multizone else ng_zones_list):
    # include first subnet name in nodegroup name (even if multizone)
    zone = node_zone_list[0]
    sn_name = subnet_names[zone]
    ng_name = apply_name_overrides(f"{node['group_name']}{'-mz' if is_multizone else ''}-{sn_name}")
    ng_subnet_ids = list(map(lambda z: subnet_ids[z], node_zone_list))
    ng_tags = cluster.eks_cluster.name.apply(lambda cname: {
        "k8s.io/cluster-autoscaler/enabled": "true",
        f"k8s.io/cluster-autoscaler/{cname}": "true",
        **tags,
        **({'Name': ng_name} if tags else {})
    })

    default_capacity_type = 'spot' if node.get('capacity_type') == 'SPOT' else 'ondemand'
    node_labels[CAPACITY_TYPE_LABEL] = node_labels.get(CAPACITY_TYPE_LABEL, default_capacity_type)

    eks.ManagedNodeGroup(
        ng_name,
        node_role_arn=ec2_role.arn,
        cluster=cluster.core,
        ami_type='CUSTOM' if ami_id else None,
        # Instance type will be specified in the launch template
        # instance_types=[node_type],
        scaling_config=aws.eks.NodeGroupScalingConfigArgs(
            desired_size=max(node['min_node_count'], 1),
            min_size=node['min_node_count'],
            max_size=node['max_node_count']),
        subnet_ids=ng_subnet_ids,
        # The node version determines the AMI id; if AMI id is already specified there is no need for node version
        version=node_version if not ami_id else None,
        tags=ng_tags,
        labels=node_labels,
        launch_template=aws.eks.NodeGroupLaunchTemplateArgs(
            id=template.id,
            version=template.latest_version,
        ),
        capacity_type=node.get('capacity_type', 'ON_DEMAND'),
        taints=[
            aws.eks.NodeGroupTaintArgs(effect=taint.get('effect'), key=taint.get('key'),
                                       value=taint.get('value'))
            for taint in node.get('taints', [])
        ],
        opts=ResourceOptions(ignore_changes=["scalingConfig.desiredSize"]),
    )
```
I'm not able to run a debugger super easily, but I can do print debugging.
Some additional context: That nodegroup got created and is healthy in AWS. I think this is a Pulumi bug.
b
More than likely. If you have a reliable repro, filing an issue would be great.
p
Are there any possible workarounds you could recommend?
I checked and this appears to be the relevant call. https://www.npmjs.com/package/@pulumi/eks?activeTab=code
```typescript
// Check that the nodegroup role has been set on the cluster to
// ensure that the aws-auth configmap was properly formed.
const nodegroupRole = pulumi.all([core.instanceRoles, roleArn]).apply(([roles, rArn]) => {
    // Map out the ARNs of all of the instanceRoles.
    const roleArns = roles.map((role) => {
        return role.arn;
    });
    // Try finding the nodeRole in the ARNs array.
    return pulumi.all([roleArns, rArn]).apply(([arns, arn]) => {
        return arns.find((a) => a === arn);
    });
});
```
Some more context: running `pulumi refresh` reproduces this.
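(A rough Python rendering of what that TypeScript check does, purely to make the shape of the `map` call concrete; `instance_roles` is assumed to be a list of IAM role resources, mirroring `core.instanceRoles`.)
```python
import pulumi

def find_node_role(instance_roles, role_arn):
    # Collect the ARN of every instance role, then look for the node role's
    # ARN among them, mirroring the roles.map(...) / arns.find(...) logic
    # in the TypeScript snippet above.
    role_arns = pulumi.Output.all(*[role.arn for role in instance_roles])
    return pulumi.Output.all(role_arns, role_arn).apply(
        lambda args: next((arn for arn in args[0] if arn == args[1]), None)
    )
```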