Running into the following error with EKS. Has any...
# aws
p
Running into the following error with EKS. Has anyone seen something similar? (I'm from Pinecone).
aws:eks:NodeGroup eks-[cluster]-index-spots-mz-0 created (119s)
    pulumi:pulumi:Stack aws-[cluster]-us-west-2 **failed** 5 errors
    eks:index:ManagedNodeGroup eks-[cluster]-index-spots-mz-0

Diagnostics:
  pulumi:pulumi:Stack (aws-[cluster]):
    error: Program failed with an unhandled exception:
    Traceback (most recent call last):
      File "/home/runner/actions-runner/_work/iac/iac/pulumi/aws/venv/lib/python3.7/site-packages/pulumi/runtime/resource.py", line 602, in do_rpc_call
        return monitor.RegisterResource(req)
      File "/home/runner/actions-runner/_work/iac/iac/pulumi/aws/venv/lib/python3.7/site-packages/grpc/_channel.py", line 1030, in __call__
        return _end_unary_response_blocking(state, call, False, None)
      File "/home/runner/actions-runner/_work/iac/iac/pulumi/aws/venv/lib/python3.7/site-packages/grpc/_channel.py", line 910, in _end_unary_response_blocking
        raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
    grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    	status = StatusCode.UNKNOWN
    	details = "Cannot read properties of undefined (reading 'map')"
    	debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"Cannot read properties of undefined (reading \'map\')", grpc_status:2, created_time:"2023-05-18T00:39:30.364908383+00:00"}"
    >

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/home/runner/actions-runner/_work/_tool/pulumi/3.60.0/x64/pulumi-language-python-exec", line 197, in <module>
        loop.run_until_complete(coro)
      File "/home/runner/actions-runner/_work/_tool/Python/3.7.12/x64/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
        return future.result()
      File "/home/runner/actions-runner/_work/iac/iac/pulumi/aws/venv/lib/python3.7/site-packages/pulumi/runtime/stack.py", line 126, in run_in_stack
        await run_pulumi_func(lambda: Stack(func))
      File "/home/runner/actions-runner/_work/iac/iac/pulumi/aws/venv/lib/python3.7/site-packages/pulumi/runtime/stack.py", line 51, in run_pulumi_func
        await wait_for_rpcs()
      File "/home/runner/actions-runner/_work/iac/iac/pulumi/aws/venv/lib/python3.7/site-packages/pulumi/runtime/stack.py", line 110, in wait_for_rpcs
        raise exception
      File "/home/runner/actions-runner/_work/iac/iac/pulumi/aws/venv/lib/python3.7/site-packages/pulumi/runtime/rpc_manager.py", line 68, in rpc_wrapper
        result = await rpc
      File "/home/runner/actions-runner/_work/iac/iac/pulumi/aws/venv/lib/python3.7/site-packages/pulumi/output.py", line 98, in is_value_known
        return await is_known and not contains_unknowns(await future)
      File "/home/runner/actions-runner/_work/iac/iac/pulumi/aws/venv/lib/python3.7/site-packages/pulumi/output.py", line 98, in is_value_known
        return await is_known and not contains_unknowns(await future)
      File "/home/runner/actions-runner/_work/iac/iac/pulumi/aws/venv/lib/python3.7/site-packages/pulumi/output.py", line 98, in is_value_known
        return await is_known and not contains_unknowns(await future)
      [Previous line repeated 19 more times]
      File "/home/runner/actions-runner/_work/iac/iac/pulumi/aws/venv/lib/python3.7/site-packages/pulumi/runtime/resource.py", line 607, in do_register
        resp = await asyncio.get_event_loop().run_in_executor(None, do_rpc_call)
      File "/home/runner/actions-runner/_work/_tool/Python/3.7.12/x64/lib/python3.7/concurrent/futures/thread.py", line 57, in run
        result = self.fn(*self.args, **self.kwargs)
      File "/home/runner/actions-runner/_work/iac/iac/pulumi/aws/venv/lib/python3.7/site-packages/pulumi/runtime/resource.py", line 604, in do_rpc_call
        handle_grpc_error(exn)
      File "/home/runner/actions-runner/_work/iac/iac/pulumi/aws/venv/lib/python3.7/site-packages/pulumi/runtime/settings.py", line 276, in handle_grpc_error
        raise grpc_error_to_exception(exn)
    Exception: Cannot read properties of undefined (reading 'map')
    error: TypeError: Cannot read properties of undefined (reading 'map')
        at /snapshot/eks/bin/nodegroup.js:894:32
        at /snapshot/eks/node_modules/@pulumi/pulumi/output.js:257:35
        at Generator.next (<anonymous>)
        at /snapshot/eks/node_modules/@pulumi/pulumi/output.js:21:71
        at new Promise (<anonymous>)
        at __awaiter (/snapshot/eks/node_modules/@pulumi/pulumi/output.js:17:12)
        at applyHelperAsync (/snapshot/eks/node_modules/@pulumi/pulumi/output.js:236:12)
        at /snapshot/eks/node_modules/@pulumi/pulumi/output.js:190:65
        at processTicksAndRejections (node:internal/process/task_queues:95:5)
    error: TypeError: Cannot read properties of undefined (reading 'map')
        at /snapshot/eks/bin/nodegroup.js:894:32
        at /snapshot/eks/node_modules/@pulumi/pulumi/output.js:257:35
        at Generator.next (<anonymous>)
        at /snapshot/eks/node_modules/@pulumi/pulumi/output.js:21:71
        at new Promise (<anonymous>)
        at __awaiter (/snapshot/eks/node_modules/@pulumi/pulumi/output.js:17:12)
        at applyHelperAsync (/snapshot/eks/node_modules/@pulumi/pulumi/output.js:236:12)
        at /snapshot/eks/node_modules/@pulumi/pulumi/output.js:190:65
        at processTicksAndRejections (node:internal/process/task_queues:95:5)
    error: TypeError: Cannot read properties of undefined (reading 'map')
        at /snapshot/eks/bin/nodegroup.js:894:32
        at /snapshot/eks/node_modules/@pulumi/pulumi/output.js:257:35
        at Generator.next (<anonymous>)
        at /snapshot/eks/node_modules/@pulumi/pulumi/output.js:21:71
        at new Promise (<anonymous>)
        at __awaiter (/snapshot/eks/node_modules/@pulumi/pulumi/output.js:17:12)
        at applyHelperAsync (/snapshot/eks/node_modules/@pulumi/pulumi/output.js:236:12)
        at /snapshot/eks/node_modules/@pulumi/pulumi/output.js:190:65
        at processTicksAndRejections (node:internal/process/task_queues:95:5)
    error: TypeError: Cannot read properties of undefined (reading 'map')
        at /snapshot/eks/bin/nodegroup.js:894:32
        at /snapshot/eks/node_modules/@pulumi/pulumi/output.js:257:35
        at Generator.next (<anonymous>)
        at /snapshot/eks/node_modules/@pulumi/pulumi/output.js:21:71
        at new Promise (<anonymous>)
        at __awaiter (/snapshot/eks/node_modules/@pulumi/pulumi/output.js:17:12)
        at applyHelperAsync (/snapshot/eks/node_modules/@pulumi/pulumi/output.js:236:12)
        at /snapshot/eks/node_modules/@pulumi/pulumi/output.js:190:65
        at processTicksAndRejections (node:internal/process/task_queues:95:5)
b
can you share your code?
p
We make this call several times with a few different sets of settings. There's more code, but I'm not sure how much I can share.
            eks.ManagedNodeGroup(
                ng_name,
                node_role_arn=ec2_role.arn,
                cluster=cluster.core,
                ami_type='CUSTOM' if ami_id else None,
                # Instance type will be specified in the launch template
                # instance_types=[node_type],
                scaling_config=aws.eks.NodeGroupScalingConfigArgs(
                    desired_size=max(node['min_node_count'], 1),
                    min_size=node['min_node_count'],
                    max_size=node['max_node_count']),
                subnet_ids=ng_subnet_ids,
                # The node version determines the AMI id, if AMI id already specified no need for node version
                version=node_version if not ami_id else None,
                tags=ng_tags,
                labels=node_labels,
                launch_template=aws.eks.NodeGroupLaunchTemplateArgs(
                    id=template.id,
                    version=template.latest_version,
                ),
                capacity_type=node.get('capacity_type', 'ON_DEMAND'),
                taints=[
                    aws.eks.NodeGroupTaintArgs(effect=taint.get('effect'), key=taint.get('key'),
                                               value=taint.get('value'))
                    for taint in node.get('taints', [])
                ],
                opts=ResourceOptions(ignore_changes=["scalingConfig.desiredSize"]),
            )
b
these two lines look suspicious:
                capacity_type=node.get('capacity_type', 'ON_DEMAND'),
                taints=[
                    aws.eks.NodeGroupTaintArgs(effect=taint.get('effect'), key=taint.get('key'),
                                               value=taint.get('value'))
                    for taint in node.get('taints', [])
                ],
have they ever worked? if you comment them out does it work?
p
Those lines have worked in the past. Is there some way to debug this?
b
not super easily, something in your code is sending incorrect grpc messages to the engine
there’s a property that’s undefined basically. you’ll have to comment or print properties or use a debugger
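A minimal sketch of the "print properties" approach, in plain Python so it can run outside Pulumi. It lists which keyword arguments would be passed as `None` before the resource call is made. The helper name `unset_arguments` and the sample values are hypothetical, and it only inspects top-level values, not anything nested or wrapped in a Pulumi `Output`.

```python
from typing import Any, Dict, List


def unset_arguments(kwargs: Dict[str, Any]) -> List[str]:
    """Return the names of keyword arguments whose value is None, sorted."""
    return sorted(name for name, value in kwargs.items() if value is None)


# Hypothetical values standing in for the real ones in the program.
ami_id = "ami-0123456789abcdef0"
node_version = "1.24"

args = {
    "ami_type": "CUSTOM" if ami_id else None,
    "version": node_version if not ami_id else None,
    "capacity_type": "SPOT",
}

# Print the argument names that would reach the resource as None.
print(unset_arguments(args))  # prints ['version']
```

Printing this list just before each `eks.ManagedNodeGroup(...)` call would show whether the failing node group differs from the ones that succeed.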
p
Is there a way to find the underlying proto to see what "map" is?
b
what do you mean?
p
Is it possible to figure out what field might be set incorrectly here? In terms of debug logging I'm not sure how to enable it using the python SDK.
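On debug logging: verbose engine logs are switched on from the Pulumi CLI rather than from the Python SDK, using the `--logtostderr` and `-v` flags. Below is a small sketch of a wrapper that builds such a command for a CI job. The function name `verbose_pulumi_command` is hypothetical, and verbosity level 9 is only a commonly used value, not a requirement. Inside the program itself, the `pulumi.log` module (for example `pulumi.log.debug`) can emit messages into the same output.

```python
from typing import List


def verbose_pulumi_command(operation: str, verbosity: int = 9) -> List[str]:
    """Build a Pulumi CLI invocation with verbose logging sent to stderr."""
    return ["pulumi", operation, "--logtostderr", f"-v={verbosity}"]


# Show the command line that would be run for a refresh.
print(" ".join(verbose_pulumi_command("refresh")))
```

The resulting list can be passed to `subprocess.run`, with stderr redirected to a file, since the verbose output is large.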
b
we are having an internal discussion about ways to improve this, but the underlying component is https://github.com/pulumi/pulumi-eks/blob/master/nodejs/eks/nodegroup.ts and anywhere there’s a call to `map` may be the issue, so maybe look at the `tags` input? https://github.com/pulumi/pulumi-eks/blob/master/nodejs/eks/nodegroup.ts#L1313 so likely the value of `ng_tags`
p
I can share a bit more code, but I don't think tags are the issue. Several other instance groups get set up without an issue.
        ng_zones_list = list(map(lambda z: [z], node.get('node_zones', [])))
        ng_multizones_lists = node.get('multizone_nodegroup_zones', [])
        is_multizone = bool(ng_multizones_lists)
        # if specified, use multizone_nodegroup_zones, else use node_zones (with 1 zone per nodegroup)
        for node_zone_list in (ng_multizones_lists if is_multizone else ng_zones_list):
            # include first subnet name in nodegroup name (even if multizone)
            zone = node_zone_list[0]
            sn_name = subnet_names[zone]
            ng_name = apply_name_overrides(f"{node['group_name']}{'-mz' if is_multizone else ''}-{sn_name}")
            ng_subnet_ids = list(map(lambda z: subnet_ids[z], node_zone_list))
            ng_tags = cluster.eks_cluster.name.apply(lambda cname: {
                "k8s.io/cluster-autoscaler/enabled": "true",
                f"k8s.io/cluster-autoscaler/{cname}": "true",
                **tags,
                **({'Name': ng_name} if tags else {})
            })

            default_capacity_type = 'spot' if node.get('capacity_type') == 'SPOT' else 'ondemand'
            node_labels[CAPACITY_TYPE_LABEL] = node_labels.get(CAPACITY_TYPE_LABEL, default_capacity_type)

            eks.ManagedNodeGroup(
                ng_name,
                node_role_arn=ec2_role.arn,
                cluster=cluster.core,
                ami_type='CUSTOM' if ami_id else None,
                # Instance type will be specified in the launch template
                # instance_types=[node_type],
                scaling_config=aws.eks.NodeGroupScalingConfigArgs(
                    desired_size=max(node['min_node_count'], 1),
                    min_size=node['min_node_count'],
                    max_size=node['max_node_count']),
                subnet_ids=ng_subnet_ids,
                # The node version determines the AMI id, if AMI id already specified no need for node version
                version=node_version if not ami_id else None,
                tags=ng_tags,
                labels=node_labels,
                launch_template=aws.eks.NodeGroupLaunchTemplateArgs(
                    id=template.id,
                    version=template.latest_version,
                ),
                capacity_type=node.get('capacity_type', 'ON_DEMAND'),
                taints=[
                    aws.eks.NodeGroupTaintArgs(effect=taint.get('effect'), key=taint.get('key'),
                                               value=taint.get('value'))
                    for taint in node.get('taints', [])
                ],
                opts=ResourceOptions(ignore_changes=["scalingConfig.desiredSize"]),
            )
I'm not able to run a debugger super easily but I can println debug.
Some additional context: That nodegroup got created and is healthy in AWS. I think this is a Pulumi bug.
b
More than likely. If you have a reliable repro, it’d be great if you could file an issue
p
Are there any possible workarounds you could recommend?
I checked and this appears to be the relevant call. https://www.npmjs.com/package/@pulumi/eks?activeTab=code
    // Check that the nodegroup role has been set on the cluster to
    // ensure that the aws-auth configmap was properly formed.
    const nodegroupRole = pulumi.all([core.instanceRoles, roleArn]).apply(([roles, rArn]) => {
        // Map out the ARNs of all of the instanceRoles.
        const roleArns = roles.map((role) => {
            return role.arn;
        });
        // Try finding the nodeRole in the ARNs array.
        return pulumi.all([roleArns, rArn]).apply(([arns, arn]) => {
            return arns.find((a) => a === arn);
        });
    });
Some more context: running `pulumi refresh` also reproduces this.
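To make the failure mode in that snippet concrete, here is a plain-Python model of the same check. It is not the library's code. It assumes the error comes from `roles` being undefined when `roles.map(...)` runs, which matches the `reading 'map'` message but has not been confirmed in this thread. The function name `find_nodegroup_role` is hypothetical.

```python
from typing import Optional, Sequence


def find_nodegroup_role(
    instance_role_arns: Optional[Sequence[str]], role_arn: str
) -> Optional[str]:
    """Return role_arn if it is among the cluster's instance role ARNs, else None."""
    if instance_role_arns is None:
        # Assumed equivalent of the JavaScript case where `roles` is undefined
        # and `roles.map(...)` throws.
        raise ValueError("cluster core has no instance roles set")
    # Equivalent of `arns.find((a) => a === arn)`: first match, or None.
    return next((arn for arn in instance_role_arns if arn == role_arn), None)
```

A guard like the `None` check above turns the opaque `TypeError` into a message that names the missing input, which is the kind of diagnostic that would have pointed at the instance roles rather than at `tags`.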