# kubernetes
Hello everyone, I’m having an issue with DNS resolution while setting up a pretty standard Kubernetes cluster on EKS. I’ve followed multiple guides and troubleshooting guides and spent many hours torturing my head over this. My pods running on any worker node other than the default one I created while creating the cluster can’t resolve DNS. The “default” worker hosts the default deployment of 2 replicas of CoreDNS. Targeting my deployment onto this “default” worker node makes the deployment succeed. The issue is likely related to network communication between the different worker nodes and CoreDNS. Did I miss anything? It seems like a very basic setup that I want to run.
Here is the code used to create the vpc/cluster
private createCustomVpc(): awsx.ec2.Vpc {
		return new awsx.ec2.Vpc(`${this.orgName}-eks-vpc`, {
			enableDnsSupport: true,
			enableDnsHostnames: true,
			cidrBlock: this.vpcNetworkCidr,
		});
	}

	private createEKSCluster(): eks.Cluster {
		return new eks.Cluster(this.clusterName, {
			name: this.clusterName,
			version: '1.27',
			tags: {
				Project: 'k8s-eks-cluster',
				Org: `${this.orgName}`,
			},
			createOidcProvider: true,
			clusterSecurityGroupTags: { ClusterSecurityGroupTag: 'true' },
			nodeSecurityGroupTags: { NodeSecurityGroupTag: 'true' },
			skipDefaultNodeGroup: true,
			vpcId: this.eksVpc.vpcId,
			enabledClusterLogTypes: ['api', 'audit', 'authenticator', 'controllerManager', 'scheduler'],
			instanceRoles: [this.eksNodeRole],
			roleMappings: [],
			publicSubnetIds: this.eksVpc.publicSubnetIds,
			privateSubnetIds: this.eksVpc.privateSubnetIds,
			nodeAssociatePublicIpAddress: false,
		});
	}
And the code used to create the “non-working” worker node that can’t resolve dns.
const nodeGroup = new eks.NodeGroup(
			'public-api-nodegroup', // name argument was cut off in the original paste
			{
				version: '1.27',
				cluster: params.cluster,
				instanceType: 't2.medium',
				nodeAssociatePublicIpAddress: false,
				desiredCapacity: 1,
				minSize: 1,
				maxSize: 10,
				labels: {
					name: 'public-api-nodegroup-alpha',
					application: `public-api-${params.envName}`,
					env: params.envName,
				},
				instanceProfile: params.eksNodeInstanceProfile,
			},
			{ providers: { kubernetes: params.cluster.provider } },
		);
@tall-lion-84030 I don’t see any subnets for your nodegroup, you need to pass node subnet ids
@billowy-army-68599 I thought that setting the subnets on the eks.Cluster would do the job and use the available private subnets by default? Should I manually pass the private subnets in the nodegroup?
attaching the nodes to the cluster doesn’t determine which subnet it’s in, no. You need to use `nodeSubnetIds` to specify the subnet ids: https://www.pulumi.com/registry/packages/eks/api-docs/nodegroup/#nodesubnetids_go
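For reference, a sketch of that change, assuming the same hypothetical `params` object and node group from the snippet above; `nodeSubnetIds` is the option from the linked docs, everything else just mirrors the original:

```typescript
// Sketch only: pin the node group's instances to the VPC's private
// subnets via nodeSubnetIds, so they share routing with the rest of
// the cluster (and can reach CoreDNS on the other nodes).
// `params.vpc` is an assumed handle to the awsx VPC created earlier.
const nodeGroup = new eks.NodeGroup(
	'public-api-nodegroup',
	{
		cluster: params.cluster,
		version: '1.27',
		instanceType: 't2.medium',
		nodeSubnetIds: params.vpc.privateSubnetIds, // <- the missing piece
		nodeAssociatePublicIpAddress: false,
		desiredCapacity: 1,
		minSize: 1,
		maxSize: 10,
		instanceProfile: params.eksNodeInstanceProfile,
	},
	{ providers: { kubernetes: params.cluster.provider } },
);
```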
Do you know which subnet ids to use?
private subnet ids
Passed the private subnet ids from the VPC it’s in, still can’t resolve DNS in the pod. Didn’t change anything other than the code presented, which I followed from https://www.pulumi.com/docs/clouds/kubernetes/guides/playbooks/
can you elaborate on what you mean by resolve DNS in the pod? what are you trying to look up? What DNS server are you using? Is CoreDNS running?
My pod is a basic Node.js application that needs some secrets hosted in AWS Secrets Manager. In our use case, we have an entrypoint script that fetches the secrets using the AWS CLI. The pod fails to launch with logs such as
Could not connect to the endpoint URL: "https://secretsmanager.eu-west-3.amazonaws.com/"
The 2 default replicas of the CoreDNS deployment are running and logging issues such as
[INFO] - 33767 "A IN secretsmanager.eu-west-3.amazonaws.com.public-api-alpha-da393fe3.svc.cluster.local. udp 100 false 512" NXDOMAIN qr,aa,rd 193 0.000230575s
I have set up and double-checked the resources for the AWS IRSA logic. Everything is created and linked correctly. I don’t know how to solve this issue I’ve been stuck on for 2 days.
That isn’t a valid DNS name though, right? Does it ever query further up the tree for just the standard amazonaws.com?
i know, but shouldn’t the DNS resolver try to resolve the original name after going through the search domains it gets from resolv.conf? see ->
search public-api-alpha-da393fe3.svc.cluster.local svc.cluster.local cluster.local eu-west-3.compute.internal
as the 1st line of resolv.conf
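The behaviour being discussed (search-list expansion, with the absolute name tried only after the search suffixes fail) can be sketched as below. This is an illustrative simulation, not resolver code; the ndots value of 5 is the kubelet default for pods, and the search list is the one from the resolv.conf shown above:

```typescript
// Simulates how a resolver expands a query using the resolv.conf
// "search" list when the name has fewer dots than "ndots".
function candidateQueries(name: string, searchList: string[], ndots = 5): string[] {
    if (name.endsWith(".")) {
        return [name]; // fully qualified (trailing dot): queried as-is, no expansion
    }
    const dots = (name.match(/\./g) ?? []).length;
    const expanded = searchList.map((domain) => `${name}.${domain}`);
    // With fewer dots than ndots, the search domains are tried FIRST,
    // and the name as typed is only tried after every suffix fails.
    return dots >= ndots ? [name, ...expanded] : [...expanded, name];
}

const search = [
    "public-api-alpha-da393fe3.svc.cluster.local",
    "svc.cluster.local",
    "cluster.local",
    "eu-west-3.compute.internal",
];

// "secretsmanager.eu-west-3.amazonaws.com" has 3 dots < ndots:5, so
// CoreDNS first sees the cluster.local variants (the NXDOMAIN entries
// in the log above) before the absolute name is ever tried.
console.log(candidateQueries("secretsmanager.eu-west-3.amazonaws.com", search));
```

So the NXDOMAIN lines are expected; the real question is what happens to the final, unsuffixed query.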