busy-pizza-73563

04/25/2019, 12:59 PM
Any reason why the EKS security group that manages the communication between the k8s API server and the node instances only allows ports 1025-65535? https://github.com/pulumi/pulumi-eks/blob/master/nodejs/eks/securitygroup.ts#L48
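For reference, the rule in question boils down to roughly the following (a sketch only; the placeholder security groups, resource names, and description here are illustrative, not the exact ones `@pulumi/eks` creates):

```typescript
import * as aws from "@pulumi/aws";

// Placeholders standing in for the security groups that @pulumi/eks creates.
const eksClusterSecurityGroup = new aws.ec2.SecurityGroup("eksClusterSecurityGroup", {});
const nodeSecurityGroup = new aws.ec2.SecurityGroup("nodeSecurityGroup", {});

// Nodes only accept control-plane traffic on the non-privileged port range.
const eksClusterIngressRule = new aws.ec2.SecurityGroupRule("eksClusterIngressRule", {
    description: "Allow pods and kubelets to receive communication from the cluster control plane",
    type: "ingress",
    fromPort: 1025,
    toPort: 65535,
    protocol: "tcp",
    securityGroupId: nodeSecurityGroup.id,             // the node security group
    sourceSecurityGroupId: eksClusterSecurityGroup.id, // the EKS control plane's security group
});
```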

white-balloon-205

04/25/2019, 1:47 PM
This was originally derived from the EKS documentation and NodeGroup CloudFormation templates: https://github.com/awslabs/amazon-eks-ami/blob/master/amazon-eks-nodegroup.yaml#L228. It does actually look like that upstream template added port `443` as well, which `@pulumi/eks` does not currently enable. Does the service you are interested in really only work over port `80`, or could it be exposed on port `443` if we fixed this to match the EKS-recommended ingress/egress? I am not sure of the underlying reason for constraining access to other lower port numbers from/to the control plane. cc @breezy-hamburger-69619 in case he has thoughts on this?

busy-pizza-73563

04/25/2019, 1:54 PM
Well, that monitoring stack is completely managed by rancher, so I don't think that port can be easily changed.
But I really don't see any reason for not being able to, e.g., `kubectl proxy` to services with ports lower than 1025... 🤷
I mean, I know that ports <= 1024 are "privileged", but still... 🙂

breezy-hamburger-69619

04/25/2019, 5:06 PM
We enable all ports that are recommended in the EKS docs: https://docs.aws.amazon.com/eks/latest/userguide/sec-group-reqs.html, 1025-65535 being a range listed there. Not enabling ports <= 1024 is more about enforcing least privilege where possible. To take a step back though, the control plane subnet and the worker subnet should only be used for k8s cluster communications. IIUC your rancher monitoring needs, it should instead be running in-cluster as a DaemonSet or similar and using the internal cluster networking. If you still want to go ahead and get around these limits, you can always pass your own `nodeSecurityGroup` and `eksClusterIngressRule` into the `NodeGroup`: https://github.com/pulumi/pulumi-eks/blob/master/nodejs/eks/nodegroup.ts#L239

busy-pizza-73563

04/25/2019, 6:48 PM
@breezy-hamburger-69619 Thanks for the explanation!
> To take a step back though, the control plane subnet and the worker subnet should only be used for k8s cluster communications
As I said above, `kubectl proxy` (and the k8s proxy API subsystem in general) needs connections from the control plane to a pod to be allowed. Just for argument's sake, I really don't see why you wouldn't allow proxying to ports lower than 1025.

creamy-potato-29402

04/25/2019, 8:11 PM
I actually agree with @breezy-hamburger-69619 here. We want the EKS package to produce clusters that are “prod-first”, with good security defaults. Generally I think it’s a good idea to add a bit of friction to things like exposing ports <= 1024, because it makes people work intentionally. Especially if the work-around is as simple as supplying your own security group.

white-balloon-205

04/25/2019, 8:14 PM
> Especially if it is as simple as supplying your own security group.
Even better, in the release coming out today, it is possible to just add an additional ingress rule to the existing security group to allow this specific access pattern.
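For example, something along these lines should then be possible (a sketch, assuming the `eks.Cluster` component surfaces the `nodeSecurityGroup` mentioned later in this thread plus a `clusterSecurityGroup` for the control plane; those property names are assumptions):

```typescript
import * as aws from "@pulumi/aws";
import * as eks from "@pulumi/eks";

const cluster = new eks.Cluster("cluster");

// Additional rule on the existing node security group: let the control plane
// reach nodes/pods on port 80, e.g. so the API server can proxy to a service
// whose endpoints listen on :80.
const proxyIngress = new aws.ec2.SecurityGroupRule("controlPlaneToPort80", {
    type: "ingress",
    fromPort: 80,
    toPort: 80,
    protocol: "tcp",
    securityGroupId: cluster.nodeSecurityGroup.id,
    sourceSecurityGroupId: cluster.clusterSecurityGroup.id,
});
```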

creamy-potato-29402

04/25/2019, 8:14 PM
ah that’s great too.

busy-pizza-73563

04/25/2019, 8:15 PM
@creamy-potato-29402 I (partly) agree. But coming from GKE and bare-metal clusters, it wasn't immediately apparent to me what the issue was, and I "wasted" half an hour trying to find it. 🙂

creamy-potato-29402

04/25/2019, 8:15 PM
mmm

breezy-hamburger-69619

04/25/2019, 8:16 PM
@white-balloon-205 we moved to separate security group rules, but we haven't yet added the ability to provide user rules and merge them in. That's being tracked in https://github.com/pulumi/pulumi-eks/issues/97, which @busy-pizza-73563 opened.

creamy-potato-29402

04/25/2019, 8:16 PM
I do think that’s a usability problem.
but I’m not sure how to do better.

busy-pizza-73563

04/25/2019, 8:17 PM
Well, except for allowing everything over TCP, I don't have a good suggestion either.
I still think the k8s baseline is to allow proxying to everything.

creamy-potato-29402

04/25/2019, 8:19 PM
what are you trying to proxy, now?

white-balloon-205

04/25/2019, 8:19 PM
@breezy-hamburger-69619 the resulting security group is exposed to the user, so they can just add additional ingress rules, right?

creamy-potato-29402

04/25/2019, 8:19 PM
I understand there’s something with rancher, but I’m not super familiar.

breezy-hamburger-69619

04/25/2019, 8:22 PM
That’s correct @white-balloon-205. @busy-pizza-73563 you have the `nodeSecurityGroup` available to you, so you can create separate secgroup rules to open what you need using its id.

busy-pizza-73563

04/25/2019, 8:24 PM
@creamy-potato-29402 When using rancher's integrated monitoring stack, grafana should be accessible at <https://rancher.url/k8s/clusters/c-12345/api/v1/namespaces/cattle-prometheus/services/http:access-grafana:80/proxy/>, which in turn proxies to <https://k8s.url:port/api/v1/namespaces/cattle-prometheus/services/http:access-grafana:80/proxy/>.
And the `access-grafana` service points to `:80` inside the corresponding pod(s).

creamy-potato-29402

04/25/2019, 8:25 PM
and you can’t run this in-cluster?

busy-pizza-73563

04/25/2019, 8:26 PM
It doesn't really matter, as long as "vanilla" k8s proxying isn't working.

creamy-potato-29402

04/25/2019, 8:27 PM
I see. And rancher really does not let you change the port??

busy-pizza-73563

04/25/2019, 8:27 PM
The issue is that the control plane tries to connect to the grafana pod's `IP:80`.
No, but I don't really see why they would.

creamy-potato-29402

04/25/2019, 8:29 PM
Sorry — isn’t that in the kube overlay network though?
I’m super confused.
If you’re running inside the cluster, you should be able to access those ports, I think? Am I missing something?

busy-pizza-73563

04/25/2019, 8:32 PM
The control plane is EKS-managed, and communication between it and the node instances honors the `${name}-nodeSecurityGroup`, see https://github.com/pulumi/pulumi-eks/blob/master/nodejs/eks/securitygroup.ts#L57.
> I’m super confused.
I was the same during the half hour I spent looking for the issue. 🙂

creamy-potato-29402

04/25/2019, 8:34 PM
sorry, still confused — the SG does not disallow ports in the kube overlay network, though, right?

busy-pizza-73563

04/25/2019, 8:35 PM
The SG only allows control plane -> node instance/pod `IP:1025-65535`.
But that service points to pod `IP:80`.

creamy-potato-29402

04/25/2019, 8:36 PM
right, but what I’m asking is: that port is not the “real” port allocated by the overlay network, is it?

busy-pizza-73563

04/25/2019, 8:37 PM
Well, afaik the way k8s proxy works is that it looks at the service's endpoints and connects (randomly?) to one of them.

creamy-potato-29402

04/25/2019, 8:38 PM
I forget this part of the kube networking. I thought the port was mapped to some other port. could be wrong.

busy-pizza-73563

04/25/2019, 8:38 PM
So if svc `access-grafana:80` has endpoint `grafanaPodIP:80`, then when proxying to the svc the API server will try to connect to that pod.
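For illustration, the service in question is shaped roughly like this (a sketch; rancher creates the real object, and the `app: grafana` selector is an assumption):

```typescript
import * as k8s from "@pulumi/kubernetes";

// Hypothetical re-creation of rancher's access-grafana service: port 80 on the
// service maps to port 80 in the grafana pod, so the proxy path
// /api/v1/namespaces/cattle-prometheus/services/http:access-grafana:80/proxy/
// ends up dialing grafanaPodIP:80 from the API server.
const accessGrafana = new k8s.core.v1.Service("access-grafana", {
    metadata: {
        name: "access-grafana",
        namespace: "cattle-prometheus",
    },
    spec: {
        selector: { app: "grafana" },                        // assumed label
        ports: [{ name: "http", port: 80, targetPort: 80 }], // svc :80 -> pod :80
    },
});
```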

creamy-potato-29402

04/25/2019, 8:38 PM
@breezy-hamburger-69619 is that true? Or is the IP address the only thing that’s faked by the overlay network?

busy-pizza-73563

04/25/2019, 8:39 PM
In-cluster each service has a cluster IP, true. But that's not how proxy works, afaik. But I might be wrong. 🙂

breezy-hamburger-69619

04/25/2019, 8:39 PM
I don't remember the kubectl proxy internals, but the grafana pod IP is in the overlay networking space, so these secgroup rules should not apply… unless there is some overarching re-proxying to a cluster port, but that doesn't ring a bell.

creamy-potato-29402

04/25/2019, 8:40 PM
that’s what I think as well.
again — could be wrong

busy-pizza-73563

04/25/2019, 8:40 PM
Well, then how would you explain that if I change 1025 to 80 everything starts working? 🙂

breezy-hamburger-69619

04/25/2019, 8:41 PM
In fact, other distros limit 1025-65535 even further, to only the absolutely necessary ports between the control plane and workers. I'm actually surprised AWS suggests opening the secgroup this widely.

creamy-potato-29402

04/25/2019, 8:41 PM
is it internal or external?

busy-pizza-73563

04/25/2019, 8:42 PM
Oh, another thing, it was in the rancher issue. When trying to access `/api/v1/.../services/http:access-grafana:80/proxy/` I got:
```
Error: 'dial tcp a.b.c.d:80: connect: connection timed out'
Trying to reach: '<http://a.b.c.d:80/>'
```
where `a.b.c.d` is the grafana pod IP.
@breezy-hamburger-69619 Which other distros? I didn't get that with either GKE or `kubeadm` bare-metal clusters.

breezy-hamburger-69619

04/25/2019, 8:44 PM
CoreOS Tectonic, which was an open-source k8s distro I worked on back in the day. I haven't looked into what GKE is doing; that's prob a good next step.

creamy-potato-29402

04/25/2019, 8:45 PM
I still think it’s the right move to lock down the SG right now — I’m just trying to understand why this doesn’t work.
If it’s cluster-internal, I believe this should “just work”

busy-pizza-73563

04/25/2019, 8:45 PM
Exactly! 😄

busy-pizza-73563

04/25/2019, 8:47 PM
I have no issue with locking things down, but if this deviates from the baseline, it should be (somehow) documented.

creamy-potato-29402

04/25/2019, 8:47 PM
100% agree. But my question is: does this work from inside the cluster?

busy-pizza-73563

04/25/2019, 8:48 PM
Does what work? Proxying only makes sense from outside the cluster, right?

creamy-potato-29402

04/25/2019, 8:49 PM
you can proxy in the cluster, and then you would be on the kube overlay network, where the SGs should have no effect.
That’s what I think anyway.

busy-pizza-73563

04/25/2019, 8:49 PM
It still goes through the API server, so I see no reason why it would work.
Ok, found this: https://docs.aws.amazon.com/eks/latest/userguide/sec-group-reqs.html
> *Note*
> To allow proxy functionality on privileged ports or to run the CNCF conformance tests yourself, you must edit the security groups for your control plane and the worker nodes. The security group on the worker nodes' side needs to allow inbound access for ports 0-65535 from the control plane, and the control plane side needs to allow outbound access to the worker nodes on ports 0-65535.
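Translated into Pulumi resources, that note amounts to a pair of rules along these lines (a sketch, again assuming the `nodeSecurityGroup`/`clusterSecurityGroup` properties on the cluster component):

```typescript
import * as aws from "@pulumi/aws";
import * as eks from "@pulumi/eks";

const cluster = new eks.Cluster("cluster");

// Worker side: allow inbound 0-65535 from the control plane.
const nodesFromControlPlane = new aws.ec2.SecurityGroupRule("nodesFromControlPlane", {
    type: "ingress",
    fromPort: 0,
    toPort: 65535,
    protocol: "tcp",
    securityGroupId: cluster.nodeSecurityGroup.id,
    sourceSecurityGroupId: cluster.clusterSecurityGroup.id,
});

// Control plane side: allow outbound 0-65535 to the workers. For egress rules,
// sourceSecurityGroupId denotes the destination security group.
const controlPlaneToNodes = new aws.ec2.SecurityGroupRule("controlPlaneToNodes", {
    type: "egress",
    fromPort: 0,
    toPort: 65535,
    protocol: "tcp",
    securityGroupId: cluster.clusterSecurityGroup.id,
    sourceSecurityGroupId: cluster.nodeSecurityGroup.id,
});
```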

creamy-potato-29402

04/25/2019, 8:53 PM
ah
I see.
you are correct, then.

busy-pizza-73563

04/25/2019, 8:55 PM
Well, seems like an upstream "issue"/consideration, then. 🙂

breezy-hamburger-69619

04/25/2019, 8:59 PM
Your workarounds are:
1. Provide your own `nodeSecurityGroup` for the `NodeGroup`, which you can then configure entirely yourself.
2. Get the id of the `nodeSecurityGroup` and build new secgroup rules against it, though this is a step that happens after the secgroup and cluster are created.
3. Take a stab at https://github.com/pulumi/pulumi-eks/issues/97 and we can review and guide you through it if needed. Given that the secgroups and secgroup rules are now separated [1], this should be a bit more straightforward to implement.

[1] https://github.com/pulumi/pulumi-eks/pull/109

creamy-potato-29402

04/25/2019, 9:04 PM
indeed

busy-pizza-73563

04/25/2019, 9:04 PM
@breezy-hamburger-69619 For now the workaround was to edit that rule from the AWS console and change that 1025 to 80. 🙂

breezy-hamburger-69619

04/25/2019, 9:05 PM
Fair enough, but note this will create a mismatch of state between AWS and pulumi 🙂

busy-pizza-73563

04/25/2019, 9:05 PM
Yeah, I know, but I documented it. 🙂
I'll try to take a stab at #97 when I find a bit of time.
(Actually I opened that issue because of a completely different use case: opening 22/tcp from the internet to the node instances, which is the other SG 🙂)
Anyway, thanks all for your time!