# general
c
Any advances coming up to reduce the amount of time it takes to update a stack with around 50-150 resources?
g
Are the updates updating that many resources? Or just a handful of resources in a large stack?
c
No, 1 or 2 resources are different.
Talking about ~10 minutes for a preview?
Same amount of time for the update
For example, right now, I’m waiting for the update to finish even though it’s already made all of the changes.
As it’s still checking every other resource
g
Can you run with tracing a couple times (https://pulumi-community.slack.com/archives/C84L4E3N1/p1570031417208300) and provide the trace files to us?
👍🏽 1
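For reference, a minimal sketch of the tracing invocation, assuming the file: form of the --tracing flag shown in the linked thread and a placeholder stack name; the same flag works for both preview and up:
pulumi preview -s myorg/mystack --tracing=file:./preview.trace
pulumi up -s myorg/mystack --tracing=file:./update.trace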
c
Sure. I’ll get it to you later or tomorrow. I’m about to head out for a few minutes.
It’d be nice to fix, but not a blocker in any way. Thanks.
c
There isn’t any specific fix in the pipeline, but we will devote more engineering cycles in the next few weeks specifically for performance work. I would expect it to “take longer than a stack with 5 - 15 resources”, but not 10m. Can you share with me a prototypical stack update that is too slow? e.g. https://app.pulumi.com/nesl247/website-proj/production/45 With that I can look at the update logs for the specific update and see what is the biggest culprit server-side. But like Cameron suggested, having the trace file from your client would give us an even better picture.
c
Install the istio helm chart for an example - at least, that’s 95% of what our stack has. We have like 2-3 other resources in it for istio.
But I will give you a link in a minute or two when it’s done
I’m not sure what the duration is based on (updating or preview + updating), but this one took 9m23s
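A minimal sketch of such a stack (the Istio Helm chart plus a couple of supporting resources), assuming the Node.js SDK and the helm.v2.Chart API that pulumi-kubernetes exposed at the time; the chart version and repo URL are placeholders:

import * as k8s from "@pulumi/kubernetes";

// Namespace for the Istio control plane (one of the few non-chart resources).
const ns = new k8s.core.v1.Namespace("istio-system", {
    metadata: { name: "istio-system" },
});

// The Helm chart accounts for the vast majority of the stack's resources.
const istio = new k8s.helm.v2.Chart("istio", {
    chart: "istio",
    version: "1.3.0",                                 // placeholder version
    fetchOpts: { repo: "https://example.com/istio/charts" }, // placeholder repo
    namespace: ns.metadata.name,
});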
c
Perfect. Please do take a look at creating a trace file and sending it to us, but I’ll take a peek at that particular update and see if anything stands out as problematic.
c
I’ll DM it to you when it’s done
🙏🏽 1
Why did pulumi preview -s LinioIT/support --tracing=file:/out.trace not work?
Ah crap, forgot the .
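Presumably the corrected invocation, with the missing . restored in the trace file path:
pulumi preview -s LinioIT/support --tracing=file:./out.trace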
b
@handsome-airport-56801 might be something useful in this thread
w
Left some notes on this at https://github.com/pulumi/pulumi/issues/3257#issuecomment-538501110. We're still investigating the issues identified there - but the root issue appears to be just that the Diff and Check calls to Kubernetes are (a) taking more time than expected and (b) being done serially. We should be able to improve on both of these.
c
Awesome news. Do you think it’ll be a while before improvements are seen, or is this something that is being prioritized?
w
Performance is a priority - and depending on what the ultimate root causes are, we aim to do work here as soon as possible. One question for you - do you know what performance is like for kubectl get operations against your cluster? The fact that all these operations are taking multiple seconds suggests there may be a fundamental issue related to a slow connection to Kubernetes that is accentuating the issue here. (Some of the follow-ups here will hopefully be able to mitigate that - but this could be a contributing factor).
c
It took 2.72 seconds to do a kubectl get namespaces
If you have anything specific you’d like for me to try, I’m happy to
I will say, part of the problem will be our current network structure.
We’re migrating from AWS to GCP, and our client VPN is in AWS right now.
So we have Client -> AWS -> Site to Site VPN -> GKE
w
Hmm - yeah - that's in line with what this data shows. That seems pretty slow! It's 330ms for me on a test cluster.
c
I will say it’s still a bit slower than I’d like when running from CI within GCP, but definitely better.
w
Actually - trying against a GKE cluster from my local laptop - I'm seeing:
$ time kubectl get namespaces
NAME              STATUS   AGE
default           Active   23d
kube-node-lease   Active   23d
kube-public       Active   23d
kube-system       Active   23d
myns-mn0bllwq     Active   16d

real	0m0.132s
user	0m0.087s
sys	0m0.019s
That's an order of magnitude faster - which I think is related to why this update time is quite so bad in your case. That said, there are more things Pulumi can do to mitigate here.
c
We’re only temporarily like this, but it’s been several months and will probably be at least another month, so it would be awesome to see changes in the near future.
g
https://github.com/pulumi/pulumi-kubernetes/releases/tag/v1.2.2 includes a couple performance improvements that should help
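Picking up the release is just a dependency bump - for example, assuming a Node.js project using the npm package:
npm install @pulumi/kubernetes@1.2.2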
c
Oh cool. I also saw it got rid of the annoying annotation.
g
Yep, sorry about that 🙂