# general
j
Hi all, we have a GCP project with around 190 resources (mostly k8s deployments and services) that takes around 10-15 minutes to deploy: it takes around 1 minute to get the confirmation prompt, then it shows no updates for about 10 minutes, and then it starts creating/updating resources. Is there anything that can be done (other than skipping checkpoints)? I was thinking about splitting the project into smaller subprojects, but really, I wouldn't expect 190 resources to be this slow to update. Any tips would be welcome!
d
What version of the k8s provider are you using? V4 changed to server-side apply by default, plus other changes that improve performance massively
For my stack (~100 k8s resources), we went from 3-5 minute deploys to ~10-15s
s
how can I find out what version I am using?
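For reference, assuming the standard Pulumi CLI and the usual package managers, the installed provider plugin and SDK versions can be listed like this (shown as a sketch; it needs a Pulumi installation and project to run):

```shell
# List the provider plugins Pulumi has installed, with their versions:
pulumi plugin ls

# Or check the SDK package directly, depending on your language runtime:
pip show pulumi-kubernetes        # Python
npm ls @pulumi/kubernetes         # Node.js
```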
j
I tried upgrading to v4, but it looks more or less the same
d
Can you check if `server_side_apply` is set in the config or on the provider?
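One way to check, sketched under the assumption that the v4 provider option is named `enableServerSideApply` (as documented for pulumi-kubernetes) and can be set via stack config or an environment variable:

```shell
# Check whether server-side apply is set explicitly in the stack config
# (errors if the key is unset, which is fine -- unset means the default applies):
pulumi config get kubernetes:enableServerSideApply

# The provider also honors an environment variable override:
echo "${PULUMI_K8S_ENABLE_SERVER_SIDE_APPLY:-<unset>}"
```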
j
@dry-keyboard-94795 isn't this the default behaviour?
d
Yes. Just checking if you happen to have it set explicitly to `false`.
I can't think of anything that would cause slowness at this scale. Are you using Pulumi Cloud or a different state backend, like s3?
j
it's Pulumi Cloud. I thought network latency might be part of the reason, but I don't know if it's still doing continuous back and forth with the Pulumi server. I'm in Italy and I see slightly better results (nothing major though) if I do it from a VPS in Germany.
it might have to do with my internet connection at home being a bit wonky and the one at the VPS being obviously much better
d
It might be worth running a trace to see if anything stands out. There's been a couple of others reporting that pulumi cloud has been a bit slow lately. Is the slowness a recent development?
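A sketch of capturing such a trace with the CLI's `--tracing` flag (the file path is just an example, and this needs a real stack to run against):

```shell
# Write a trace of the whole update to a local file:
pulumi up --tracing=file:./up.trace

# Inspect the trace afterwards in a local viewer:
pulumi view-trace ./up.trace
```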
j
no, it's always been there, but we recently grew a little bit sick of it
what should I look for in the trace?
d
Just checking regarding the V4 upgrade. Provider settings don't always change on the first run, they get set during the update. On subsequent runs after the upgrade, do you still get slowness?
> what should I look for in the trace?
I'm actually unsure on this, and don't have access to a sizeable stack to offer guidance. With the slowness you're experiencing, I'd expect something obvious to stick out though. Perhaps one of the pulumi devs can step in to help here? (cc channel)
p
I think I may have something related to this thread. My stack is not as big as 190 resources, but it takes a considerable amount of time to apply. We've been wondering if it was caused by Pulumi Cloud, and decided to test three different scenarios: the state in Pulumi Cloud, in S3 with passphrase encryption, and in S3 with HashiCorp Vault as the secrets provider. It appears Pulumi Cloud struggles when decrypting the secrets. Here are the results of my tests. Any idea on what may be going on here? edit: the command used to gather this data is a simple `pulumi stack export`.
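A minimal way to reproduce that measurement, assuming the stack under test is already selected; repeat it once per backend and compare wall time:

```shell
# Time a state export against the currently selected backend,
# discarding the exported JSON itself:
time pulumi stack export > /dev/null
```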
j
This looks promising! And we have more than 100 secrets, so I don't know if they add up... will look into testing with a different secrets provider, thanks!
e
oh this is interesting, and very unexpected. Can you raise an issue about this at github.com/pulumi/pulumi/issues? We should definitely take a look into this.
p
@jolly-window-25842 It does add up. I just tested it with the biggest stack we have over here (1.8k resources, 3.7k secrets) and it takes ~17 min to fetch when `--with-secrets` is set and 4.6 s when not set.
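Those numbers are consistent with one decryption round trip per secret. A back-of-the-envelope check, where the ~270 ms per-secret cost is an assumed figure, not a measured one:

```shell
# 3700 secrets at an assumed ~270 ms round trip each:
echo "$(( 3700 * 270 / 1000 )) seconds"   # prints "999 seconds", i.e. ~17 minutes
```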
e
Thanks, we'll take a look into this.
j
Wow! I just tried changing the secrets provider to gcpkms and deploying now takes like 5-6 seconds plus the actual time to update the resources...what a change! thanks @plain-businessperson-30883!
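For anyone wanting to try the same switch, a sketch using `pulumi stack change-secrets-provider`; the project, key ring, and key names below are placeholders, not values from this thread:

```shell
# Re-encrypt the stack's secrets with a Google Cloud KMS key
# (replace MY_PROJECT / MY_RING / MY_KEY with your own resources):
pulumi stack change-secrets-provider \
  "gcpkms://projects/MY_PROJECT/locations/global/keyRings/MY_RING/cryptoKeys/MY_KEY"
```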
e
Hey all, I posted on that issue, but just to close the loop back here: we found some cases where `--show-secrets` was doing way more work than was actually required, and that's been fixed in the most recent release of the CLI. I'd be very interested in hearing what numbers you get for your experiments if you run using that latest version.
p
@echoing-dinner-19531 I saw your comment in the email notification but I could not test it yet. I'll re-run my tests asap.
d
👀 @plain-businessperson-30883 I'll pretend it was just you mashing the keyboard :)
p
@echoing-dinner-19531 I just sent an updated benchmark in the GitHub issue.