Running `pulumi up` has become incredibly slow, it...
# typescript
j
Running `pulumi up` has become incredibly slow. It was always slow, but now it takes around 5-6 minutes to get the preview and another 2-3 minutes to update even a single resource: it hangs for 2 minutes and then starts updating. It's a TypeScript GCP stack with around 150 resources and about 20-30 secrets. Any idea why this might be and/or suggestions to improve it? Thanks!
s
You could consider splitting it into separate Pulumi projects (using stack references to share information across projects as needed). This would allow you to decouple resources that need to be updated more frequently from those that do not need to be updated as often.
j
That's certainly an idea, thanks! But is it normal for it to be so slow?
w
To get a better idea why it's slow, consider running with `--tracing`: https://www.pulumi.com/docs/support/troubleshooting/#tracing
RE: Updating a single resource, it very much depends on the resource! For example, DigitalOcean Kubernetes clusters take about 5-7 minutes to start up. Pulumi waits for the cluster to enter a ready state before returning (to confirm the cluster started successfully). Other resources might come up a lot faster. For example, GitHub Repos have much lower latency. It really depends on the cloud provider, and what the cloud provider is doing behind the scenes. I would expect resources like compute instances, and managed databases to take minutes to come up. I would expect resources like S3 buckets, SQS queues, and other "mostly logical" resources to become available within seconds.
s
Wondering how large your statefile is. IIRC `pulumi preview|up` was slow for me at some point a couple years ago, and it was because I had a bunch of K8s CRDs in the state.
j
it's a GCP project, mostly involving GKE plus some other resources (like pubsub topics/subscription). When updating a single deployment (new spec because of a new image version), it takes a few minutes for the preview, then I confirm and it takes 2-3 minutes of nothing, and then it starts updating and that takes what I would expect (15-20 seconds)
how would I check how big my statefile is?
w
I think it’s `pulumi stack export`. Sorry, I’m on mobile right now or I’d check the CLI help.
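Once you have an export, a rough way to see what is taking up space is to tally resources by type. This is only a sketch: it assumes the usual export shape (a top-level `deployment` object with a `resources` array whose entries carry `urn` and `type`), and the `summarizeByType` helper and the sample URNs are made up for illustration.

```typescript
// Minimal shape of a stack export; only the fields this sketch reads.
interface ExportedResource {
  urn: string;
  type: string;
  [key: string]: unknown;
}

interface ExportedStack {
  version: number;
  deployment: { resources?: ExportedResource[] };
}

interface TypeSummary {
  type: string;
  count: number;
  approxBytes: number; // length of the re-serialized JSON, not exact bytes on disk
}

// Group resources by type and total their serialized size, largest first.
function summarizeByType(stack: ExportedStack): TypeSummary[] {
  const totals = new Map<string, TypeSummary>();
  for (const res of stack.deployment.resources ?? []) {
    const entry = totals.get(res.type) ?? { type: res.type, count: 0, approxBytes: 0 };
    entry.count += 1;
    entry.approxBytes += JSON.stringify(res).length;
    totals.set(res.type, entry);
  }
  return Array.from(totals.values()).sort((a, b) => b.approxBytes - a.approxBytes);
}

// Tiny hand-written sample (hypothetical URNs) standing in for a real export.
const sample: ExportedStack = {
  version: 3,
  deployment: {
    resources: [
      { urn: "urn:pulumi:dev::demo::gcp:pubsub/topic:Topic::events", type: "gcp:pubsub/topic:Topic" },
      {
        urn: "urn:pulumi:dev::demo::kubernetes:apps/v1:Deployment::api",
        type: "kubernetes:apps/v1:Deployment",
        outputs: { spec: { replicas: 2 } },
      },
      {
        urn: "urn:pulumi:dev::demo::kubernetes:apps/v1:Deployment::worker",
        type: "kubernetes:apps/v1:Deployment",
        outputs: { spec: { replicas: 1 } },
      },
    ],
  },
};

console.log(summarizeByType(sample));
```

For a real stack you would write the export to a file with `pulumi stack export --file state.json`, parse it with `JSON.parse`, and pass the result to `summarizeByType` in place of `sample`. If one type (CRDs, for example) dominates the list, that is a hint about where the state size is coming from.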
j
So it's 27k lines and it took a couple of seconds to run it
w
How big is the file? In KB or MB
j
2.2M
w
So, you might benefit from the checkpoint diff work. Or, PULUMI_SKIP_CHECKPOINTS. Be very careful with skipping checkpoints. If Pulumi is interrupted, you’ll have to manually delete your resources and start again.
The “Checkpoint diff” feature is automatically enabled after a certain file size. I forget what the number is… The default algorithm sends the entire statefile after every resource update. The diffing algorithm calculates the textual diff of the state and sends only that, so it’s better on bigger statefiles. It only recently left the experimental phase.
Chatting with the team, the diffing algorithm automatically kicks in after 1MB. Unfortunately, you can't set the threshold any lower. So you should see improved perf after the 1MB mark, but nothing to be done before that. You can still SKIP_CHECKPOINTS or share a program trace.
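As a back-of-the-envelope illustration of why the diffing matters on a statefile this size: if every checkpoint re-sends the whole file, the upload grows with state size times the number of checkpoints, while diffs only add up to the size of the changes. The numbers below are assumptions for the sake of the arithmetic (the checkpoint count and the per-diff size are hypothetical, only the ~2.2 MB figure comes from this thread), and the two helper functions are not part of Pulumi.

```typescript
// Bytes uploaded if every checkpoint re-sends the entire state file.
function fullCheckpointBytes(stateBytes: number, checkpoints: number): number {
  return stateBytes * checkpoints;
}

// Bytes uploaded if each checkpoint sends only a diff of the given size.
function diffCheckpointBytes(diffSizes: number[]): number {
  return diffSizes.reduce((total, size) => total + size, 0);
}

const stateBytes = 2_200_000; // roughly the 2.2 MB reported above
const checkpoints = 10; // hypothetical number of checkpoint writes in one update
const diffSizes: number[] = Array(checkpoints).fill(5_000); // hypothetical ~5 KB per diff

console.log("full state each time:", fullCheckpointBytes(stateBytes, checkpoints), "bytes");
console.log("diffs only:", diffCheckpointBytes(diffSizes), "bytes");
```

The real numbers depend on how many checkpoints an update writes and how much of the state each one touches, which is why a trace is still the better way to see where the time goes.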
j
The other day I was travelling and noticed that on my laptop `pulumi up` takes a couple of minutes and preview is almost instant, maybe 10-15 seconds, while on my home computer it takes easily ten minutes total and around 5 to show the preview. So I started wondering if there was some cache that got corrupted, but looking around the filesystem I couldn't find anything that resembled cache for pulumi
w
Holy cow, you should not be experiencing variation in performance on the order of minutes. 😱 Am I reading this correctly?
Preview, Home Computer: 5 minutes
Up, Home Computer: 10+ minutes
Preview, Laptop: A few seconds
Up, Laptop: A few minutes
j
yes, correct, it surprised me, too
w
Yeah, I'd check out the traces there. I've never seen a performance differential like that.