great-sundown-78827
04/17/2024, 1:02 AM
We are trying to figure out why our pulumi up takes so long to run. When we run pulumi refresh it takes ~1m or so to get the data back - and that feels reasonable, given that we have a few hundred resources. But when we then run a pulumi up where even only 1 resource changes, the run takes over 9 minutes. The primary API we are interfacing with is the PagerDuty API, via the Pulumi PagerDuty provider.
I’ve run pulumi up --tracing… and I have some trace files. Using the AppDash viewer is a little confusing though, and I am not sure how to really get much use out of it. The one thing I see is the Profile View at the bottom, sorting by `Cumulative Time(ms)`.

When I do that, I get a MASSIVE output of RegisterResource calls … like … thousands and thousands of lines. We don’t have thousands of resources. There are over 11,500 lines in this output: 932 RegisterResourceOutputs calls and 1864 RegisterResource calls.
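(For reference, the trace above was captured and viewed roughly like this - the trace file name is illustrative, and view-trace is, I believe, a hidden command gated behind PULUMI_DEBUG_COMMANDS:)

% pulumi up --tracing=file:./up.trace
% PULUMI_DEBUG_COMMANDS=true pulumi view-trace ./up.trace   # serves the AppDash UI on a local port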
% cat pulumi.cumulative.out | awk '{print $1}' | sort | uniq -c | sort -n
1 /pulumirpc.Engine/SetRootResource
1 execNodejs
1 locateModule
1 pf.CheckConfig
1 pf.Configure
1 pf.ValidateProviderConfig
1 pulumi-plan
1 sdkv2.CheckConfig
1 sdkv2.Configure
1 sdkv2.GetPluginInfo
1 sdkv2.ValidateProviderConfig
2 /pulumirpc.LanguageRuntime/GetPluginInfo
2 /pulumirpc.LanguageRuntime/GetRequiredPlugins
2 /pulumirpc.LanguageRuntime/Run
2 Cumulative
2 Name
3 newPlugin
4 /pulumirpc.ResourceProvider/CheckConfig
4 /pulumirpc.ResourceProvider/Configure
4 /pulumirpc.ResourceProvider/DiffConfig
4 /pulumirpc.ResourceProvider/GetPluginInfo
4 Time
12 /pulumirpc.ResourceMonitor/SupportsFeature
130 sdkv2.Invoke
260 /pulumirpc.ResourceMonitor/Invoke
260 /pulumirpc.ResourceProvider/Invoke
459 sdkv2.Check
459 sdkv2.Diff
928 /pulumirpc.ResourceProvider/Check
928 /pulumirpc.ResourceProvider/Diff
932 /pulumirpc.ResourceMonitor/RegisterResourceOutputs
1864 /pulumirpc.ResourceMonitor/RegisterResource
5240 /pulumirpc.Engine/Log
I don’t see much detail though in the trace view… is there something I am missing in order to figure out why we see so many calls? Also, it feels like the calls get cumulatively slower as we go. While looking at the --debug --verbose=7 logs, I would swear that the pulumi up command is making S3 calls after each and every resource update. Is that possible?

great-sundown-78827
04/17/2024, 1:11 AM
Grepping those logs for state.go and the noisy string "error copying", we get this:
I0416 17:55:02.213363 33154 state.go:342] error copying .pulumi/stacks/ourapp/production.json.gz to .pulumi/stacks/ourapp/production.json.gz.bak: blob (key ".pulumi/stacks/ourapp/production.json.gz -> .pulumi/stacks/ourapp/production.json.gz.bak") (code=NotFound): NoSuchKey: The specified key does not exist.
I0416 17:55:03.660768 33154 state.go:342] error copying .pulumi/stacks/ourapp/production.json.gz to .pulumi/stacks/ourapp/production.json.gz.bak: blob (key ".pulumi/stacks/ourapp/production.json.gz -> .pulumi/stacks/ourapp/production.json.gz.bak") (code=NotFound): NoSuchKey: The specified key does not exist.
I0416 17:55:04.526660 33154 state.go:342] error copying .pulumi/stacks/ourapp/production.json.gz to .pulumi/stacks/ourapp/production.json.gz.bak: blob (key ".pulumi/stacks/ourapp/production.json.gz -> .pulumi/stacks/ourapp/production.json.gz.bak") (code=NotFound): NoSuchKey: The specified key does not exist.
I0416 17:55:05.615803 33154 state.go:342] error copying .pulumi/stacks/ourapp/production.json.gz to .pulumi/stacks/ourapp/production.json.gz.bak: blob (key ".pulumi/stacks/ourapp/production.json.gz -> .pulumi/stacks/ourapp/production.json.gz.bak") (code=NotFound): NoSuchKey: The specified key does not exist....
I0416 18:03:56.379596 33154 state.go:342] error copying .pulumi/stacks/ourapp/production.json.gz to .pulumi/stacks/ourapp/production.json.gz.bak: blob (key ".pulumi/stacks/ourapp/production.json.gz -> .pulumi/stacks/ourapp/production.json.gz.bak") (code=NotFound): NoSuchKey: The specified key does not exist.
I0416 18:03:57.366905 33154 state.go:342] error copying .pulumi/stacks/ourapp/production.json.gz to .pulumi/stacks/ourapp/production.json.gz.bak: blob (key ".pulumi/stacks/ourapp/production.json.gz -> .pulumi/stacks/ourapp/production.json.gz.bak") (code=NotFound): NoSuchKey: The specified key does not exist.
I0416 18:03:59.225002 33154 state.go:342] error copying .pulumi/stacks/ourapp/production.json.gz to .pulumi/stacks/ourapp/production.json.gz.bak: blob (key ".pulumi/stacks/ourapp/production.json.gz -> .pulumi/stacks/ourapp/production.json.gz.bak") (code=NotFound): NoSuchKey: The specified key does not exist.
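(For anyone following along, a log capture like the above can be reproduced with something along these lines - the log file name is illustrative:)

% pulumi up --debug --verbose=7 --logtostderr 2> pulumi-up.log
% grep -c 'error copying' pulumi-up.log   # roughly one backup-copy attempt per resource step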
echoing-dinner-19531
04/17/2024, 7:36 AM
> I would swear that the pulumi up command is making S3 calls after each and every resource update
Yeah, you might want to enable checkpoint skipping: https://www.pulumi.com/blog/pulumi-release-notes-80/#skip-checkpoints-experimental-flag. I don’t think this needs EXPERIMENTAL set anymore; just PULUMI_SKIP_CHECKPOINTS=true will skip all but the final checkpoint write, so it should be much faster, at the cost that if something causes your pulumi process to be hard-killed halfway through you lose all track of what had been done so far.
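(A minimal sketch of the suggestion, with the trade-off spelled out:)

# Skip per-step checkpoint writes to the state backend; only the final state is persisted.
# Trade-off: if the process is hard-killed mid-run, the record of completed steps is lost.
% PULUMI_SKIP_CHECKPOINTS=true pulumi up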
great-sundown-78827
04/17/2024, 1:10 PM
Resources:
    ~ 1 updated
    933 unchanged

Duration: 9s

^^ that made a difference!
better-lunch-24345
04/18/2024, 1:37 PM
PULUMI_SKIP_CHECKPOINTS=true ?
Resources:
    ~ 1 to update
    616 unchanged

Do you want to perform this update? yes

Updating (live):

     Type                              Name                                    Status         Info
     pulumi:pulumi:Stack               upcoupons-live
     └─ gcp:utils:UpfeatAlert          live-alert-backend-requests-component
 ~   └─ gcp:monitoring:AlertPolicy     live-alert-backend-requests             updated (12s)  [diff: ~conditions]

Resources:
    ~ 1 updated
    616 unchanged

Duration: 6m40s