My `pulumi preview` has become incredibly slow as ...
# general
m
My `pulumi preview` has become incredibly slow as soon as a `k8s.yaml.ConfigFile` resource is instantiated. Comment it out: quick. Uncomment: slow. I have been using lots of these, and only just recently encountered this issue. They did not previously cause any slowness. Any ideas what could have changed or be causing this?
There is a large amount of time between when the interactive console shows:
 +      └─ nirvana:k8s-external-dns           external-dns      create
 +         └─ kubernetes:yaml:ConfigFile      external-dns.yml  create
and when the ConfigFile is expanded with its children:
 +      └─ nirvana:k8s-external-dns                                       external-dns         create
 +         └─ kubernetes:yaml:ConfigFile                                  external-dns.yml     create
 +            ├─ kubernetes:core:ServiceAccount                           external-dns         create
 +            ├─ kubernetes:rbac.authorization.k8s.io:ClusterRole         external-dns         create
 +            ├─ kubernetes:extensions:Deployment                         external-dns         create
 +            └─ kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding  external-dns-viewer  create
i
I've noticed something similar, but I just assumed it was some intrinsic fact rather than a problem. Subscribing to this thread. 😉
c
Same here.
Though we started with ConfigFile from our first test, so we had no idea there was a performance issue. I just thought Pulumi was a bit slow.
m
I think I just figured it out! At least in my case, this is because the K8S provider being used was no longer valid (the cluster did not exist anymore).
c
Hmm. Maybe our case is that there are ~30 resources in the yaml so it takes some time to parse.
I was a bit hopeful when I saw your message.
m
It seems lots of calls to the cluster are made during the expansion, and though it seems to work with the invalid provider, they must time out first or something.
I suppose it might depend on how slow. This was taking minutes for expansion.
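One way to rule out the stale-cluster case described above is to check that the API server named in the kubeconfig still accepts connections before running a preview. The following is a minimal sketch, not anything Pulumi does itself: `clusterServers` and `isReachable` are hypothetical helper names, the regex stands in for a proper YAML parser, and the 3 second timeout is an arbitrary assumption.

```typescript
import * as net from "node:net";

// Hypothetical helper: pull every `server:` URL out of kubeconfig text.
// A plain regex keeps the sketch dependency-free; a real tool would parse the YAML.
function clusterServers(kubeconfig: string): string[] {
  const servers: string[] = [];
  for (const line of kubeconfig.split(/\r?\n/)) {
    const match = /^\s*server:\s*["']?([^"'\s#]+)/.exec(line);
    if (match) servers.push(match[1]);
  }
  return servers;
}

// Hypothetical helper: resolves to true if a TCP connection to the API server
// opens within `timeoutMs` (assumed default), false on error or timeout.
function isReachable(server: string, timeoutMs = 3000): Promise<boolean> {
  const url = new URL(server);
  const host = url.hostname.replace(/^\[|\]$/g, ""); // strip IPv6 brackets
  const port = url.port ? Number(url.port) : url.protocol === "http:" ? 80 : 443;
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (ok: boolean) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs, () => finish(false));
    socket.once("connect", () => finish(true));
    socket.once("error", () => finish(false));
  });
}
```

If `isReachable` resolves to `false` for the cluster a provider points at, slow expansion caused by calls timing out is a plausible explanation. A `true` result only shows the port is open, not that the credentials are valid.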
c
Ah. I don’t think ours is that slow. Maybe 30-40 seconds?
m
even that seems suspect. Here is an example I have now with a mix of `ConfigFile`, `ConfigGroup` and (helm) `Chart`:
❯ time pulumi preview
...
Resources:
    + 129 to create
    8 unchanged

Permalink: XXX
pulumi preview  7.83s user 1.29s system 110% cpu 8.265 total
Feels pretty snappy for the amount of resources that spew out in the plan
c
Resources:
    ~ 1 to update
    34 unchanged

       47.91 real         4.58 user         1.10 sys
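The two timings posted in this thread differ in more than total duration: comparing wall-clock time with CPU time (user plus sys) hints at whether a run is computing or waiting. The sketch below only redoes that arithmetic on the numbers quoted above. The `Timing` type and function names are made up for illustration, and the off-CPU figure is an approximation, since a multi-threaded process can accumulate more CPU time than wall time.

```typescript
// Illustrative type for the three figures reported by `time`.
interface Timing {
  real: number; // wall-clock seconds
  user: number; // CPU seconds in user mode
  sys: number;  // CPU seconds in kernel mode
}

// Seconds not accounted for by CPU work (waiting on disk, network, etc.).
function offCpuSeconds(t: Timing): number {
  return t.real - (t.user + t.sys);
}

// CPU time as a fraction of wall-clock time.
function cpuShare(t: Timing): number {
  return (t.user + t.sys) / t.real;
}

// Figures copied from the two runs quoted in the thread.
const slowRun: Timing = { real: 47.91, user: 4.58, sys: 1.1 };
const fastRun: Timing = { real: 8.265, user: 7.83, sys: 1.29 };

console.log(offCpuSeconds(slowRun).toFixed(2));         // 42.23
console.log((cpuShare(slowRun) * 100).toFixed(0) + "%"); // 12%
console.log((cpuShare(fastRun) * 100).toFixed(0) + "%"); // 110%
```

On these numbers the 47.91 second run spends roughly 42 seconds off the CPU, which is more consistent with waiting on something (such as network calls) than with parsing YAML, while the 8 second run is CPU-bound.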
m
Yeah I would expect you're encountering a bug. Are you either providing a `K8SProvider` to the resources via some fashion, or have a default that is accessible?
c
Not sure what you mean
This is via the KUBECONFIG way.
The index.ts is basically just `new k8s.yaml.ConfigFile('filename');`
Though we are applying some transformations to the yaml
m
If the cluster is responsive via `KUBECONFIG=X kubectl` then it must be some other issue
c
We’ve not had perf issues with kubectl itself, so I do believe it’s pulumi.
c
It’s possible it is an issue with our Kubernetes library. An issue we’ve seen recently is that if the underlying resource provider spews a lot of diagnostic data (e.g. sitting in a tight loop reporting the same error), it can send a lot of error diagnostics to our service (e.g. captured in your update logs). @creamy-potato-29402, is this something you could look into?
c
How much slower?
There are a lot of reasons this might happen — we’re synchronously reading files from disk, which takes a long time in node; the provider will need to call the API server many times if there are a lot of resource definitions; could be some other weirdness with the engine.
So questions are:
• How many resource definitions are there in the YAML?
• How much longer does it take?
• What is the behavior at the CLI? Do all the resource operations show up at once, or are they 1 by 1?
• Does this happen during preview?
also cc @gorgeous-egg-16927
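To answer the first question without counting by hand, the documents in a multi-document YAML file can be tallied by splitting on `---` separator lines. This is a rough sketch under simple assumptions: `countYamlDocuments` is a hypothetical helper, it treats only a bare `---` line as a separator, and it counts a `List` object as a single definition even though it may expand to several resources.

```typescript
// Hypothetical helper: count YAML documents that contain at least one line
// that is neither blank nor a comment.
function countYamlDocuments(text: string): number {
  return text
    .split(/^---[ \t]*$/m) // split on bare document-separator lines
    .filter((doc) =>
      doc.split(/\r?\n/).some((line) => {
        const trimmed = line.trim();
        return trimmed !== "" && !trimmed.startsWith("#");
      })
    ).length;
}

const sampleYaml = [
  "apiVersion: v1",
  "kind: ServiceAccount",
  "---",
  "# a document holding only a comment is skipped",
  "---",
  "apiVersion: v1",
  "kind: Service",
  "",
].join("\n");

console.log(countYamlDocuments(sampleYaml)); // 2
```

Reading the manifest with `fs.readFileSync(path, "utf8")` and passing the text to this function gives a quick number to include in an issue report.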
c
@creamy-potato-29402 if you look up a little bit in this thread you’ll see my snippet with the resource count and time. And it is in preview.
w
For anyone hitting this - could you open an issue with a few details of the problem? We can use that to investigate and fix this.
c
Sure. I will do so on Monday. I honestly don’t know if there is an issue or if it’s just going to take that long because of kubernetes.
w
It's possible - but given the descriptions above I feel like there might be something in Pulumi that is contributing here that we could improve on. Thanks for opening an issue to track!