# general
c
hi, trying to `pulumi up` my prod environment after not touching it for a while, and it is hanging (100% cpu on python process) in preview. I believe it is related to the large (alb and cert-manager) `k8s.yaml.ConfigFile` resources I have. Everything has been fine for many months, but since the last time I ran it, pulumi, python, pulumi-kubernetes, and my laptop (new m1) have gone through many updates (which I have just applied). I've tried logging as suggested on the troubleshooting page, but can't see anything interesting. `pulumi refresh` seems to work ok. `aws cli` and `kubectl` are connecting. If I comment out the `ConfigFile` resource, then preview completes normally (and offers to delete my resources). I'm out of ideas. This is prod and I don't want to delete anything. Any ideas? The last line of the log is:
I0114 22:12:30.762818   83092 eventsink.go:62] eventSink::Debug(<{%reset%}>resource registration successful: ty=kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition, urn=urn:pulumi:xxxx.prod::xxxx::kubernetes:yaml:ConfigFile$kubernetes:apiextensions.k8s.io/v1:CustomResourceDefinition::certificaterequests.cert-manager.io<{%reset%}>)
e
This sounds suspiciously similar to https://github.com/pulumi/pulumi-kubernetes/issues/1731, which was closed back in October. As in the last comment on that thread, can you make sure all your pulumi dependencies are up to date? If you're still having issues, raise a new issue at https://github.com/pulumi/pulumi-kubernetes/issues/ referencing the other issue.
c
v
Hi, I'm facing the same issue. Sometimes it even crashes with a huge traceback that ends with:
File "/Users/danylo/projects/python/surus-pulumi/venv/lib/python3.8/site-packages/pulumi/runtime/resource.py", line 853, in _resolve_depends_on_urns
        return await _expand_dependencies(all_deps, from_resource)
      File "/Users/danylo/projects/python/surus-pulumi/venv/lib/python3.8/site-packages/pulumi/runtime/resource.py", line 824, in _expand_dependencies
        await _add_dependency(urns, d, from_resource)
      File "/Users/danylo/projects/python/surus-pulumi/venv/lib/python3.8/site-packages/pulumi/runtime/resource.py", line 804, in _add_dependency
        await _add_dependency(deps, child, from_resource)
      File "/Users/danylo/projects/python/surus-pulumi/venv/lib/python3.8/site-packages/pulumi/runtime/resource.py", line 804, in _add_dependency
        await _add_dependency(deps, child, from_resource)
      File "/Users/danylo/projects/python/surus-pulumi/venv/lib/python3.8/site-packages/pulumi/runtime/resource.py", line 803, in _add_dependency
        for child in res._childResources:
    RuntimeError: Set changed size during iteration
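For context, this class of error can be reproduced outside Pulumi. The sketch below is not Pulumi's code: `Node`, `walk`, and `demo` are hypothetical names, and the "late" child stands in for a resource that gets registered while the dependency walk is still iterating its parent's children. It only illustrates why looping over a live `set` fails when the set grows mid-loop, and why looping over a snapshot does not.

```python
class Node:
    """Hypothetical stand-in for a resource with a mutable set of children."""

    def __init__(self, name):
        self.name = name
        self.children = set()


def walk(node, visit, snapshot):
    """Visit every descendant of node, depth first.

    With snapshot=True the loop runs over a copy of the child set, so
    children added during the walk cannot invalidate the iterator.
    """
    children = list(node.children) if snapshot else node.children
    for child in children:
        visit(child)
        walk(child, visit, snapshot)


def demo(snapshot):
    root = Node("root")
    root.children.add(Node("a"))
    seen = []

    def visit(node):
        seen.append(node.name)
        if node.name == "a":
            # Simulates a child being registered while the walk is running.
            root.children.add(Node("late"))

    walk(root, visit, snapshot)
    return seen
```

Calling `demo(False)` raises `RuntimeError: Set changed size during iteration`, while `demo(True)` completes. The child added mid-walk is simply not visited in that pass.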
e
@victorious-continent-984 can you raise an issue at https://github.com/pulumi/pulumi/issues with more information about that?
c
Sorry, I'm back again. This issue is still not resolved for me. I have updated pulumi and libraries to current releases, but python still hangs. I managed to attach a python debugger to the process and it seems to get stuck forever in `grpc/protobuf` code. I tried stepping through but the stack was 50 deep and just low level serialization code. This is the yaml I'm trying to apply: https://github.com/cert-manager/cert-manager/releases/download/v1.8.0/cert-manager.crds.yaml
If I comment out half of it, it seems fine. If I put some back in, it hangs. But it doesn't seem to matter which bit I comment out, more so the amount. To recap: this used to work on my Intel mac. It works now on a debian arm64 VM running on my m1 mac. It hangs on the mac using python 3.9 arm64 build.
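Since the hang seems to depend on how much of the file is applied rather than on which part, one way to find the threshold without hand-editing is to cut the multi-document YAML into its documents and write out the first N. A minimal sketch using only the standard library; `split_documents` and `first_n` are hypothetical helper names, and it assumes documents are separated by lines containing only `---` (a `---` line inside a block scalar would be split incorrectly).

```python
import re


def split_documents(text):
    """Split multi-document YAML on lines that consist only of '---'."""
    parts = re.split(r"(?m)^---[ \t]*$", text)
    # Drop empty fragments, e.g. the one before a leading '---'.
    return [part.strip("\n") for part in parts if part.strip()]


def first_n(text, n):
    """Return a YAML string containing only the first n documents."""
    docs = split_documents(text)
    return "\n---\n".join(docs[:n]) + "\n"
```

Writing `first_n(text, n)` to a temporary file for increasing `n`, and pointing the `ConfigFile` resource at it in a throwaway stack, would show roughly where preview starts to hang.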
e
If you're still able to repro, it might help to add debug logs to the ticket. Run with `--logflow --logtostderr -v9`.
c
I've put the log up.
I've got a reproducible script now. Can I escalate this issue somehow? It's a showstopper for me at the moment.
e
That script will help. I'll have a quick look now.
Ah, I need an M1 to test this. I'll flag it to the providers team; I'm sure at least one of them has an M1 and can take a look at it.
c
I've resolved the issue for now. It's not a pulumi root cause, but it certainly is manifested in pulumi. https://github.com/pulumi/pulumi-kubernetes/issues/1867#issuecomment-1107353905