b

big-account-56668

07/28/2022, 3:17 PM
We've just had Pulumi want to recreate a slew of resources, seemingly out of the blue, because it has decided to change some automatically generated attributes. For example, the hash that it automatically appends to, say, a Cloud Run service name is now different, so it wants to replace the resource. The affected resource types are varied and span three providers: AWS, GCP, and GCP native. This happened just after applying changes to the stack: immediately afterwards there was no diff, but slightly later, on the order of minutes to an hour, a diff suddenly appeared. I can't think of any other factors that would have caused this, and it has happened twice in short succession. Replacing the resources would cause downtime in our production system. The first time we were able to accept the downtime as part of scheduled maintenance, but now we need to figure out what is going on. Does anyone have any suggestions on how to debug this further?
e

echoing-dinner-19531

07/28/2022, 3:25 PM
the hash that it automatically adds to say, a Cloud Run service name
Is this actually the thing changing? That's supposed to look at the current state and use the existing name as is if it's present, so not sure why it would want to change that.
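To make the behaviour described above concrete, here is a minimal sketch of the idea: reuse the name recorded in state when there is one, and only generate a fresh suffixed name when there is not. This is not Pulumi's actual implementation. The function names, the `randomSuffix` parameter, and the 7-character hex suffix (chosen to match names like `name-acd63ea` in this thread) are all assumptions made for illustration.

```typescript
// Hypothetical illustration only; not the engine's or the providers' real code.

// Generates a 7-character lowercase hex suffix (length assumed from the thread).
function defaultSuffix(): string {
  let suffix = "";
  for (let i = 0; i < 7; i++) {
    suffix += Math.floor(Math.random() * 16).toString(16);
  }
  return suffix;
}

// Returns the physical name to use for a resource.
// recordedInputName: the name stored in the state's inputs, if any.
function resolvePhysicalName(
  logicalName: string,
  recordedInputName: string | undefined,
  randomSuffix: () => string = defaultSuffix,
): string {
  if (recordedInputName !== undefined) {
    // A name is already recorded: keep it, so no replacement is needed.
    return recordedInputName;
  }
  // Nothing recorded: a fresh name is generated, which differs from the old one.
  return `${logicalName}-${randomSuffix()}`;
}

console.log(resolvePhysicalName("name", "name-acd63ea"));
console.log(resolvePhysicalName("name", undefined));
```

Under this model, a resource whose recorded name has gone missing gets a new suffix on the next preview, which would show up as a replace.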
b

big-account-56668

07/28/2022, 3:31 PM
Here's a sample diff output
+-gcp:cloudrun/service:Service: (replace)
        [id=locations/europe-west1/namespaces/x/services/name-acd63ea]
        [urn=urn:pulumi:x::x::gcp:cloudrun/service:Service::name]
      ~ name: "name-acd63ea" => "name-936a38b"
I suppose this would cascade to the places where it is used, like PubSub push endpoints, which use the Cloud Run generated URL containing the service name, or an AWS SNS subscription. At second glance, that does indeed explain why there are so many modifications. Let me have a closer look to see whether it is only the Cloud Run services that diff without the change coming from another resource they depend on.
e

echoing-dinner-19531

07/28/2022, 3:51 PM
If you do a `pulumi stack export`, can you find the details for that resource and check whether the name is in the "inputs" details?
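For a stack with many resources, this check can be scripted. The sketch below assumes the export is a JSON document with resources listed under `deployment.resources`, each carrying `urn`, `inputs`, and `outputs`. The type and function names are hypothetical, and the sample data is illustrative (the second resource is invented for contrast).

```typescript
// Only the parts of an exported state that this sketch reads (assumed layout).
interface ExportedResource {
  urn: string;
  inputs?: Record<string, unknown>;
  outputs?: Record<string, unknown>;
}

interface StackExport {
  deployment: { resources?: ExportedResource[] };
}

// Returns the URNs of resources that have `property` in outputs but not in inputs.
function findMissingInputs(state: StackExport, property: string): string[] {
  const resources = state.deployment.resources ?? [];
  return resources
    .filter((r) => r.outputs !== undefined && property in r.outputs)
    .filter((r) => r.inputs === undefined || !(property in r.inputs))
    .map((r) => r.urn);
}

// Illustrative data: the first resource mirrors the symptom in this thread,
// the second is a made-up healthy resource.
const sampleState: StackExport = {
  deployment: {
    resources: [
      {
        urn: "urn:pulumi:x::x::gcp:cloudrun/service:Service::name",
        inputs: { location: "europe-west1" },
        outputs: { name: "name-acd63ea", location: "europe-west1" },
      },
      {
        urn: "urn:pulumi:x::x::gcp:compute/globalAddress:GlobalAddress::healthy",
        inputs: { name: "healthy-0000000" },
        outputs: { name: "healthy-0000000" },
      },
    ],
  },
};

console.log(findMissingInputs(sampleState, "name"));
```

In practice the state would come from parsing a file written by `pulumi stack export --file state.json` rather than from an inline object. The property to check varies by resource type, for example `bucket` for an S3 bucket or `statementId` for a Lambda permission.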
b

big-account-56668

07/28/2022, 3:55 PM
The name is not in the inputs, no.
e

echoing-dinner-19531

07/28/2022, 4:01 PM
Well, that'll be why it wants to recreate them. Although that just leads to the next question of why they would be missing from inputs.
By any chance, have you run a refresh on this stack?
b

big-account-56668

07/28/2022, 4:02 PM
I have yes.
e

echoing-dinner-19531

07/28/2022, 4:03 PM
And you're seeing this with aws, gcp, and gcp-native?
b

big-account-56668

07/28/2022, 4:04 PM
I have to look closer, but I'm definitely seeing this on aws and gcp. We're only using gcp-native for select things, so would have to verify if it is indeed on resources managed with that.
e

echoing-dinner-19531

07/28/2022, 4:05 PM
Could be a tfbridge issue if it's just on aws and gcp. I'll see if anything's changed there recently.
b

big-account-56668

07/28/2022, 4:08 PM
The following resources have the same issue:
+-gcp:cloudrun/iamMember:IamMember: (replace)
        [id=v1/projects/project123/locations/europe-west2/services/service-b6d985b/roles/run.invoker/allUsers]
        [urn=urn:pulumi:x::x::gcp:cloudrun/iamMember:IamMember::invoke-permission]
      ~ service: "v1/projects/project123/locations/europe-west2/services/service-b6d985b" => "service-85343cb"
+-aws:iam/role:Role: (replace)
        [id=role-a107c15]
        [urn=urn:pulumi:x::x::aws:iam/role:Role::role]
      ~ name: "role-a107c15" => "role-1baf75a"
+-aws:iam/rolePolicyAttachment:RolePolicyAttachment: (replace)
        [id=role-a107c15-20220727175101879400000001]
        [urn=urn:pulumi:x::x::aws:iam/rolePolicyAttachment:RolePolicyAttachment::role]
      ~ role: "role-a107c15" => "role-1baf75a"
+-gcp:compute/targetHttpProxy:TargetHttpProxy: (replace)
        [id=projects/project123/global/targetHttpProxies/target-proxy-f1ccc22]
        [urn=urn:pulumi:x::x::gcp:compute/targetHttpProxy:TargetHttpProxy::target-proxy]
      ~ name: "target-proxy-f1ccc22" => "target-proxy-0c5368b"
+-gcp:compute/managedSslCertificate:ManagedSslCertificate: (replace)
        [id=projects/project123/global/sslCertificates/managed-cert-02bbda3]
        [urn=urn:pulumi:x::x::gcp:compute/managedSslCertificate:ManagedSslCertificate::managed-cert]
      ~ name: "managed-cert-02bbda3" => "managed-cert-1c52b92"
+-gcp:compute/globalAddress:GlobalAddress: (replace)
        [id=projects/project123/global/addresses/ip-86578ac]
        [urn=urn:pulumi:x::x::gcp:compute/globalAddress:GlobalAddress::ip]
      ~ name: "ip-86578ac" => "ip-bd66070"
I don't see any gcp-native resources listed, so only gcp and aws then.
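When triaging a long diff like this, it can help to separate changes where only the trailing hash differs from changes where the value has a different shape altogether. The following is a rough heuristic, not part of any Pulumi tooling. The function name is hypothetical, and the pattern assumes a 7-character lowercase hex suffix as seen in the names above.

```typescript
// Matches "<prefix>-<7 hex chars>" (suffix length assumed from the diff above).
const AUTO_NAME_PATTERN = /^(.+)-([0-9a-f]{7})$/;

// True when both values share the same prefix and differ only in the hex suffix.
function looksLikeRegeneratedSuffix(before: string, after: string): boolean {
  const b = AUTO_NAME_PATTERN.exec(before);
  const a = AUTO_NAME_PATTERN.exec(after);
  if (b === null || a === null) {
    return false;
  }
  return b[1] === a[1] && b[2] !== a[2];
}

// Values taken from the diff above.
console.log(looksLikeRegeneratedSuffix("ip-86578ac", "ip-bd66070"));
console.log(
  looksLikeRegeneratedSuffix(
    "v1/projects/project123/locations/europe-west2/services/service-b6d985b",
    "service-85343cb",
  ),
);
```

The heuristic only compares two strings. It cannot tell whether a property such as `role` on the `RolePolicyAttachment` changed on its own or merely followed a renamed resource it refers to, so the results still need to be read alongside the resource dependencies.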
e

echoing-dinner-19531

07/28/2022, 4:11 PM
Just testing something quickly
See if I can repro this easily
hmm ok refresh alone doesn't seem to be the issue at least
b

big-account-56668

07/28/2022, 4:16 PM
Might have also used `up --refresh`, in case there is a difference, and `preview` at some point.
e

echoing-dinner-19531

07/28/2022, 4:17 PM
~ service: "v1/projects/project123/locations/europe-west2/services/service-b6d985b" => "service-85343cb"
That's the most suspicious change. Do your other resources depend on that IamMember in some way? It might be that just that one changed and triggered everything else to do a replace.
b

big-account-56668

07/28/2022, 4:21 PM
Not really, as far as I'm aware. There are 10 or so Cloud Run services and it wants to replace them all. Only some of those have the role applied, to allow public access.
Perhaps this is already clear, but the names are unchanged in the respective provider consoles. That is, the Cloud Run service names there match the current values that Pulumi wants to change.
e

echoing-dinner-19531

07/28/2022, 4:25 PM
? the diff doesn't look like that
"v1/projects/project123/locations/europe-west2/services/service-b6d985b" => "service-85343cb"
That's a very different name, and a different hash ending
b

big-account-56668

07/28/2022, 4:28 PM
True. Let me query using `gcloud` to see what it says for that particular one. In the case of `~ name: "ip-86578ac" => "ip-bd66070"`, the GCP console currently has it as `ip-86578ac`, and a refresh does not show a diff on that name.
e

echoing-dinner-19531

07/28/2022, 4:29 PM
but after refresh the name is still missing from "inputs" in the stack export?
b

big-account-56668

07/28/2022, 4:30 PM
I've avoided running it again until we can figure this out, but can try it now.
e

echoing-dinner-19531

07/28/2022, 4:30 PM
I don't think it could possibly hurt at this point
b

big-account-56668

07/28/2022, 4:31 PM
It has not helped for any of the affected resources, no.
e

echoing-dinner-19531

07/28/2022, 4:31 PM
But the names are present in the output sections of the stack export?
b

big-account-56668

07/28/2022, 4:34 PM
They are, yes.
e

echoing-dinner-19531

07/28/2022, 4:36 PM
Can you try fixing this manually? Export the state to a file, edit the input blocks to re-add the names, then use `pulumi stack import` to import that state. You can do this for just one resource to start, and then check that `refresh` doesn't remove the name. If that's OK, you can then fix up the other names and try the up.
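The edit step can be sketched as a small function that copies one property from a resource's outputs back into its inputs. This is an illustration under assumptions, not an official repair tool: the type and function names are hypothetical, and the export is assumed to list resources under `deployment.resources`. Keep an untouched copy of the exported file before importing anything.

```typescript
// Only the parts of an exported state that this sketch touches (assumed layout).
interface EditableResource {
  urn: string;
  inputs?: Record<string, unknown>;
  outputs?: Record<string, unknown>;
}

interface EditableState {
  deployment: { resources?: EditableResource[] };
}

// Copies `property` from outputs to inputs for the resource with the given URN.
// Returns true if the state was modified, false if nothing was changed.
function restoreInputFromOutput(
  state: EditableState,
  urn: string,
  property: string,
): boolean {
  const resource = (state.deployment.resources ?? []).find((r) => r.urn === urn);
  if (resource === undefined) {
    return false;
  }
  const outputs = resource.outputs;
  if (outputs === undefined || !(property in outputs)) {
    return false; // nothing to copy from
  }
  const inputs = resource.inputs ?? {};
  if (property in inputs) {
    return false; // already recorded, leave it alone
  }
  inputs[property] = outputs[property];
  resource.inputs = inputs;
  return true;
}

// Illustrative data modelled on the IAM role from this thread.
const exportedState: EditableState = {
  deployment: {
    resources: [
      {
        urn: "urn:pulumi:x::x::aws:iam/role:Role::role",
        inputs: {},
        outputs: { name: "role-a107c15" },
      },
    ],
  },
};

const wasChanged = restoreInputFromOutput(
  exportedState,
  "urn:pulumi:x::x::aws:iam/role:Role::role",
  "name",
);
console.log(wasChanged, JSON.stringify(exportedState.deployment.resources?.[0]?.inputs));
```

A real run would read the JSON written by `pulumi stack export --file state.json`, apply the function to one resource, write the result back, and load it with `pulumi stack import --file state.json`, then confirm with a refresh and preview before touching the remaining resources.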
b

big-account-56668

07/28/2022, 4:38 PM
Actually, the role diff is caused by the Cloud Run service name diff: the role is linked to the service, so when the name diffs it will also diff. While I agree it's a bit suspicious and confusing that the diff does not show the canonical name for both before and after, I'm fairly sure that one is a red herring. I can try to manually edit one.
e

echoing-dinner-19531

07/28/2022, 4:41 PM
Actually, the role diff is caused by the Cloud Run service name diff: the role is linked to the service, so when the name diffs it will also diff. While I agree it's a bit suspicious and confusing that the diff does not show the canonical name for both before and after, I'm fairly sure that one is a red herring.
It might not be. I've got to go in a sec (end of day here) but I'd suggest raising an issue about that at https://github.com/pulumi/pulumi-gcp
b

big-account-56668

07/28/2022, 5:26 PM
I've manually added back the name attribute on `gcp:pubsub/subscription:Subscription`, `gcp:cloudrun/service:Service`, `aws:iam/role:Role`, `aws:iam/rolePolicyAttachment:RolePolicyAttachment`, `gcp:compute/globalAddress:GlobalAddress`, `gcp:compute/managedSslCertificate:ManagedSslCertificate`, `gcp:compute/targetHttpProxy:TargetHttpProxy`, and `gcp:compute/globalForwardingRule:GlobalForwardingRule`.
I've also added back the statementId attribute on `aws:lambda/permission:Permission`, which similarly was suddenly missing from the inputs. These manual changes have removed all the unexpected diffs.
Note that we've never specified either of these attributes in our code.
e

echoing-dinner-19531

07/28/2022, 9:35 PM
Note that we've never specified either of these attributes in our code.
That's expected. Pulumi will use the logical resource name you assign to a resource as the basis for its default physical name, with a random suffix appended. What's odd is that your state file didn't have that recorded. For example, if I have some code like:
new aws.s3.Bucket("my-bucket");
I get a state file that looks like:
"urn": "urn:pulumi:dev::import_test::aws:s3/bucket:Bucket::my-bucket",
"id": "my-bucket-3292aff",
"type": "aws:s3/bucket:Bucket",
"inputs": {
	"__defaults": [
		"acl",
		"bucket",
		"forceDestroy"
	],
	"acl": "private",
	"bucket": "my-bucket-3292aff",
	"forceDestroy": false
},
"outputs": {
	"arn": "arn:aws:s3:::my-bucket-3292aff",
	"bucket": "my-bucket-3292aff",
	...
},
...
Note that "bucket" (the name property for a bucket object) is saved to my inputs.
So the oddity with your situation is how all those names managed to get removed from your state file. Glad to hear that after a manual fix up at least it seems to be behaving better again.