Every once in a while when I run pulumi up, the pr...
# aws
g
Every once in a while when I run pulumi up, the preview shows that it will replace a lot of resources due to a provider change. Sometimes this happens when I upgrade the AWS provider, and in the past it tended to happen more when the project had evolved a lot between updates. In some of these cases I can avoid replacement by breaking the changes to the project into smaller units and applying them in order instead of in one large update. But today it seems to be happening even though the provider version is the same. Some resources have additional changes, but many of the resources being replaced show only a provider diff, even though the provider version is unchanged. I'm attaching an excerpt from the preview details here. Why is this? And how can I avoid it?
    ++aws:s3/bucket:Bucket: (create-replacement)
        [id=xxxx-dev]
        [urn=urn:pulumi:dev::datalake-iac::aws:s3/bucket:Bucket::xxxx]
        [provider: urn:pulumi:dev::datalake-iac::pulumi:providers:aws::default_6_48_0::1b1bc0e4-8f0c-4ca7-98eb-e9f2d2c280ad => urn:pulumi:dev::datalake-iac::pulumi:providers:aws::default_6_48_0::output<string>]
        acl         : "private"
        bucket      : "xxxx-dev"
        forceDestroy: false
        tags        : {
            abadai:component     : "Artifacts"
            abadai:env           : "dev"
            abadai:ops:owner     : "Engineering"
            abadai:pulumi:project: "datalake-iac"
            abadai:pulumi:stack  : "dev"
            abadai:service       : "Datalake"
        }
        tagsAll     : {
            abadai:component     : "Artifacts"
            abadai:env           : "dev"
            abadai:ops:owner     : "Engineering"
            abadai:pulumi:project: "datalake-iac"
            abadai:pulumi:stack  : "dev"
            abadai:service       : "Datalake"
        }

    +-aws:s3/bucket:Bucket: (replace)
        [id=xxxx-dev]
        [urn=urn:pulumi:dev::datalake-iac::aws:s3/bucket:Bucket::xxxx]
        [provider: urn:pulumi:dev::datalake-iac::pulumi:providers:aws::default_6_48_0::1b1bc0e4-8f0c-4ca7-98eb-e9f2d2c280ad => urn:pulumi:dev::datalake-iac::pulumi:providers:aws::default_6_48_0::output<string>]
      - accelerationStatus               : ""
      - arn                              : "arn:aws:s3:::xxxx-dev"
      - bucketDomainName                 : "xxxx-dev.s3.amazonaws.com"
      - bucketRegionalDomainName         : "xxxx-dev.s3.amazonaws.com"
      - corsRules                        : []
      - grants                           : []
      - hostedZoneId                     : "REDACTED"
      - id                               : "xxxx-dev"
      - lifecycleRules                   : []
      - loggings                         : []
      - region                           : "us-east-1"
      - requestPayer                     : "BucketOwner"
      - serverSideEncryptionConfiguration: {
          - rule: {
              - applyServerSideEncryptionByDefault: {
                  - kmsMasterKeyId: ""
                  - sseAlgorithm  : "AES256"
                }
              - bucketKeyEnabled                  : false
            }
        }
      - tagsAll                          : [secret]
      + tagsAll                          : {
          + abadai:component     : "Artifacts"
          + abadai:env           : "dev"
          + abadai:ops:owner     : "Engineering"
          + abadai:pulumi:project: "datalake-iac"
          + abadai:pulumi:stack  : "dev"
          + abadai:service       : "Datalake"
        }
      - versioning                       : {
          - enabled  : true
          - mfaDelete: false
        }
m
This looks like you're renaming the provider based on an output value? https://github.com/pulumi/pulumi-aws/issues/2009 might have some pointers
g
This is the default provider, not an explicit custom provider. It has the same name ("default"). The part that pulumi is changing seems to be some UUID attached to the end of the provider automatically, not sure exactly what generates that and how it is tied to an output. Note that in theory, it may end up having the exact same value (if the output generates the same UUID), but I'm not sure how to verify that without actually running pulumi up and risking replacement (which I want to avoid).
So this keeps getting stranger and stranger:
1. The state of the stack contains two AWS default providers, one with version 6.47.0 and one with version 6.48.0. But the 6.47.0 provider is not referenced by any resource. Removing it from the state doesn't solve the problem.
2. I also tried downgrading pulumi versions and provider versions, to no avail.
3. Even stranger - the problem happens when I run pulumi up from my machine, both from powershell and from WSL (both on the same code). But when a colleague of mine does the same thing from their machine, with the same pulumi version and same code, on the same stack - the problem doesn't occur, they get a small set of updates and not a massive list of resources to replace.
4. Manually replacing all occurrences of the old provider ID with the new provider ID solves the problem, but this sounds dangerous and unstable. I'd prefer to understand why this is happening.
What could be causing my pulumi to think it needs to generate a new provider ID and conclude it must replace all resources, while my colleague's doesn't?
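To make the provider diff in the preview easier to read, here is a minimal Java sketch that splits the two provider references from the excerpt into their URN and ID parts. It assumes, as far as I understand the format, that a provider reference is the provider's URN followed by `::` and the provider resource's ID. The class and method names (`ProviderRefDiff`, `split`) are made up for illustration and are not part of any Pulumi SDK.

```java
final class ProviderRefDiff {
    /** Splits "<urn>::<id>" at the last "::" and returns {urn, id}. */
    static String[] split(String ref) {
        int i = ref.lastIndexOf("::");
        if (i < 0) {
            throw new IllegalArgumentException("not a provider reference: " + ref);
        }
        return new String[] { ref.substring(0, i), ref.substring(i + 2) };
    }

    public static void main(String[] args) {
        // Both strings are copied from the preview excerpt above.
        String before = "urn:pulumi:dev::datalake-iac::pulumi:providers:aws::default_6_48_0"
                + "::1b1bc0e4-8f0c-4ca7-98eb-e9f2d2c280ad";
        String after = "urn:pulumi:dev::datalake-iac::pulumi:providers:aws::default_6_48_0"
                + "::output<string>";

        String[] b = split(before);
        String[] a = split(after);
        System.out.println("same URN: " + b[0].equals(a[0])); // same URN: true
        System.out.println("old ID:   " + b[1]);
        System.out.println("new ID:   " + a[1]);
    }
}
```

The URN is identical on both sides and only the ID differs. The new ID shows up as `output<string>`, i.e. a value not yet known at preview time, which suggests the preview expects the default provider resource itself to change rather than just its version.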
m
"the problem happens when I run pulumi up from my machine, both from powershell and from WSL (both on the same code). But when a colleague of mine does the same thing from their machine, with the same pulumi version and same code, on the same stack - the problem doesn't occur, they get a small set of updates and not a massive list of resources to replace."
This smells like a problem with your local environment. What programming language are you using? Can you try to run in a fresh environment, e.g., a Docker container?
g
I'm using Java (sad, I know). I'm actually trying it from two environments on my machine - from Windows, and from WSL (Ubuntu). The two have separate copies of my code (from the same commit in git), separate pulumi installations (but same version), and separate build / dependency resolution. But both show the same issue on my machine, while my colleague's machine (Windows) doesn't seem to show the issue. Worth mentioning is that the last successful update of the stack happened from this colleague's machine. So perhaps the provider ID that was generated from their run is consistent with future runs from their environment, but not with mine. Not sure why the provider ID would depend on the environment and how, but this seems to hint that it may be the case here.
m
I'm not familiar with Pulumi's Java flavor. Are you sure that you have the same Pulumi CLI version?
g
As much as I can be sure - in pulumi cloud (which records this info) I checked the version that was used to update the stack (invoked by my colleague) and then to preview the stack on my colleague's machine. I then downgraded my own pulumi to match that version, ran pulumi version to verify, and ran a preview.
m
There has to be a difference between your colleague's and your environment if you see two different results when running the same program against the same state.
g
I agree. But nothing that seems to be captured by pulumi cloud. I'll run pulumi with verbose logging, maybe that will reveal something
m
I think it's more likely that you have different dependencies or something similar. Hence my suggestion to launch a Docker container and install everything in there, which will avoid things lurking in caches, someone accidentally using a different version due to a misconfigured PATH etc.
Or spin up a VM somewhere if that's easier for you. In my experience, there are a lot of things going on on developers' machines that are not always obvious even to the person owning them 😉
g
Thanks, I'll try that.
Ok, the culprit was found! After going to level 10 verbose logging, I noticed that on my machine the provider is not given a region parameter, while in the existing state it was. My guess is that this is because on my machine the region is part of the AWS CLI profile I use - the project / stack itself has no region configuration, nor is there any region environment variable. Pulumi still manages to connect to the correct region set as default in the profile, but it seems that at some point during the setup of the provider it doesn't have that info yet, and when comparing to the current state it finds a diff. Not sure yet if this diff is because my colleague's machine does explicitly state a region via an environment variable or command line argument, or if it's just stored as an input to the provider in the state but at a later stage. Anyway, when I explicitly specify a region via an environment variable or the stack config, things work as expected.
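In case it helps anyone else, here is a small Java sketch of a pre-flight check I could run before pulumi, so that a missing explicit region fails fast instead of showing up as a mass replacement. It only looks at the `AWS_REGION` and `AWS_DEFAULT_REGION` environment variables, which the AWS provider documents as sources for its region setting. It does not read the stack config or the AWS CLI profile, so treat it as a partial check. `RegionPreflight` and `explicitRegion` are hypothetical names, not Pulumi APIs.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class RegionPreflight {
    // Environment variables checked, in order of preference.
    static final List<String> REGION_VARS = List.of("AWS_REGION", "AWS_DEFAULT_REGION");

    /** Returns the first non-blank region found in the given environment, if any. */
    static Optional<String> explicitRegion(Map<String, String> env) {
        return REGION_VARS.stream()
                .map(env::get)
                .filter(v -> v != null && !v.isBlank())
                .findFirst();
    }

    public static void main(String[] args) {
        Optional<String> region = explicitRegion(System.getenv());
        if (region.isEmpty()) {
            System.err.println("No explicit AWS region in the environment; "
                    + "set AWS_REGION or configure aws:region in the stack config.");
            System.exit(1);
        }
        System.out.println("Explicit region: " + region.get());
    }
}
```

Taking the environment as a `Map` argument keeps the check easy to test without touching the real process environment.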
g
This is one of the reasons I try to avoid default providers in multitenant clients (aws, k8s, etc...). Thanks to a recent update, you can prevent the creation of default providers in Pulumi.yaml for the whole project instead of per stack. I also put a guard rail in our shared library that checks whether the project has default providers disabled.