# pulumi-deployments
d
ok… here’s another one. I have the following role assuming pattern locally (for… reasons):
role/aws-reserved/sso.amazonaws.com/AWSReservedSSO_AWSAdministratorAccess_xxxxxxxxx (the role I am logged into via the CLI)

=> assumes

role/service-role/AWSControlTowerAdmin (via Pulumi.<stack>.yaml config var aws:assumeRole.roleArn)

=> assumes

role/AWSControlTowerExecution (via `aws.Provider` in the Pulumi code)
Locally this works. When running via Pulumi Deployments it doesn’t seem to. The Deployment “configuration” seems to show the correct `aws:assumeRole` value but it doesn’t get assumed. It basically skips the middle assumption and uses the Deployment role to try to directly assume the 3rd role, `AWSControlTowerExecution`, and it fails. If anything I’d expect to see that `assumed-role/PulumiOIDC/pulumi` cannot assume `role/service-role/AWSControlTowerAdmin` or that `role/service-role/AWSControlTowerAdmin` cannot assume `role/AWSControlTowerExecution`, but I’m getting `assumed-role/PulumiOIDC/pulumi` cannot assume `role/AWSControlTowerExecution`. This makes me think that the deploy executor is not respecting the `Pulumi.<stack>.yaml` configuration…
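The reasoning about which error to expect can be sketched as a small simulation. This is a simplified model in plain Python, not real IAM evaluation: the `TRUST` map and the `first_failure` helper are hypothetical, and the trust relationships in the map are assumptions chosen so that every hop of the full chain is allowed.

```python
# Hypothetical trust map: role -> principals allowed to assume it.
# Role names mirror the thread; the trust relationships are assumptions.
TRUST = {
    "role/service-role/AWSControlTowerAdmin": {"assumed-role/PulumiOIDC/pulumi"},
    "role/AWSControlTowerExecution": {"role/service-role/AWSControlTowerAdmin"},
}


def first_failure(start, chain, trust=TRUST):
    """Walk the chain hop by hop and return the first denied (caller, target) pair."""
    caller = start
    for target in chain:
        if caller not in trust.get(target, set()):
            return (caller, target)
        caller = target  # simplification: the assumed role becomes the next caller
    return None


full_chain = [
    "role/service-role/AWSControlTowerAdmin",
    "role/AWSControlTowerExecution",
]
skipped_middle = ["role/AWSControlTowerExecution"]

print(first_failure("assumed-role/PulumiOIDC/pulumi", full_chain))
print(first_failure("assumed-role/PulumiOIDC/pulumi", skipped_middle))
```

In this model the full chain succeeds, while skipping the middle hop yields exactly the caller/target pair reported by the deployment, which is consistent with the middle assumption never being attempted.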
Two pieces of background for the pattern: The reason I’m leveraging these existing roles is that AWS Control Tower automatically sets them up in accounts created via the factory such that `AWSControlTowerAdmin` can assume `AWSControlTowerExecution` in every account and it already has the `AdministratorAccess` policy attached. This means that I don’t have to update/create roles/policies in new accounts when they get created. This particular Pulumi stack loops over all the accounts in the organization to set up some standard controls (like a password policy). For developer ease of use, I didn’t want people to have to assume the control tower admin role before running the stack. Rather, I just wanted people to use their SSO roles. That’s why I wanted to use `AWSReservedSSO_AWSAdministratorAccess_xxxxxxxxx` to assume `role/service-role/AWSControlTowerAdmin` in the management account to assume `role/AWSControlTowerExecution` in each account in the loop. The only way to do this was to leverage the Pulumi config and the default provider. Unfortunately there’s no way to do it in the code as you cannot use a Provider to create a Provider.
👀 any thoughts? Is a `Pulumi.yaml` setting like `aws:assumeRole.roleArn` something that is supposed to be used by the Pulumi operation during Pulumi Deployments?
r
I'm not certain about what's going on here, but perhaps I can provide some insight into deployments. There is nothing in Pulumi Deployments that would change or overwrite your project's Pulumi.yaml or Pulumi.<stack>.yaml.
d
Hmm I mean the UI is displaying the correct values from the Pulumi.yaml config. But the update does not seem to be running with the roleArn assumed
l
I believe the AWS OIDC integration sets some environment variables so that providers can automatically pick up credentials without you having to modify your program. This could explain the difference in provider configuration that you are seeing locally vs in the deployment run. From the deployment OIDC docs for AWS:
> With this configuration, each deployment of this stack will attempt to exchange the deployment’s OIDC token for AWS credentials using the specified IAM role prior to running any pre-commands or Pulumi operations. The fetched credentials are published in the `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_SESSION_TOKEN` environment variables. The raw OIDC token is also available for advanced scenarios in the `PULUMI_OIDC_TOKEN` environment variable and the `/mnt/pulumi/pulumi.oidc` file.
Might be helpful to share code on how you are configuring the provider and assuming the roles.
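One way to compare the local environment with the deployment environment is to report which credential-related variables are set in each. The following is a minimal sketch using only the standard library; the `summarize_aws_env` helper is hypothetical, and the variable list is an assumption that combines the names from the quoted docs with `AWS_PROFILE`, `AWS_REGION`, and `AWS_DEFAULT_REGION`.

```python
import os

# Names from the quoted docs, plus profile/region settings that often
# differ between a laptop and a deployment runner (an assumption).
AWS_ENV_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "AWS_PROFILE",
    "AWS_REGION",
    "AWS_DEFAULT_REGION",
    "PULUMI_OIDC_TOKEN",
)


def summarize_aws_env(environ=None):
    """Report which variables are set, without printing their values."""
    environ = os.environ if environ is None else environ
    return {name: ("set" if environ.get(name) else "unset") for name in AWS_ENV_VARS}


if __name__ == "__main__":
    for name, state in summarize_aws_env().items():
        print(f"{name}: {state}")
```

Running the same script locally and as a deployment pre-command should make any difference in the credential environment visible without leaking secrets into the logs.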
d
This is my `Pulumi.prod.yaml`:
config:
  aws:region: us-east-1
  aws:allowedAccountIds: [xxx]
  aws:assumeRole:
    roleArn: arn:aws:iam::xxx:role/service-role/AWSControlTowerAdmin
This is my `Pulumi.yaml`:
name: admin
runtime:
  name: python
description: Manage resources related to our AWS account structure
And this is the Pulumi code in `__main__.py`:
import pulumi
import pulumi_aws as aws

MANAGEMENT_ACCOUNT_ID = "xxx"
PASSWORD_PARAMS = {
    "require_uppercase_characters": True,
    "require_lowercase_characters": True,
    "require_symbols": True,
    "require_numbers": True,
    "minimum_password_length": 20,
}

organization = aws.organizations.get_organization()

for account in organization.accounts:
    # Skip deactivated accounts; the management account gets its own provider below
    if account.status != "ACTIVE":
        continue

    if account.id == MANAGEMENT_ACCOUNT_ID:
        # See aws-admin-bootstrap-stack where the ManagementAccountAdmin role is created
        provider = aws.Provider(
            "Provider: Mighty Management Account",
            region="us-east-1",
            assume_role={
                "roleArn": "arn:aws:iam::xxx:role/ManagementAccountAdmin",
            },
        )

    else:
        role_arn = pulumi.Output.format(
            "arn:aws:iam::{0}:role/AWSControlTowerExecution", account.id
        )

        # Create Provider to assume role
        provider = aws.Provider(
            f"Provider: {account.name}",
            region="us-east-1",
            assume_role={
                "roleArn": role_arn,
            },
        )

    password_policy = aws.iam.AccountPasswordPolicy(
        f"AccountPasswordPolicy: {account.name}",
        **PASSWORD_PARAMS,
        opts=pulumi.ResourceOptions(provider=provider),
    )
Hmm using `aws-sso-util` and `awsume` locally… when I “awsume” the SSO role before I run `pulumi up`, it does set `AWS_*` env vars:
env | grep AWS

AWSUME_COMMAND=Mighty-Management-Account.AWSAdministratorAccess
AWS_ACCESS_KEY_ID=XXX
AWS_SECRET_ACCESS_KEY=YYY
AWS_SESSION_TOKEN=ZZZ
AWS_REGION=us-east-1
AWS_DEFAULT_REGION=us-east-1
AWSUME_PROFILE=Mighty-Management-Account.AWSAdministratorAccess
AWSUME_EXPIRATION=2023-03-17T20:23:39
Locally I just run `poetry run pulumi up` and it runs correctly.
l
> It basically skips the middle assumption and uses the Deployment role to try to directly assume the 3rd role, AWSControlTowerExecution and it fails.
I don't see anywhere in your code where there is an explicit attempt to assume `role/service-role/AWSControlTowerAdmin`.
Is it possible that you are using an account/role locally that has broader permissions than the role you've assigned to the OIDC identity provider? Put another way: is it possible that locally you are directly assuming `AWSControlTowerExecution` and that your starting role happens to have the permissions to do this? From your env it looks like you are using the administrator access profile locally, so it would not surprise me if it has greater permissions than the role you assigned to the OIDC provider.
Can you try creating two explicit providers? You'll need to create the first provider specifying assume role on `AWSControlTowerAdmin` and then specify that provider when creating the second provider that assumes `AWSControlTowerExecution`.
d
So locally I am definitely assuming my SSO role, and the way the `role/service-role/AWSControlTowerAdmin` role gets assumed is via the `Pulumi.prod.yaml`.
I once tried to do it explicitly via providers in the code but I got an error saying that a provider does not accept another provider as a parameter
I wish I could do it via code
TypeError: Explicit providers may not be used with provider resources
I did just realize that this might not be necessary anymore, as the trust relationship of `role/AWSControlTowerExecution` actually trusts the root of the management account, meaning that I don’t have to assume an intermediary role…
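A check like this can be scripted against a role's trust policy document. The sketch below is an illustration only: the `trusts_account_root` helper is hypothetical, and `TRUST_POLICY` is an assumed example shaped like a typical trust policy, with a placeholder account ID that is not taken from the thread.

```python
import json

# Assumed example trust policy; the account ID is a placeholder.
TRUST_POLICY = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::111111111111:root"},
      "Action": "sts:AssumeRole"
    }
  ]
}
""")


def as_list(value):
    """IAM policy fields may hold a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def trusts_account_root(policy, account_id):
    """Return True if an Allow statement lets the account's root principal assume the role."""
    root_arn = f"arn:aws:iam::{account_id}:root"
    for statement in policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        if "sts:AssumeRole" not in as_list(statement.get("Action", [])):
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # e.g. a bare "*" principal; out of scope for this sketch
        if root_arn in as_list(principal.get("AWS", [])):
            return True
    return False


print(trusts_account_root(TRUST_POLICY, "111111111111"))
```

When a trust policy names the account root, any principal in that account can assume the role as long as its own IAM policy also allows `sts:AssumeRole` on it, which is why the intermediary role becomes unnecessary.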
l
Ah, indeed I think I'm leading you astray with that suggestion. But I believe this issue may explain some of the behavior you are seeing: https://github.com/pulumi/pulumi/issues/12176
I'd encourage you to share notes on that issue and give it a +1. But it sounds like you have a workaround for the time being?
d
Hmm I’m not sure if that issue applies here… it does make sense and I +1ed it, but I think the issue here is that the project/stack configuration from the Pulumi.yaml files is being ignored during the Pulumi Deployments…? But yes, I think I do have a workaround, but need to see if I can implement it
(thank you for looking into this btw!!)
l
> I think the issue here is that the project/stack configuration from the Pulumi.yaml files is being ignored during the Pulumi Deployments…?
Pulumi Deployments still uses the CLI when running an update. My guess is that a different set of environment variables gets set during a deployment than what you have set locally. It is possible that the different env vars change how precedence is resolved between explicit configuration, stack configuration, and environment variables, as mentioned in the issue.
d
ahh ok—I’m almost done with reorganizing all the roles and will be testing out the new set up in a few min 🤞🏽
Annnnd it worked! So in summary, I was able to get Pulumi Deployments set up with a single OIDC provider and role in the AWS Control Tower management account that is able to assume the `AWSControlTowerExecution` role within the organization’s sub-accounts seamlessly (both via explicit provider created in code and via `config.aws:assumeRole.roleArn` specified in Pulumi.yaml files)!
l
woohoo glad to hear you got it working! Was there any trick in particular that was required? Would be useful to know for helping folks in the future.
b
i am struggling to get this to work, was there a trick?
Constantly getting
error: could not validate provider configuration: 1 error occurred:
    	* Invalid or unknown key
when i provide something like so
config:
  aws:region: us-east-1
  aws:assumeRole.roleArn:  *************
d
config:
  aws:assumeRole:
    roleArn: xxx
  aws:region: us-east-1
I’ve never seen that dot notation in yaml…. are you sure that’s valid? Might be worth testing
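For context, the dotted form looks like the path syntax accepted by `pulumi config set --path`, which, as far as I understand it, writes the nested form into the stack file rather than a literal dotted key. The sketch below models that expansion in plain Python; `set_path` is a hypothetical helper, and it only handles simple dot-separated paths (no brackets or quoting).

```python
def set_path(config, dotted_key, value):
    """Expand 'ns:outer.inner' into config['ns:outer']['inner'] = value."""
    top, *rest = dotted_key.split(".")
    if not rest:
        config[top] = value
        return config
    node = config.setdefault(top, {})
    for part in rest[:-1]:
        node = node.setdefault(part, {})
    node[rest[-1]] = value
    return config


config = {"aws:region": "us-east-1"}
set_path(config, "aws:assumeRole.roleArn", "xxx")
print(config)
# {'aws:region': 'us-east-1', 'aws:assumeRole': {'roleArn': 'xxx'}}
```

Under that assumption, a literal `aws:assumeRole.roleArn:` key written by hand into the file would be read as a single unknown key, which would match the "Invalid or unknown key" error.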
b
yeah it’s how they talk about it here
When I nest the config as you have I get this error
unexpected configuration type 'map[string]interface {}': valid types are string, List<string>, number, List<number>, integer, List<integer>, boolean, List<number>
d
hmm could that be related to the code?
what line is it coming from?
b
doesn’t say, assuming it is coming from the provider
what language you using? TS?
d
Python here
b
d
so it sounds like it’s a separate item in the config that’s blowing up now?
b
No, it’s the config. The style needed to configure assumeRoles doesn’t work properly when using the yaml runtime
d
Oh you’re using `yaml` for the whole shebang rather than TS/python/etc.?
b
yeah in this stack
d
ahh ok, that’s a bummer about that bug—sorry!!
b
Well thanks for helping me to at least arrive at what is wrong
l
@brash-gigabyte-81569 sorry for the trouble here! I followed up with the team and we have a workaround for you here as we work on a fix: https://github.com/pulumi/pulumi-yaml/issues/434#issuecomment-1519177770
b
No worries, I got it working with profiles and setting the profile flag for my case. But I can see using this workaround when i hit another case where the nesting is an issue