# aws
a
hello friends - I have stumbled upon an issue in the pulumi-aws driver we rely on. The symptoms match a number of open and closed issues on the repo, so I hope the question I have makes sense. When using SSO, pulumi preview and pulumi up both give the same error, consistently:
Error: failed to refresh cached credentials, the SSO session has expired or is invalid: open /Users/[redacted]/.aws/sso/cache/1f00d08c5e62b5eaa7523500a301bc70997c42d9.json: no such file or directory
It's always the same hash, even across days, and no matter how many times I've renewed my AWS credentials. From digging through the aws sdk and the pulumi-aws repository, my understanding is that at some point pulumi duplicated whatever hashing approach (sha-1?) the aws sdk was using at the time to generate cache file names, so that it could pull the latest cache file. Unfortunately, aws's approach has changed over time, and now pulumi-aws generates an erroneous hash, which sends it looking for the same non-existent file.
The work-around I've found from other folks reporting this issue is to symlink the file pulumi expects to the latest cache file. This is irritating because I first have to run pulumi preview and get the error in order to know which hash it's trying to use for the file name. I would like to add a step to the shell script that refreshes all my credentials each day, so that it symlinks the name pulumi expects to the latest cache file after I've refreshed the login. The catch is, I can't figure out where exactly you all are passing things from my environment into the hash function to determine that file name. Can someone here help me determine the approach so we can get a work-around in place? Thank you in advance for reading the long-winded explanation 😅
b
@astonishing-journalist-77684 I think opening an issue on pulumi-aws with links to the upstream issues is the best bet
a
hi Jaxx - there are several open (and closed) issues reporting this problem.
I figured instead of just opening another one, I could be proactive and automate a local work-around
and while I could run pulumi preview and wait for it to dump the error out and parse that looking for the missing file name, I thought it would be easier if I just asked how that hash is generated
b
the underlying AWS provider uses the aws go sdk to authenticate, i do not believe we generate any hashes or manipulate aws credentials in any way
it just follows the credential chain
what led you to believe we’re changing the credential file?
a
it's reading the cached credential file that aws stores after an sso login
the fix I mentioned is to symlink what pulumi thinks the file should be to the actual latest cache file
but aws is not giving pulumi that hash
b
can you link me to some of the issues you’re referring to?
"it’s reading the cached credential file that aws stores after an sso login"
if it’s doing this, again, the issue is related directly to the aws Go sdk. there’s nothing special in the provider that does this, to my knowledge
also: are you using SSO sessions? how is your SSO profile configured
a
sure thing, this is the latest: https://github.com/pulumi/pulumi-aws/issues/3023
b
are you using an S3 backend?
a
the aws config is set to use an sso session, yes
I'll have to find out if we are, I'm in a new role and as I was trying to get pulumi working, I ran into this. Anyone on the team who moved to the latest version is also running into the issue now but were not before
b
at what stage does the error throw? during pulumi login or pulumi up?
a
pulumi preview or pulumi up
basically, when it tries to read from aws
with the symlink in place, it will work for a day
but when I have to re-auth with AWS, a new cache file is created and then I have to delete and recreate the symlink pointing to a new file
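For anyone scripting this stop-gap, here is a minimal Python sketch of the re-linking step described above. The function name link_latest_cache is made up for illustration, the expected file name is whatever the pulumi error reports, and skipping files whose names start with botocore-client-id (client registration files, as far as I know) is an assumption about what else lives in the cache directory.

```python
from pathlib import Path
from typing import Optional


def link_latest_cache(cache_dir: Path, expected_name: str) -> Optional[Path]:
    """Point cache_dir/expected_name at the newest real cache file.

    Returns the file the link targets, or None if there is nothing to link to.
    """
    link = cache_dir / expected_name
    if link.exists() and not link.is_symlink():
        # A real cache file already has the expected name; leave it alone.
        return link

    # Only consider real files: skip symlinks (including our own link) and,
    # as an assumption, the client registration files.
    candidates = [
        p for p in cache_dir.glob("*.json")
        if not p.is_symlink() and not p.name.startswith("botocore-client-id")
    ]
    if not candidates:
        return None

    latest = max(candidates, key=lambda p: p.stat().st_mtime)
    if link.is_symlink():
        link.unlink()  # drop yesterday's link before re-creating it
    link.symlink_to(latest.name)  # relative link inside the cache directory
    return latest
```

Called right after aws sso login, for example as link_latest_cache(Path.home() / ".aws" / "sso" / "cache", "<name from the error>.json"), this replaces the delete-and-recreate step by hand. It only papers over the mismatch and does nothing about the underlying cause.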
b
if you look in your aws config, does the sso_start_url have a # in it?
a
yes
b
remove it, make sure the AWS CLI is at the latest version, and I would bet money that solves the issue 🙂
a
I'll give it a try now, seems like an easy enough thing 🙂
b
oh, make sure you reauth as well, once you’ve modified your start url. You might want to clear your sso cache
as an aside, I wrote a tool to automate the retrieval of temp credentials that I use every day: https://github.com/jaxxstorm/aws-sso-creds
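On why a trailing # can matter: as far as I know, the AWS SDKs name the SSO token cache file after the SHA-1 hex digest of the sso-session name when one is set, and of the sso_start_url otherwise. The 40-character hex name in the error above is at least consistent with that. The sketch below is an illustration under that assumption, not code taken from the provider or the SDK, and sso_cache_file_name is a made-up name.

```python
import hashlib


def sso_cache_file_name(key: str) -> str:
    """Return "<sha1 hex of key>.json".

    `key` is assumed to be the sso-session name if the profile uses one,
    otherwise the sso_start_url exactly as written in ~/.aws/config.
    """
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return f"{digest}.json"


if __name__ == "__main__":
    # The same URL with and without the trailing "#" hashes to different
    # file names, so two tools that disagree on the exact string will look
    # for different files.
    for key in (
        "https://lbrlabs.awsapps.com/start",
        "https://lbrlabs.awsapps.com/start#",
        "personal",  # an sso-session name
    ):
        print(key, "->", sso_cache_file_name(key))
```

If the assumption holds, comparing the printed names against the contents of ~/.aws/sso/cache shows which string each tool actually hashed.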
a
removing the hashmark from the end of that url broke the aws sso login
b
does it look like this?
sso_start_url = https://lbrlabs.awsapps.com/start
a
yes, after I removed the # at the end
b
what error do you get?
a
Missing the following required SSO configuration values: sso_start_url, sso_region. To make sure this profile is properly configured to use SSO, please run: aws configure sso
b
what version of the aws cli are you using?
a
aws-cli/2.13.37 Python/3.11.6 Darwin/23.1.0
b
can you share your redacted .aws/config file?
a
[default]
region = [region]
output = json
[profile myprofile]
sso_session = myprofile
sso_account_id = [sso-account-id]
sso_role_name = [sso-role]
sso_region = [sso-region]
sso_start_url = [sso-start-url]
sso_registration_scopes = sso:account:access
region = [region]
output = json
[sso-session myprofile]
sso_start_url = [sso-start-url]
sso_region = [sso-region]
sso_registration_scopes = sso:account:access
b
that doesn’t look correct to me, that looks like a cross between a legacy profile and an sso session profile. Here’s how mine looks, and it works:
[sso-session personal]
sso_region = us-west-2
sso_start_url = https://lbrlabs.awsapps.com/start

[profile personal-development]
sso_session = personal
output = json
region = us-west-2
sso_account_id = <x>
sso_role_name = AWSAdministratorAccess

[profile personal-management]
sso_session = personal
output = json
region = us-west-2
sso_account_id = <x>
sso_role_name = AWSAdministratorAccess
I’d choose to use sso-session and configure it correctly, or use a legacy profile (where you copy the start url and region to every profile)
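To make the distinction concrete, here is a rough Python sketch that labels a profile as sso-session style, legacy style, or a mix of the two. It follows the split drawn in this thread, not an official AWS rule set, and classify_sso_profile and its labels are hypothetical.

```python
import configparser


def classify_sso_profile(config_text: str, profile: str) -> str:
    """Label a profile from an ~/.aws/config-style text.

    Returns "sso-session", "legacy", "mixed", "not-sso" or "missing".
    """
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(config_text)

    # Named profiles live in "[profile NAME]"; the default one in "[default]".
    section = "default" if profile == "default" else f"profile {profile}"
    if not parser.has_section(section):
        return "missing"

    keys = set(parser[section].keys())
    uses_session = "sso_session" in keys
    has_inline_sso = bool(keys & {"sso_start_url", "sso_region"})

    if uses_session and has_inline_sso:
        return "mixed"  # the cross between the two styles
    if uses_session:
        return "sso-session"
    if has_inline_sso:
        return "legacy"
    return "not-sso"
```

Run against the first config shared above, the myprofile profile would come back as "mixed", while personal-development in the working example comes back as "sso-session".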
a
ok, I'll see if I can make edits to get this to work with your format and report back
ok, this is the config now:
[sso-session dev]
sso_region = [sso-region]
sso_start_url = [sso-start-url]

[profile profile-name]
sso_session = dev
output = json
region = [profile-region]
sso_account_id = [sso-account-id]
sso_role_name = [sso-role-name]
the aws sso login works now, but, I get a different error from pulumi:
aws:kms:Key (cluster-key):
    error: unable to validate AWS credentials.
    Details: loading configuration: profile "[profile-name]" is configured to use SSO but is missing required configuration: sso_region, sso_start_url

    Make sure you have:

     	 • Set your AWS region, e.g. `pulumi config set aws:region us-west-2`
     	 • Configured your AWS credentials as per https://pulumi.io/install/aws.html
     	 You can also set these via cli using `aws configure`.

  pulumi:pulumi:Stack ([stack-name]):
    error: Error: invocation of aws:index/getPartition:getPartition returned an error: unable to validate AWS credentials - see https://pulumi.io/install/aws.html for details on configuration
        at Object.callback (/Users/alex/git/rownd/infrastructure/node_modules/@pulumi/runtime/invoke.ts:159:33)
        at Object.onReceiveStatus (/Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/client.ts:338:26)
        at Object.onReceiveStatus (/Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/client-interceptors.ts:426:34)
        at Object.onReceiveStatus (/Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/client-interceptors.ts:389:48)
        at /Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/call-stream.ts:276:24
        at processTicksAndRejections (node:internal/process/task_queues:77:11)
    error: Error: invocation of aws:index/getRegion:getRegion returned an error: unable to validate AWS credentials - see https://pulumi.io/install/aws.html for details on configuration
        at Object.callback (/Users/alex/git/rownd/infrastructure/node_modules/@pulumi/runtime/invoke.ts:159:33)
        at Object.onReceiveStatus (/Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/client.ts:338:26)
        at Object.onReceiveStatus (/Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/client-interceptors.ts:426:34)
        at Object.onReceiveStatus (/Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/client-interceptors.ts:389:48)
        at /Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/call-stream.ts:276:24
        at processTicksAndRejections (node:internal/process/task_queues:77:11)
b
which version of the aws provider are you using?
a
I got stuck in this with the more sparse config the other day and we added the fields that my colleagues had until we got it working (using the symlink work-around mentioned before)
by that, you mean on the server?
b
no, the aws provider is configured in your stack. you can retrieve it by running pulumi about
I suspect you’re on an outdated version of the pulumi aws provider with an embedded aws sdk go version that doesn’t actually support sso-session
a
Dependencies:
NAME                VERSION
@pulumi/aws-native  0.40.2
@pulumi/kubernetes  3.22.1
@pulumi/pulumi      3.46.1
@pulumi/cloudflare  4.12.1
@pulumi/eks         0.42.7
@pulumi/gitlab      4.9.0
@types/node         14.18.33
simple-git          2.48.0
@aws-cdk/aws-ec2    1.180.0
@pulumi/aws         5.21.0
@pulumi/awsx        0.40.1
b
yep, that’s quite old. you’ll need to either:
• update your aws provider to a version whose embedded aws sdk for go supports sso-session
• use a legacy profile setup, where you set sso_start_url and sso_region in every profile
a
right, which is what I had (bullet 2)
but then we're back to the cache file bug
b
you were still setting the sso_session property:
[profile myprofile]
sso_session = myprofile # here
sso_account_id = [sso-account-id]
sso_role_name = [sso-role]
sso_region = [sso-region]
sso_start_url = [sso-start-url]
sso_registration_scopes = sso:account:access
region = [region]
output = json
a
so - and this is where I need to read more docs - to update the dependencies, that would involve updating our stack definition to get to the latest?
b
yep, npm install @pulumi/aws@latest
The long and short of all this really is that there are bugs in sso-session and sso profile management and it’s finicky. as i mentioned before, this is the main upstream issue: https://github.com/aws/aws-sdk-go-v2/issues/2241
a
so, I ran the update but I am still getting the same error 😕
Diagnostics:
  pulumi:pulumi:Stack (infra-dev-us-east-2):
    error: Error: invocation of aws:index/getPartition:getPartition returned an error: unable to validate AWS credentials - see https://pulumi.io/install/aws.html for details on configuration
        at Object.callback (/Users/alex/git/rownd/infrastructure/node_modules/@pulumi/runtime/invoke.ts:159:33)
        at Object.onReceiveStatus (/Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/client.ts:338:26)
        at Object.onReceiveStatus (/Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/client-interceptors.ts:426:34)
        at Object.onReceiveStatus (/Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/client-interceptors.ts:389:48)
        at /Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/call-stream.ts:276:24
        at processTicksAndRejections (node:internal/process/task_queues:77:11)
    error: Error: invocation of aws:index/getRegion:getRegion returned an error: unable to validate AWS credentials - see https://pulumi.io/install/aws.html for details on configuration
        at Object.callback (/Users/alex/git/rownd/infrastructure/node_modules/@pulumi/runtime/invoke.ts:159:33)
        at Object.onReceiveStatus (/Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/client.ts:338:26)
        at Object.onReceiveStatus (/Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/client-interceptors.ts:426:34)
        at Object.onReceiveStatus (/Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/client-interceptors.ts:389:48)
        at /Users/alex/git/rownd/infrastructure/node_modules/@grpc/grpc-js/src/call-stream.ts:276:24
        at processTicksAndRejections (node:internal/process/task_queues:77:11)

  aws:iam:Role (dev-us-east-2-eksRole-role):
    error: unable to validate AWS credentials - see https://pulumi.io/install/aws.html for details on configuration
b
providers are stored in state as well, so you need to run a successful pulumi up after running npm install @pulumi/aws@latest
use https://github.com/jaxxstorm/aws-sso-creds to get some temporary credentials to update the stack provider
a
how would I get to that now that it is broken in a way I can't work around?
b
or alternatively you can configure your aws config like this:
[default]
region = [region]
output = json
[profile myprofile]
sso_account_id = [sso-account-id]
sso_role_name = [sso-role]
sso_region = [sso-region]
sso_start_url = [sso-start-url]
sso_registration_scopes = sso:account:access
region = [region]
output = json
and it’ll work
a
let me try reverting my config to that
that configuration no longer works
b
oh? does aws sts get-caller-identity work?
a
it does not
b
you likely need to remove sso_registration_scopes
that’s a session-specific property
a
I did that. My config has the exact format and keys that you shared just now
b
mind sharing it?
a
I have tried it with and without the # at the end of the start url
[default]
region = [region]
output = json

[profile profile-name]
sso_account_id = [sso-account-number]
sso_role_name = [sso-role-name]
sso_region = [sso-region]
sso_start_url = [sso-start-url]
sso_registration_scopes = [sso-scope]
region = [region]
output = json
b
sso_registration_scopes = [sso-scope]
you still have this property in there. Remove it, and reauth
a
oh, the one you shared had it, so I thought I needed it
b
yes that was my mistake
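To catch leftovers like that before re-authenticating, a small lint along these lines might help. It is a sketch based only on what came up in this thread: the required keys are the ones used in the legacy example above, the session-only keys are the two discussed here, and lint_legacy_profile is a hypothetical helper, not an AWS tool.

```python
from typing import Dict, List

# Keys a legacy (non sso-session) profile carries itself, per the example above.
LEGACY_REQUIRED = ("sso_start_url", "sso_region", "sso_account_id", "sso_role_name")
# Keys that, per this thread, only belong in an sso-session setup.
SESSION_KEYS = ("sso_session", "sso_registration_scopes")


def lint_legacy_profile(settings: Dict[str, str]) -> List[str]:
    """Return human-readable problems for a legacy SSO profile (empty if none)."""
    problems = [f"missing {key}" for key in LEGACY_REQUIRED if not settings.get(key)]
    problems += [
        f"remove {key} (sso-session setups only)"
        for key in SESSION_KEYS
        if key in settings
    ]
    return problems
```

Passing in the keys of the profile shared just above would report the single remaining sso_registration_scopes entry and nothing else.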
a
I removed all the cache files, removed the sso_registration_scopes, re-authorized through aws sso login --profile profile-name, then ran the sts get caller identity command and it is still broken
I have tried it with both the # at the end of the url and without it
ok, I left off --profile on the sts call, my mistake
sorry
b
there must be a misconfiguration in your profile, without being able to know all the required values, I can’t help much
ah, yes
I recommend using something like https://github.com/johnnyopao/awsp to switch profiles
a
ah, there it goes. I got a working preview command 👍
b
the cache file problem will now likely be gone, as well
a
I'll get this pushed shortly, I'm just writing everything up for the rest of my team. Thank you for your help, Lee. I really appreciate your time and patience helping me sort through this 🎉