
sparse-optician-70334

09/12/2023, 2:01 PM
How can I attach further policies to a pre-existing IAM role (some need to reference specific ARNs or buckets/resources created in Pulumi)? I thought about using Jinja2, but this gets to be a big mess with apply and Output[T]

breezy-caravan-29021

09/12/2023, 2:37 PM
can't you just create additional IAM Policies + RolePolicyAttachments?
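roughly like this (a minimal sketch - the names and the policy body are placeholders, and existing_role stands in for the role you already have):
import json

import pulumi_aws as aws

# Hypothetical: look up the pre-existing role by its name (IAM role IDs are the name).
existing_role = aws.iam.Role.get("existing-role", "my-existing-role-name")

# A standalone managed policy; the statement here is just a placeholder.
extra_policy = aws.iam.Policy(
    "extra-policy",
    policy=json.dumps({
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["s3:ListBucket"], "Resource": "*"}
        ],
    }),
)

# Attach the new policy to the existing role.
aws.iam.RolePolicyAttachment(
    "extra-attachment",
    role=existing_role.name,
    policy_arn=extra_policy.arn,
)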

sparse-optician-70334

09/12/2023, 2:49 PM
Good point. The existing ones are in fact already managed by Pulumi, so I guess adding more is the proper way to get the Output[T]-dependent policy JSON to populate based on previously/upstream Pulumi-created roles
@billowy-army-68599 you mentioned that you have a lot of regular #aws knowledge - may I ask you for help on this one around Outputs? It is about the Databricks topic: for an IAM role that Pulumi is creating I want to
from pulumi_aws_native import s3


my_bucket = s3.Bucket("my_bucket")

# Jinja2 template for the policy I want to attach
# ({{ role_arn }} and {{ bucket_name }} are filled in at render time):
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "{{ role_arn }}"
            },
            "Action": [
                "s3:ListBucket",
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:PutBucketOwnerControl"
            ],
            "Resource": [
                "arn:aws:s3:::{{ bucket_name }}",
                "arn:aws:s3:::{{ bucket_name }}/*"
            ]
        }
    ]
}
1. attach the following policy https://docs.databricks.com/en/administration-guide/account-settings-e2/credentials.html#option-1-default-deployment-policy (I am already doing this)
2. attach a second inline policy to the Pulumi-managed role to give S3 access to this particular bucket
I am currently exploring Jinja and JSON templates, but this is somehow getting me stuck in apply/Output[T] hell.

billowy-army-68599

09/12/2023, 3:03 PM
this is a common apply problem, it’s detailed in this doc here: https://www.pulumi.com/docs/concepts/inputs-outputs/#outputs-and-json
i would highly recommend reading that doc in its entirety and this blog post as well https://leebriggs.co.uk/blog/2021/05/09/pulumi-apply
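the short version: build the JSON with pulumi.Output.json_dumps and pass the Outputs in directly - no templating. A minimal sketch (using pulumi_aws classic; the role here is just a stand-in for your existing Pulumi-managed role, and the assume-role principal is a placeholder):
import json

import pulumi
import pulumi_aws as aws

bucket = aws.s3.Bucket("my-bucket")

# Stand-in for the role your program already manages elsewhere.
role = aws.iam.Role(
    "databricks-cross-account",
    assume_role_policy=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},  # placeholder principal
            "Action": "sts:AssumeRole",
        }],
    }),
)

# Output.json_dumps resolves any nested Outputs (like bucket.arn)
# before serializing, so no Jinja templating is needed.
policy_json = pulumi.Output.json_dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "s3:ListBucket",
            "s3:GetObject",
            "s3:PutObject",
            "s3:DeleteObject",
        ],
        "Resource": [
            bucket.arn,
            bucket.arn.apply(lambda arn: f"{arn}/*"),
        ],
    }],
})

# Attach it as an inline policy on the role.
aws.iam.RolePolicy("bucket-access", role=role.name, policy=policy_json)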

sparse-optician-70334

09/12/2023, 3:05 PM
I read the docs already and am still stuck. Let me read the blog post. If it's OK with you, I'll send you a gist afterwards if I'm still stuck

billowy-army-68599

09/12/2023, 3:07 PM
if you want to share the s3 bucket and the policy you’d like to attach, I can fix it for you

sparse-optician-70334

09/12/2023, 3:15 PM
Many thanks! Here you go: https://gist.github.com/geoHeil/ffa54ed441590f62b4c457ca64b4fc7b If it is easier I can also set up a Zoom call. Thanks for offering to take a look.

billowy-army-68599

09/12/2023, 3:20 PM
You really don’t need to mess around with jinja templates

sparse-optician-70334

09/12/2023, 3:22 PM
I would love not to. I am all ears for a better way. How?

billowy-army-68599

09/12/2023, 3:24 PM
give me a few hours and I’ll put a full example together

sparse-optician-70334

09/12/2023, 3:24 PM
many thanks
I was able to feed this policy (rendered via the Jinja template):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:PutBucketOwnerControl"
      ],
      "Resource": [
        "arn:aws:s3:::{{ bucket_name }}",
        "arn:aws:s3:::{{ bucket_name }}/*"
      ]
    }
  ]
}
I am still very curious to learn the better way. Ideally, I could also figure out how to fix: cannot create mws workspaces: MALFORMED_REQUEST: Failed storage configuration validation checks: List,Put,PutWithBucketOwnerFullControl,Delete. I had hoped that feeding the policy would solve this as well.

billowy-army-68599

09/12/2023, 5:56 PM
@sparse-optician-70334 this is a full example of how to create a Databricks workspace in AWS: https://github.com/jaxxstorm/pulumi-examples/tree/main/python/aws/databricks There are helper methods in the databricks provider that ease the creation of the bucket policy and cross-account role. I am happy to extend the example with explicit examples of how to create these yourself if you wish, but copying this code should get you up and running with Databricks. Note: the AWS VPC will cost money because it provisions a NAT Gateway
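roughly, the bucket-policy helper looks like this (a sketch; resource names are mine, and I'm using the _output variant of the data source so the bucket-name Output resolves cleanly):
import pulumi_aws as aws
import pulumi_databricks as databricks

root_bucket = aws.s3.Bucket("databricks-root")

# Data-source helper that renders the bucket policy Databricks expects,
# with the bucket name already interpolated -- no templating required.
bucket_policy = databricks.get_aws_bucket_policy_output(bucket=root_bucket.bucket)

# Attach the rendered policy to the bucket itself.
aws.s3.BucketPolicy(
    "databricks-root-policy",
    bucket=root_bucket.id,
    policy=bucket_policy.json,
)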

sparse-optician-70334

09/12/2023, 8:00 PM
thanks
I will check this out in detail tomorrow
Do I read correctly that in principle https://github.com/jaxxstorm/pulumi-examples/blob/main/python/aws/databricks/__main__.py#L72 databricks.get_aws_bucket_policy(bucket=root_bucket.bucket) is the missing piece?

billowy-army-68599

09/12/2023, 8:19 PM
that creates a policy document to attach to an S3 bucket, yes

sparse-optician-70334

09/13/2023, 7:35 AM
The wrapper:
assume_role_policy = databricks.get_aws_assume_role_policy(external_id=account_id)
is super convenient. In cases where such a helper is not available, would you still suggest not falling back to Jinja? What would the approach look like there?
Adding the policy on the bucket with the function, instead of granting the user permission via the policy, seems to solve the issue - many thanks
However, when trying to add a user, Pulumi fails with: * cannot create user: HTTP method POST is not supported by this URL for user = User('user', user_name='user@mail.com', active=True, allow_cluster_create=True, opts=pulumi.ResourceOptions(depends_on=[workspace]))
Can it be that https://accounts.cloud.databricks.com is the wrong URL here? I.e. that the workspace URL would be required instead? But that would not make sense, as this is an account-level operation

billowy-army-68599

09/13/2023, 1:41 PM
I unfortunately need to move my focus to another customer's issues; generally, this level of support is reserved for enterprise customers.

sparse-optician-70334

09/13/2023, 3:41 PM
understood
may I ask one more question? @billowy-army-68599 In your code you have:
cross_account_role_policy = databricks.get_aws_cross_account_policy()
cross_account_role_policy_applied = aws.iam.RolePolicy(
    "databricks-policy",
    role=iam_role.name,
    policy=cross_account_role_policy.json,
)

creds = databricks.MwsCredentials(
    f"{prefix}-{ascii_env}-db-credentials",
    credentials_name=f"{prefix}-{ascii_env}-db-credentials",
    account_id=account_id,
    role_arn=iam_role.arn,
)
But for me, even with:
opts=pulumi.ResourceOptions(
    depends_on=[iam_role, cross_account_role_policy_applied]
),
specified, this fails with a "not yet initialized" error for the underlying IAM role. This seems semi-reproducible and appears to depend on race conditions in how quickly certain resources are created. I find it strange that the dependencies defined here are not honoured. It is fixed by running pulumi up a second time.

billowy-army-68599

09/15/2023, 4:54 PM
hmm, that looks like a bug on the databricks provider side - it isn't waiting for the value. You could use an apply to wait for it to exist, but that's quite an advanced use case
s


09/15/2023, 4:55 PM
Could you share a snippet of how this would look? Do you mean
cross_account_role_policy_applied.id.apply(lambda x: x)
? I.e. force a wait for the dependency to be initialized?

billowy-army-68599

09/15/2023, 4:56 PM
yes, inside the lambda you can sleep for 30s
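something along these lines (a rough sketch reusing your MwsCredentials snippet; the 30s is arbitrary):
import time

import pulumi

def wait_for_role(arn: str) -> str:
    # IAM is eventually consistent; pause before downstream resources
    # consume the role ARN. Skip the sleep during preview.
    if not pulumi.runtime.is_dry_run():
        time.sleep(30)
    return arn

creds = databricks.MwsCredentials(
    f"{prefix}-{ascii_env}-db-credentials",
    credentials_name=f"{prefix}-{ascii_env}-db-credentials",
    account_id=account_id,
    role_arn=iam_role.arn.apply(wait_for_role),
)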

sparse-optician-70334

09/15/2023, 5:20 PM
thx
Instead of sleeping an arbitrary amount, is it possible to wait explicitly only as long as required for the resource to become defined? I.e. block the execution flow, but not unnecessarily/arbitrarily long?
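For example, something like this untested sketch with boto3, where the role_exists waiter polls until IAM reports the role:
import boto3

import pulumi

def wait_until_role_exists(role_name: str) -> str:
    # Hypothetical: poll IAM (with boto3's built-in backoff) instead of a
    # fixed sleep, returning as soon as the role is visible. Skip in preview.
    if not pulumi.runtime.is_dry_run():
        iam_client = boto3.client("iam")
        iam_client.get_waiter("role_exists").wait(RoleName=role_name)
    return role_name

role_name_ready = iam_role.name.apply(wait_until_role_exists)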