This message was deleted Pulumi Community #aws

Join Slack

This message was deleted.

# aws

sparse-intern-71089

09/08/2023, 9:17 PM

This message was deleted.

billowy-army-68599

09/08/2023, 9:24 PM

You can definitely do it, it’s this resource: https://www.pulumi.com/registry/packages/databricks/api-docs/mwsworkspaces/

sparse-optician-70334

09/08/2023, 9:32 PM

thanks

sparse-optician-70334

09/08/2023, 9:33 PM

Are you aware of any templates deploying Databricks with Pulumi? (just getting started with it ) so I would appreciate any pointers to build from

billowy-army-68599

09/08/2023, 9:38 PM

nope, but if you let me know what you’re trying to do I can maybe help

sparse-optician-70334

09/09/2023, 5:32 AM

Thanks! This would be awesome. I have 3 AWS accounts dev/test/prod and want to create a databricks workspace in all 3 stacks. Then I want to create some autoscaling cluster in the workspace. Furthermore, I want to create some S3 buckets a) one for my data outputs b) any necessary ones for the databricks workspace to live happily in their account. Any useful information like URLs to sign in or bucket ids/names should be in the stack readme (https://www.pulumi.com/docs/pulumi-cloud/projects-and-stacks/?_ga=2.149016052.318359768.1694094860-180507132.1691596035#stack-readme). For the reademe: I was wollowing https://www.pulumi.com/docs/pulumi-cloud/projects-and-stacks/?_ga=2.149016052.318359768.1694094860-180507132.1691596035#stack-readme

Copy code

import pulumi
from pulumi_aws_native import s3

# Create an AWS resource (S3 Bucket)
bucket = s3.Bucket("my_bucket")

pulumi.export("my_bucket", bucket.id)

with open('./Pulumi.README.md') as f:
    pulumi.export('readme', f.read())

The readme with contents of:

Copy code

- main bucket ${bucket.id}
- my_bucket ${my_bucket}

Is neatly uploaded, however the ${} reference to the variable is never resolved and empty For the databricks workspace:

Copy code

workspace = mws.Workspaces("my_workspace", 
    account_id="<AccountId>", # Your Account Id
    aws_region="<AWSRegion>",  # The AWS region for the VPC
    credentials_id="<CredentialsId>",  # Your credential id
    storage_configuration_id="<StorageConfigurationId>",  # Your storage configuration id
    network_id="<NetworkId>",  # Your network id
    workspace_name="<WorkSpaceName>",)  # Name for your workspace

- how should I retrieve the AWS region? - https://www.pulumi.com/docs/concepts/config/ - should I us the require concept:

name = config.require('name');

? - or can these be auto-resolved from a configured AWS CLI? - how to set the following: - credentials_id - storage_configuration_id - network_id I think these might require setting up some more S3 buckets or VPCs for databricks? But am unsure what to put there. Regarding the cluster: I think https://docs.databricks.com/en/dev-tools/pulumi.html is already describing how to set up the cluster. But if you could include a mini dummy cluster example this would be very helpful as well.

sparse-optician-70334

09/09/2023, 5:37 AM

Ideally you could include how perhaps a unity enabled workspace can be created which also include the metastore

sparse-optician-70334

09/09/2023, 5:58 AM

Also: https://www.pulumi.com/registry/packages/databricks/api-docs/mwsworkspaces/ is confusing as it is mentioning GCP whereas I am interested in the AWS version.

sparse-optician-70334

09/09/2023, 6:02 AM

By the way the actual import seems to be.

from pulumi_databricks import MwsWorkspaces

sparse-optician-70334

09/09/2023, 6:03 AM

The pulumi AI wants me to execute:

Copy code

from pulumi_databricks.mws import MwsWorkspaces, MwsCredentials, MwsStorageConfigurations

but this fails with undefined imports

sparse-optician-70334

09/09/2023, 6:30 AM

I am exploring a bit further together with pulumi AI. When looking at the credentials_id it looks like: creds = MwsCredentials("creds", credentials_name="my-credentials", role_arn="arnawsiam:123456789012role/my-role") is needed. However, I am unsure what proper role/arn and sufficient permissions to choose here.

sparse-optician-70334

09/09/2023, 7:06 AM

https://docs.databricks.com/en/administration-guide/account-settings-e2/credentials.html are the manual steps and their terraform pendents though. In particular for step 3 (Step 3: Create a credential configuration in Databricks) they mention certain settings which need to be defined on the databricks side. Can these also be set automatically via Pulumi?

sparse-optician-70334

09/11/2023, 5:38 PM

@billowy-army-68599 can you help me further?

billowy-army-68599

09/11/2023, 5:42 PM

unfortunately databricks is not my area of expertise.

billowy-army-68599

09/11/2023, 5:42 PM

just reading the thread now

billowy-army-68599

09/11/2023, 5:43 PM

is this your first foray into databricks with AWS?

sparse-optician-70334

09/11/2023, 5:44 PM

yes

billowy-army-68599

09/11/2023, 5:44 PM

I have a lot of AWS experience but no databricks experience I’m afraid,

billowy-army-68599

09/11/2023, 5:46 PM

i’ve added creating an example to my todo list, but it could take some time

sparse-optician-70334

09/11/2023, 5:46 PM

Understood - I will try to go step-by-step for https://docs.databricks.com/en/administration-guide/account-settings-e2/credentials.html#step-1-create-a-cross-account-iam-role and turning each individual step into a pulumi automation script.

sparse-optician-70334

09/11/2023, 5:46 PM

Thanks!

sparse-optician-70334

09/12/2023, 7:17 AM

@billowy-army-68599 but the question regarding the readme topic why the fields do not render in the readme - can you answer that part? It is generic AWS related

2 Views

Open in Slack

Previous Next