# aws
a
Hey all, having a weird problem. I may be doing something wrong here? With the Pulumi code below, if I run it fresh for the first time I get an error saying that it cannot delete the cluster, after it has already created everything, and it is even in AWS correctly. If I run pulumi up again, instead of updating the cluster it says the cluster already exists and errors out. So, two questions:
• Is this just because I have not had a fully successful run yet?
• Am I doing something wrong here?
Copy code
const postgresqlCluster = new aws.rds.Cluster(`${name}-${env}-postgresql`, {
      clusterIdentifier: `${name}-${env}-us-east-1-aurora`,
      engine: aws.rds.EngineType.AuroraPostgresql,
      engineVersion: '16.4',
      availabilityZones: [
          "us-east-1a",
      ],
      databaseName: `${name}`,
      dbSubnetGroupName: rdsAuroraDbSubnetGroup.name,
      vpcSecurityGroupIds: [rdsAuroraSg.id],
      masterUsername: "postgresql",
      masterPassword: rdsConfig.require("rootPassword"),
      backupRetentionPeriod: 5,
      preferredBackupWindow: "07:00-09:00",
  });

const postgresqlInstance = new aws.rds.ClusterInstance(`${name}-${env}-postgresql-instance`, {
      clusterIdentifier: postgresqlCluster.id,
      identifier: `${name}-${env}-postgresql-instance`,
      instanceClass: 'db.r5.large',
      engine: aws.rds.EngineType.AuroraPostgresql,
      engineVersion: '16.4',
      publiclyAccessible: false,
  });
Waiting on the delete to finish and I will attach the errors.
This is on the first run, with nothing yet deployed in AWS.
Copy code
Diagnostics:
  pulumi:pulumi:Stack (peopleticker-dev):
    error: update failed

  aws:rds:Cluster (common-dev-postgresql):
    error: deleting urn:pulumi:dev::peopleticker::aws:rds/cluster:Cluster::common-dev-postgresql: 1 error occurred:
    	* RDS Cluster final_snapshot_identifier is required when skip_final_snapshot is false
    error: deleting urn:pulumi:dev::peopleticker::aws:rds/cluster:Cluster::common-dev-postgresql: 1 error occurred:
    	* RDS Cluster final_snapshot_identifier is required when skip_final_snapshot is false
    error: deleting urn:pulumi:dev::peopleticker::aws:rds/cluster:Cluster::common-dev-postgresql: 1 error occurred:
    	* RDS Cluster final_snapshot_identifier is required when skip_final_snapshot is false
    error: deleting urn:pulumi:dev::peopleticker::aws:rds/cluster:Cluster::common-dev-postgresql: 1 error occurred:
    	* RDS Cluster final_snapshot_identifier is required when skip_final_snapshot is false
l
Since it's attempting to delete something, this isn't a fresh-for-the-first-time run. There is a cluster in your state and it's trying to delete it.
What's in your state?
a
hmmm, let me go and remove it all again. brb. I clearly have something in the state. When you delete something and run pulumi refresh, doesn't that update your state to match the real world?
l
No, not necessarily. You should check your state afterwards.
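For example, export the state and look for the cluster; if a stale resource is still listed you can drop it from the state without touching AWS (standard Pulumi CLI commands, the URN is a placeholder):
Copy code
# dump the stack's state and search it for the cluster
pulumi stack export | grep -i "rds/cluster"

# remove a stale resource from state only (this does not delete anything in AWS)
pulumi state delete <urn-of-the-stale-cluster>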
a
Okay let me do that. I made an assumption.
l
Plus, the error in there is saying that AWS is refusing to delete the DB because the DB is configured to make a final snapshot but the snapshot identifier hasn't been provided. That's not a Pulumi error.
a
Right, I was confused as to why Pulumi was trying to delete it right after creating it successfully.
It does create the cluster, then it tries to make the instances, and then I get that error. I will look at the state first though, because I bet I will see some weirdness there.
It was leaving stuff in the state. I cleaned it out. Let's see if that just fixes all the weird problems.
l
Have you double-checked the RDS instance? That error message implies to me that AWS refused to delete something. But I suppose Pulumi might have a sanity check that it runs against the state config... 🤷
a
I did. It was gone. I deleted everything and ran pulumi refresh. I did that because Pulumi would not update the cluster when I made changes to it. The first error I was getting was that it cannot create the cluster because it already exists.
It looks like it all built successfully this time, let me try updating the config again like I did before.
I may have just gotten myself into a bad state all around?
l
Yep. Check that skip_final_snapshot parameter, you'll want that to be true if you're not taking snapshots.
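In your cluster args that would look something like this (a sketch using the Pulumi AWS TypeScript property names; pick one of the two approaches):
Copy code
const postgresqlCluster = new aws.rds.Cluster(`${name}-${env}-postgresql`, {
    clusterIdentifier: `${name}-${env}-us-east-1-aurora`,
    engine: aws.rds.EngineType.AuroraPostgresql,
    engineVersion: "16.4",
    // ...rest of your existing args...
    // either allow deletion without a final snapshot:
    skipFinalSnapshot: true,
    // or keep the final snapshot and give it a name (hypothetical name shown):
    // finalSnapshotIdentifier: `${name}-${env}-final-snapshot`,
});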
a
Copy code
++  └─ aws:rds:Cluster   common-dev-postgresql  **creating failed**     [diff: ~availabilityZones]; 1 error

Diagnostics:
  aws:rds:Cluster (common-dev-postgresql):
    error: 1 error occurred:
    	* creating RDS Cluster (common-dev-us-east-1-aurora): DBClusterAlreadyExistsFault: DB Cluster already exists
    	status code: 400, request id: c91e5be2-cf7c-4e0e-8e20-bdfcb4f01899
hmmm, still doing it.
And actually, I did not change anything. I ran pulumi up there and it said it needed to replace the instance and the cluster, saying the identifier had changed from what is in the state.
l
Are you checking the right account and region?
a
This is after I built it again with pulumi up. It worked this time: it created the cluster and the DB. Then I ran pulumi up again after that succeeded.
This is a different problem now.
Actually, this is the original problem. In an attempt to fix this problem I got my state messed up; that part is working and fixed now after I cleaned the state manually.
l
Ok. Well, availabilityZones is marked in the docs as a replacement-triggering property. And you've set the name, which prevents Pulumi's suffix from being used. So you need to add the delete-before-replace option, or don't set the name.
And if you choose not to set the name, you need to run pulumi destroy with the old code first, before changing it.
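The delete-before-replace option would look something like this (a sketch; deleteBeforeReplace is a standard Pulumi resource option, passed as the third constructor argument):
Copy code
const postgresqlCluster = new aws.rds.Cluster(`${name}-${env}-postgresql`, {
    clusterIdentifier: `${name}-${env}-us-east-1-aurora`,
    engine: aws.rds.EngineType.AuroraPostgresql,
    // ...rest of your existing args...
}, {
    // delete the old cluster first so the fixed clusterIdentifier
    // doesn't collide with its replacement
    deleteBeforeReplace: true,
});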
a
I overlooked that completely.
l
The problem is that Pulumi wants to create the new cluster before destroying the old one, but because you've pinned the clusterIdentifier, the replacement collides with the cluster that already exists.
Generally, do not set the name. Let Pulumi do it. It's for a good reason.
(Aside: don't put "obvious" info into a resource name, like the stack, region or type of resource; all that stuff can be figured out at a glance. Just put the important info in the name: maybe it'd be the purpose? The product name? Something about the resource)
a
Unfortunately my predecessors did not break any of this out for the core infra; the Pulumi code has 5,000 lines just in the core. pulumi destroy is difficult, since I have to delete things meticulously. As to the naming, we are using a micro-stack approach, and that is the naming scheme the guys before me used. There are over 5,400 resources in these stacks, so the naming scheme is set in stone at this point, until I can refactor this stuff. It is a huge web of dependencies though. We put in the stack name, which is the environment, and the service name.
I assume there is no easy way to change all those names in one go.
I guess the db instance itself does not need the identifier, as long as the cluster it belongs to is named right. I can start there.
l
Just remove the identifier property. Since the Pulumi resource name argument is right, Pulumi will generate the physical identifier correctly from it.
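So the instance would end up looking something like this (a sketch; Pulumi derives a unique physical identifier from the resource name):
Copy code
const postgresqlInstance = new aws.rds.ClusterInstance(`${name}-${env}-postgresql-instance`, {
    clusterIdentifier: postgresqlCluster.id,
    // no `identifier` here: Pulumi auto-generates one from the resource name above
    instanceClass: "db.r5.large",
    engine: aws.rds.EngineType.AuroraPostgresql,
    engineVersion: "16.4",
    publiclyAccessible: false,
});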