# getting-started
h
I’m trying to deploy a Pulumi stack that was already created by another developer. I’m on the Teams version, and the operation failed because my Mac force-restarted on me. Since Pulumi did not run successfully, I deleted the EKS cluster in the AWS Console. I’ve since been trying to force Pulumi to delete all the remaining elements and start from scratch, but whether I destroy the stack, update the state, or remove the pending operations, the Pulumi CLI won’t successfully do anything.
$ pulumi destroy
Previewing destroy (ironnetcybersecurity/jcc)

View Live: https://app.pulumi.com/ironnetcybersecurity/salamander/jcc/previews/7ac2218e-7bbe-4d24-9f10-27a93537ba82
Do you want to perform this destroy? yes
Destroying (ironnetcybersecurity/jcc)

View Live: https://app.pulumi.com/ironnetcybersecurity/salamander/jcc/updates/114

     Type                             Name             Status                  Info
     pulumi:pulumi:Stack              salamander-jcc   failed              1 error; 1 warning
 -   └─ kubernetes:core/v1:Namespace  global-services  deleting failed     1 error

Diagnostics:
  pulumi:pulumi:Stack (salamander-jcc):
    warning: Attempting to deploy or update resources with 1 pending operations from previous deployment.
      * urn:pulumi:jcc::salamander::kubernetes:core/v1:Namespace::global-services, interrupted while deleting
    These resources are in an unknown state because the Pulumi CLI was interrupted while
    waiting for changes to these resources to complete. You should confirm whether or not the
    operations listed completed successfully by checking the state of the appropriate provider.
    For example, if you are using AWS, you can confirm using the AWS Console.

    Once you have confirmed the status of the interrupted operations, you can repair your stack
    using 'pulumi refresh' which will refresh the state from the provider you are using and
    clear the pending operations if there are any.

    Note that 'pulumi refresh' will not clear pending CREATE operations since those could have resulted in resources
    which are not tracked by pulumi. To repair the stack and remove pending CREATE operation,
    use 'pulumi stack export' which will  export your stack to a file. For each operation that succeeded,
    remove that operation from the "pending_operations" section of the file. Once this is complete,
    use 'pulumi stack import' to import the repaired stack.
    error: update failed

  kubernetes:core/v1:Namespace (global-services):
    error: Timeout occurred polling for 'global-services'

$ pulumi cancel
This will irreversibly cancel the currently running update for 'jcc'!
Please confirm that this is what you'd like to do by typing ("jcc"): jcc
error: [409] Conflict: The Update has already completed

$ pulumi refresh
l
Since `pulumi refresh` isn't fixing this, you'll need to resolve it manually: export your stack, delete your pending operations, and import it again. Occasionally you also have to find a resource that's been marked as deleted and tidy that up. There'll be two copies of it, one with `"deleted": true`; delete that copy.
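For reference, that export/edit/import cycle is roughly the following (the file name here is just an example):

$ pulumi stack export --file stack.json    # dump the stack's state to a file
# edit stack.json: remove the entries under "pending_operations" (and any
# duplicate resource copy marked "deleted": true)
$ pulumi stack import --file stack.json    # load the repaired state back in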
h
yeah, I’ve already removed the pending operations, and I don’t see either of those in that stack file at the moment.
blue ➤ pulumi stack export | grep "deleted"
blue ➤ pulumi stack export | grep "pending"
                    "acceptStatus": "pending-acceptance",
I’ve already tried to delete the individual resources inside of my state, but it can’t find some of the resources it says it’s trying to delete.
Diagnostic:
    kubernetes:core/v1:ConfigMap (grampus-cluster-nodeAccess):
    warning: configured Kubernetes cluster is unreachable: unable to load schema information from the API server: Get "https://C1143BA1BB162FB02862B32FB3AF21B0.yl4.us-west-1.eks.amazonaws.com/openapi/v2?timeout=32s": dial tcp: lookup C1143BA1BB162FB02862B32FB3AF21B0.yl4.us-west-1.eks.amazonaws.com on 10.40.13.197:53: no such host
    error: Preview failed: failed to read resource state due to unreachable cluster. If the cluster has been deleted, you can edit the pulumi state to remove this resource

error: No such resource "urn:pulumi:jcc::salamander::eks:index:Clusterore/v1:ConfigMap::grampus-cluster-nodeAccess" exists in the current state
l
To remove manually-deleted resources from state, you can use `pulumi state delete`. (Or edit the file… but `pulumi state delete` is easier 🙂)
h
the problem is that the state doesn’t have those resources in it under that URN
blue ➤ pulumi state delete -y urn:pulumi:jcc::salamander::eks:index:Cluster$kubernetes:core/v1:ConfigMap::grampus-cluster-nodeAccess
error: No such resource "urn:pulumi:jcc::salamander::eks:index:Clusterore/v1:ConfigMap::grampus-cluster-nodeAccess" exists in the current state
l
You're not escaping the URN properly: the shell is treating `$kubernetes` as a variable and mangling it before Pulumi ever sees it. Wrap the whole thing in single quotes.
Before: dex:Cluster$kubernetes:core  After: dex:Clusterore
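In other words, something like this should go through (same URN as before, just single-quoted so the shell can't expand it):

$ pulumi state delete -y 'urn:pulumi:jcc::salamander::eks:index:Cluster$kubernetes:core/v1:ConfigMap::grampus-cluster-nodeAccess'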
h
is there any way to just force-delete all of this successfully so I can start over from scratch? I’m now having to go down and individually delete the resources, which have some crazy dependencies that make it difficult to do correctly
l
You can remove the stack and its state in Pulumi, and manually remove the resources from AWS. I'm afraid the problem here has been caused by the force restart / manual resource deletion. Touching Pulumi-managed resources outside of Pulumi is always fraught.
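If you go that route, the blunt instrument is roughly this (destructive: it drops Pulumi's record of the stack without touching anything in AWS, so whatever is still running has to be cleaned up by hand in the Console):

$ pulumi stack rm jcc --force    # delete the stack and its state even though resources are still recorded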
h
can I delete the resources that are associated with that stack (the ones Pulumi still knows about) before I remove that stack?
l
Yes. Once you've cleared out the bogus resources, `pulumi destroy` will do the rest. You may need to clean up a few resources to get Pulumi happy, but it will be able to finish the job from there.
h
can I print the state to a file so I can mark everything that needs to be deleted in one pass?
the issue I have now is that I want to delete as many resources for this stack as possible so I’m not charged for them, but finding them all inside the AWS Console won’t be feasible. I want to let Pulumi find all of the resources it knows about and delete those, even if it isn’t aware of some of them. I’ve edited some of the Pulumi state and deleted some of the unknown resources, but I can’t tell which are left anymore because they have downstream dependencies.
I think this continued, but now it’s happened with 3 new resources:
* urn:pulumi:jcc::salamander::eks:index:Cluster$eks:index:RandomSuffix::jcc-cluster-cfnStackName, interrupted while creating
* urn:pulumi:jcc::salamander::eks:index:Cluster$eks:index:ServiceRole$aws:iam/role:Role::jcc-cluster-instanceRole-role, interrupted while creating
* urn:pulumi:jcc::salamander::eks:index:Cluster$aws:ec2/securityGroup:SecurityGroup::jcc-cluster-eksClusterSecurityGroup, interrupted while creating
l
When this last happened to me, I used a resolution loop something like this:
1. Run `pulumi preview` to identify a problem (which was always: Pulumi thought resource X existed but really it didn't).
2. If there wasn't a problem, run `pulumi destroy` and end.
3. Else delete the problem resource(s) from state (`pulumi state delete ...`).
4. Go to 1.
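In shell terms the loop is roughly this (the URN placeholder stands for whatever resource the preview complains about):

$ pulumi preview                        # does Pulumi think a resource exists that really doesn't?
$ pulumi state delete '<problem-urn>'   # if so, drop it from state (single-quote the URN) and preview again
$ pulumi destroy                        # once the preview is clean, destroy tears down whatever is left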
h
thanks, this eventually started working!
It’s making me wonder if we should refactor to use the AWS Native or CDK pulumi packages instead of these terraform providers
l
What terraform providers? There's no terraform involved. Also, issues with Pulumi's state getting mixed up because of crashing during a deployment happen no matter what provider you use. This happens every now and then, unfortunately. The only real mitigation is to reduce the size of your stack. Small stacks take less time to recover.