this might be a better place for this <https://pul...
# automation-api
b
I know that some of the Automation API SDKs launched without the ability to run `Up` commands concurrently within the same process. I'm not sure if `Destroy` would also be caught by that since it runs against your state and doesn't do much in-process. Hopefully someone else can answer
l
can you share your code?
b
sure, not in full, but enough
```go
import (
	"context"
	"fmt"
	"strings"
	"sync"
	"time"

	"github.com/pulumi/pulumi/sdk/v3/go/auto"
	"github.com/pulumi/pulumi/sdk/v3/go/common/tokens"
	"github.com/pulumi/pulumi/sdk/v3/go/common/workspace"
)

// removeOldPulumiStacks destroys and removes stale ephemeral/test stacks
// for the given project.
func removeOldPulumiStacks(projectName string) {
	ctx := context.Background()

	ws, err := auto.NewLocalWorkspace(ctx, auto.Project(workspace.Project{
		Name:    tokens.PackageName(projectName),
		Runtime: workspace.NewProjectRuntimeInfo("go", nil),
	}))
	if err != nil {
		fmt.Println("Failed creating workspace from project name:", projectName)
		fmt.Println(err.Error())
		return
	}

	stacks, err := ws.ListStacks(ctx)
	if err != nil {
		fmt.Println("Failed listing stacks for project:", projectName)
		fmt.Println(err.Error())
		return
	}

	sevenDaysAgo := time.Now().Add(-7 * time.Hour * 24).UTC()
	ignoreStacks := []string{"prod", "build", "stage", "production", "staging"}

	var wg sync.WaitGroup
OUTER:
	for _, stack := range stacks {
		// Do nothing for stacks in our ignoreStacks slice
		// This is just an extra precaution
		for _, s := range ignoreStacks {
			if strings.Contains(stack.Name, s) {
				continue OUTER
			}
		}

		// Skip stacks that currently have an update running
		if stack.UpdateInProgress {
			continue
		}

		lastUpdated, err := time.Parse(time.RFC3339, stack.LastUpdate)
		if err != nil {
			fmt.Printf("Error parsing last update time for stack:\n%+v\n", stack)
			fmt.Println(err.Error())
			// Skip just this stack rather than returning before wg.Wait()
			continue
		}

		// Only destroy ephemeral/test stacks not updated in the last 7 days
		if lastUpdated.Before(sevenDaysAgo) && (strings.Contains(stack.Name, "ephemeral") || strings.Contains(stack.Name, "test")) {
			wg.Add(1)
			go deleteStack(ctx, stack, ws, &wg)
		}
	}

	wg.Wait()
}

// deleteStack selects the stack, destroys its resources, and removes it,
// signalling the WaitGroup when done.
func deleteStack(ctx context.Context, stack auto.StackSummary, ws auto.Workspace, wg *sync.WaitGroup) {
	defer wg.Done()
	fmt.Println("Deleting stack:", stack.Name)

	s, err := auto.SelectStack(ctx, stack.Name, ws)
	if err != nil {
		fmt.Printf("Error selecting stack:\n%+v\n", stack)
		fmt.Println(err.Error())
		return
	}

	_, err = s.Destroy(ctx)
	if err != nil {
		fmt.Printf("Error destroying stack:\n%+v\n", stack)
		fmt.Println(err.Error())
		return
	}

	err = ws.RemoveStack(ctx, stack.Name)
	if err != nil {
		fmt.Printf("Failed removing stack:\n%+v\n", stack)
		fmt.Println(err.Error())
		return
	}
	fmt.Println("Removed stack:", stack.Name)
}
```
if I remove the `go` then all runs as expected
we have quite a few stacks so this would be very helpful. I could also do it in batches if it has anything to do with the sheer number of them
lmk how else I can help. or if you figure anything out. thx
b
I wonder if you've got race conditions relating to `pulumi stack select`. The `master` branch still has the workspace selecting the stack before taking an action instead of using the `--stack` argument, which I know was part of a recent changeset
b
hmm, is there a newer version you want me to try?
definitely something related to parallelism. the error reads to me like a JSON encoding issue, though I have no idea why those would be "racing"
b
Sorry I could've sworn I was getting pinged on a PR about the `--stack` change but now I can't find it, and it doesn't look like it's in 3.0 yet either
Here it is: https://github.com/pulumi/pulumi/pull/6415 . Still don't know if that's related though I'm just spitballing
b
np i'll check it out in the morning. thx for the responsiveness. maybe evan figures it out 🙏
l
Are you using 3.0 or 2.x? One additional thing to try if you are still hitting the error after bumping your SDK and CLI to 3.x would be to only run the destroys concurrently but serialize stack select (pseudo code):
```go
for _, s := range stacks {
	// select each stack serially...
	stack, _ := auto.SelectStack(ctx, s.Name, ws)
	// ...but run the destroys concurrently
	go stack.Destroy(ctx)
}
```
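For reference, a slightly fuller sketch of that pattern with a `sync.WaitGroup`, assuming `ctx`, `ws`, and the `stacks` slice from the earlier snippet are already in scope; the error handling here is illustrative, not from the thread:
```go
var wg sync.WaitGroup
for _, summary := range stacks {
	// SelectStack runs serially, so only one `pulumi stack select`
	// ever touches the workspace at a time.
	stack, err := auto.SelectStack(ctx, summary.Name, ws)
	if err != nil {
		fmt.Println("Error selecting stack:", summary.Name, err)
		continue
	}

	wg.Add(1)
	go func(name string, stack auto.Stack) {
		defer wg.Done()
		// Only the long-running destroy happens concurrently.
		if _, err := stack.Destroy(ctx); err != nil {
			fmt.Println("Error destroying stack:", name, err)
		}
	}(summary.Name, stack)
}
wg.Wait()
```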
r
@bored-oyster-3147 good catch on the stack select thing. I’m not sure what happened there but that change should’ve been in the changeset
b
3.0 auto API complains about anything less
destroy runs select
I can try it
also I believe speed improvement would be quite minimal if that's the only part I did in parallel. the select and removal of stack takes time.
```
▶ pulumi version
v3.0.0
```
@bored-oyster-3147 that actually makes more sense to me now. so all of them are trying to select a stack with the CLI, so the CLI is basically competing for which stack is actually selected
b
yes it's just the CLI under the hood, so if the stack select is still a separate command and hasn't been changed to a `--stack` flag directly on the `destroy` command, then it is still affecting global state of the workspace and is thus a race condition
b
did it get reverted or something? nvm, I see it's not merged on master
if y'all plan on adding this sometime, then I can just leave the scaffolding for parallelism in my code and update once it's released.
b
@red-match-15116's comment implies there must've been some oversight. It was just a mistake, should've been part of 3.0. Will wait for a response
🤘 1
b
thx for help. always have been impressed with responsiveness from pulumi team.
r
Lol second time I’ve had to say this in two days but I’m not a dude and I would please not assume that everyone on this slack is a dude. I use they/them pronouns - she/her is still fine but I’m def not a dude.
👍 1
b
I am not omnipresent. Sorry I offended you.
r
Okay now that that’s over, yes it was definitely an oversight we’re working on getting 3.1 out with some commits that were accidentally dropped for 3.0. Working on it right now.
b
thx @red-match-15116.
r
Hey! 3.1.0 is out which includes the stack selection fixes. Wanna give this another try?
b
yup will try now thx
@red-match-15116 where should I grab it from. I usually just get from AUR or the curl command prompted by the cli itself 🙂
ah see it in github release
r
Yeah they usually land on this page: https://www.pulumi.com/docs/get-started/install/versions/ but it’s still being updated I think. You’ll also want v3.1.0 of the go sdk.
b
kk got new error with just CLI update.
stderr: error: [409] Conflict: Another update is currently in progress.
updating go SDK now
I think it worked!
running the 3.1 CLI with the 3.0 sdk I think borked 1 stack to `pending_operations` when I ctrl-c'd. but that seems to be it.
thanks @red-match-15116. we are good to go
r
Awesome!
b
So now it seems all the destroys did get triggered, but they are all failing, I think every single one. and the sad thing is my code checks if last updated > 7 days ago XD. now I can't tell stale from not stale, I have to look at the stdout of which stacks were signaled to destroy
give me a second. they are taking a while and I want to see approx how many fail and if I can get any more metrics. It does seem at least some have destroyed. what is interesting though is I call stack select --> destroy --> remove in serial and wouldn't have expected my functions to return until the stack is completely removed
if I run `pulumi destroy` or `pulumi stack rm` I believe it blocks until completion
yeah it seems most of them failed to destroy. then running `pulumi destroy -s my stack` seems to believe most of them have `pending_operations`
this could be from me mistakenly running the 3.1 CLI with the 3.0 sdk. I will test by spinning up quite a few fresh test stacks
r
Thanks @busy-soccer-65968. Let me know what you find. I’ll try running some tests too.
👍 1
b
okay @red-match-15116 I believe the destroy stack failures are because the stacks have `pending_operations` from messing up the first try. I made 5 test stacks, all with resources, and then tested my code against them: all 5 deleted in parallel and also blocked until stack select --> destroy --> remove completed per stack, and I verified inside the web GUI. My only real suggestion would be to maybe investigate failed destroys for `pending_operations`, otherwise I think I'll write off the problem as my own mistake. Thanks again for all the help. I'll be sure to let you all know if I find anything else fishy.
👍🏽 1
time to go through by hand and remove pending_ops 😢
r
Yeah the pending operations can be kinda painful, but there is a way to do it with automation api
You can do a `stack.Export()`, remove the pending operations key and then do a `stack.Import()`
👍 1
b
yah, tbh it's ~ 50 stacks, and copying and pasting all the stack names and writing new code would take longer than manually doing it for me. also, all the sub resources are already deleted by a separate process so I can just force remove the stack without worrying about dangling resources
good to know though. i'm excited to use the automation api more. pretty neat
removing the CLI dependency would be awesome. then I can have a `FROM scratch` docker container to run this in
r
yah, tbh it’s ~ 50 stacks. and copy and pasting stack names them all and writing new code would take longer than manually doing it for me. also, all the sub resources are already deleted from a separate process so I can just force remove the stack without worrying about dangling resources
Just spitballing here: you could list stacks (https://pkg.go.dev/github.com/pulumi/pulumi/sdk/v3/go/auto#LocalWorkspace.ListStacks), then loop through and export each one (https://pkg.go.dev/github.com/pulumi/pulumi/sdk/v3/go/auto#LocalWorkspace.ExportStack), remove the pending ops, and then import it again (https://pkg.go.dev/github.com/pulumi/pulumi/sdk/v3/go/auto#LocalWorkspace.ImportStack)
(Not saying do this now, just saying it’s possible)
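For reference, a minimal sketch of that list/export/clean/import loop, assuming a `ws auto.Workspace` like the one in the earlier snippet, two extra imports (`encoding/json` and `github.com/pulumi/pulumi/sdk/v3/go/common/apitype`), and that the exported state decodes as a v3 deployment. The function name is made up, and as written it would touch every stack in the project rather than only the failed ones:
```go
func clearPendingOperations(ctx context.Context, ws auto.Workspace) error {
	stacks, err := ws.ListStacks(ctx)
	if err != nil {
		return err
	}

	for _, summary := range stacks {
		// Export the stack's raw state.
		state, err := ws.ExportStack(ctx, summary.Name)
		if err != nil {
			return err
		}

		// The deployment is raw JSON: decode it, drop the pending
		// operations, and encode it again.
		var deployment apitype.DeploymentV3
		if err := json.Unmarshal(state.Deployment, &deployment); err != nil {
			return err
		}
		deployment.PendingOperations = nil

		cleaned, err := json.Marshal(deployment)
		if err != nil {
			return err
		}
		state.Deployment = cleaned

		// Import the cleaned state back into the same stack.
		if err := ws.ImportStack(ctx, summary.Name, state); err != nil {
			return err
		}
	}
	return nil
}
```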
b
totally! the challenge is knowing which ones are the ones I wanted to delete. my criteria was last updated older than 7 days ago, but after a failed destroy the last update time is the time of the destroy attempt, mixed in with lots of other stacks
👍🏽 1
anyhow, thanks again. keep on being awesome 🍻
🍻 2
partypus 8bit 2