How should I handle errors best? I have an inline...
# automation-api
p
How should I best handle errors? I have an inline program in TypeScript that deploys a GKE cluster. It errored out because I forgot to set the zone (FYI, the docs mark it as optional if you always pass in a location, but anyway, it is a small effort to set it). The weird thing is that my application “hangs”: the in-progress spinner on the Pulumi site keeps spinning and my “job” stalled. I restarted my application, so that is “solved”, but the Pulumi site is still spinning. I would expect that if the program errors,
await stack.up()
would throw?
After 18 minutes it did stop spinning on the Pulumi site, but I need to be able to catch and act on failures, right?
In a different stack it still has not stopped after 32 minutes. After restarting, I of course get
[409] Conflict: Another update is currently in progress.
And the Automation API does not have a stack.cancel(). Hmm, orchestrating this is not trivial based on the few examples in the announcement blog post.
l
Can you share your code? I wonder if there is an uncaught exception somewhere. Do you have a catch on the block containing the Automation API code? Something like this: https://github.com/pulumi/automation-api-examples/blob/main/nodejs/inlineProgram-ts/index.ts#L99
We have an open issue to implement cancel/import/export for Node.js: https://github.com/pulumi/pulumi/issues/5531 In the meantime, if you create a
Pulumi.yaml
file that matches your project, you can use those commands manually. In Go, I have a "stack reset" command that I sometimes implement that looks like this: https://gist.github.com/EvanBoyle/f5e7c77f94851238d93efbefba1debfc It cancels any active updates, runs export/import to get rid of pending deletes, and then refreshes the resources in the stack to get the state synced up.
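The manual "stack reset" sequence described above (cancel, export/import, refresh) could be scripted from Node.js by shelling out to the `pulumi` CLI. This is only a rough sketch, not a port of the linked gist: `stackResetCommands`, `resetStack`, `projectDir`, and the `stack-state.json` file name are hypothetical names chosen for illustration, and it assumes the `pulumi` binary is on the PATH and that `projectDir` contains a matching Pulumi.yaml.

```typescript
import { execFileSync } from "child_process";

// Build the CLI argument lists for a "stack reset", in the order described
// in the thread: cancel the active update, export and re-import the state,
// then refresh. `stackName` and `stateFile` are hypothetical parameters.
function stackResetCommands(stackName: string, stateFile: string): string[][] {
  return [
    ["cancel", stackName, "--yes"],
    ["stack", "export", "--stack", stackName, "--file", stateFile],
    ["stack", "import", "--stack", stackName, "--file", stateFile],
    ["refresh", "--stack", stackName, "--yes"],
  ];
}

// Run each command in order from the directory that holds Pulumi.yaml.
// execFileSync throws when a command exits non-zero, so a failing step
// (for example a cancel when nothing is running) stops the sequence;
// callers may want to catch and decide per step.
function resetStack(
  stackName: string,
  projectDir: string,
  stateFile: string = "stack-state.json",
): void {
  for (const args of stackResetCommands(stackName, stateFile)) {
    execFileSync("pulumi", args, { cwd: projectDir, stdio: "inherit" });
  }
}
```

Keeping the command list separate from the runner makes it easy to log or review exactly what will be executed before touching a real stack.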
p
Hey Evan, I shared all my code and how it works in support request #399.
Can you access it there?
I do have a try/catch, but the issue is that stack.up() stalls when there is an exception inside the program, in this case provider auth issues. It all works when all the configs align. In the Pulumi dashboard I see the error, but
await stack.up
keeps running, waiting and doing nothing.
It would all be resolved if the error bubbled up instead of staying inside the “program”.
We are going to use the Automation API (building it now, in the next week or so, to replace a more manual helmfile setup) for on-demand infra creation in various dynamic configurations. I need to be able to handle any failures cleanly. It is OK if they fail (well, no, but it happens), but in no way can a job get stuck for hours until I restart our platform and manually execute pulumi commands on my computer.
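Until a stalled `stack.up()` surfaces its error, one way to keep a job from hanging for hours is to put a deadline around the call. The sketch below is a generic promise timeout, not part of the Automation API; `withTimeout` and its parameters are hypothetical names for illustration.

```typescript
// Reject with an error named "TimeoutError" if `operation` has not settled
// within `timeoutMs`. `label` is only used in the error message.
function withTimeout<T>(
  operation: () => Promise<T>,
  timeoutMs: number,
  label: string,
): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => {
      const err = new Error(`${label} did not finish within ${timeoutMs} ms`);
      err.name = "TimeoutError";
      reject(err);
    }, timeoutMs);

    // Promise.resolve().then(...) also turns a synchronous throw from
    // `operation` into a rejection, so the timer is always cleared.
    Promise.resolve()
      .then(operation)
      .then(
        (value) => {
          clearTimeout(timer);
          resolve(value);
        },
        (err) => {
          clearTimeout(timer);
          reject(err);
        },
      );
  });
}

// Hypothetical usage inside an async function, with `stack` created elsewhere:
//   try {
//     await withTimeout(() => stack.up(), 30 * 60 * 1000, "stack.up");
//   } catch (err) {
//     // mark the job as failed, alert, clean up
//   }
```

Note that the timeout only unblocks the caller. It does not stop the underlying update, so the stack may still report that another update is in progress until that update is cancelled.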