#automation-api

worried-city-86458

06/16/2021, 7:15 PM
@bored-oyster-3147 following up from https://github.com/pulumi/pulumi/pull/7299, I did a quick test forcing an error which didn't work as expected and the next run has hung.
This is the forced failure:
Changes:
 
    Type                                                                   Name                                                                 Operation
>   pulumi:pulumi:StackReference                                           pharos/aws-eks/alpha                                                 read
-   kubernetes:core:ServiceAccount                                         kube-system/aws-load-balancer-controller                             delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding                cert-manager-controller-issuers                                      delete
-   kubernetes:rbac.authorization.k8s.io:Role                              kube-system/cert-manager:leaderelection                              delete
-   kubernetes:rbac.authorization.k8s.io:RoleBinding                       cert-manager/cert-manager-webhook:dynamic-serving                    delete
-   kubernetes:rbac.authorization.k8s.io:Role                              kube-system/cert-manager-cainjector:leaderelection                   delete
-   kubernetes:admissionregistration.k8s.io:ValidatingWebhookConfiguration cert-manager-webhook                                                 delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding                cert-manager-controller-orders                                       delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRole                       cert-manager-controller-clusterissuers                               delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRole                       cert-manager-controller-ingress-shim                                 delete
-   kubernetes:apps:Deployment                                             cert-manager/cert-manager                                            delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRole                       cert-manager-controller-approve:cert-manager-io                      delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding                cert-manager-controller-clusterissuers                               delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding                aws-load-balancer-controller-rolebinding                             delete
-   kubernetes:admissionregistration.k8s.io:MutatingWebhookConfiguration   aws-load-balancer-webhook                                            delete
-   kubernetes:apiextensions.k8s.io:CustomResourceDefinition               kube-system/aws-load-balancer-selfsigned-issuer                      delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding                external-dns                                                         delete
-   kubernetes:core:Service                                                cert-manager/cert-manager                                            delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRole                       cert-manager-webhook:subjectaccessreviews                            delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding                cert-manager-webhook:subjectaccessreviews                            delete
-   kubernetes:core:Service                                                alpha/internet-gateway                                               delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRole                       cert-manager-edit                                                    delete
-   kubernetes:apps:Deployment                                             kube-system/external-dns                                             delete
-   kubernetes:rbac.authorization.k8s.io:RoleBinding                       kube-system/aws-load-balancer-controller-leader-election-rolebinding delete
-   kubernetes:networking.k8s.io:Ingress                                   alpha/internal-gateway                                               delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRole                       aws-load-balancer-controller-role                                    delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding                cert-manager-controller-ingress-shim                                 delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding                cert-manager-cainjector                                              delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRole                       cert-manager-cainjector                                              delete
-   kubernetes:admissionregistration.k8s.io:MutatingWebhookConfiguration   cert-manager-webhook                                                 delete
-   kubernetes:core:Service                                                cert-manager/cert-manager-webhook                                    delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRole                       cert-manager-controller-certificates                                 delete
-   kubernetes:apps:Deployment                                             kube-system/aws-load-balancer-controller                             delete
-   kubernetes:apps:Deployment                                             cert-manager/cert-manager-cainjector                                 delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding                cert-manager-controller-challenges                                   delete
-   kubernetes:core:Service                                                alpha/internal-gateway                                               delete
-   kubernetes:rbac.authorization.k8s.io:RoleBinding                       kube-system/cert-manager-cainjector:leaderelection                   delete
-   kubernetes:core:ServiceAccount                                         cert-manager/cert-manager-cainjector                                 delete
-   kubernetes:core:Service                                                kube-system/aws-load-balancer-webhook-service                        delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRole                       external-dns                                                         delete
-   kubernetes:rbac.authorization.k8s.io:Role                              cert-manager/cert-manager-webhook:dynamic-serving                    delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding                cert-manager-controller-certificates                                 delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRole                       cert-manager-controller-orders                                       delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRole                       cert-manager-controller-issuers                                      delete
-   kubernetes:core:ServiceAccount                                         cert-manager/cert-manager-webhook                                    delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding                cert-manager-controller-approve:cert-manager-io                      delete
-   kubernetes:apiextensions.k8s.io:CustomResourceDefinition               kube-system/aws-load-balancer-serving-cert                           delete
-   kubernetes:rbac.authorization.k8s.io:Role                              kube-system/aws-load-balancer-controller-leader-election-role        delete
-   kubernetes:rbac.authorization.k8s.io:RoleBinding                       kube-system/cert-manager:leaderelection                              delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRole                       cert-manager-controller-challenges                                   delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRoleBinding                cert-manager-controller-certificatesigningrequests                   delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRole                       cert-manager-view                                                    delete
-   kubernetes:core:ServiceAccount                                         cert-manager/cert-manager                                            delete
-   kubernetes:core:ServiceAccount                                         kube-system/external-dns                                             delete
-   kubernetes:core:Service                                                kube-system/external-dns                                             delete
-   kubernetes:rbac.authorization.k8s.io:ClusterRole                       cert-manager-controller-certificatesigningrequests                   delete
-   kubernetes:admissionregistration.k8s.io:ValidatingWebhookConfiguration aws-load-balancer-webhook                                            delete
-   kubernetes:networking.k8s.io:Ingress                                   alpha/internet-gateway                                               delete
-   kubernetes:apps:Deployment                                             cert-manager/cert-manager-webhook                                    delete
-   kubernetes:apiextensions.k8s.io:CustomResourceDefinition               challenges.acme.cert-manager.io                                      delete
-   kubernetes:apiextensions.k8s.io:CustomResourceDefinition               issuers.cert-manager.io                                              delete
-   kubernetes:apiextensions.k8s.io:CustomResourceDefinition               clusterissuers.cert-manager.io                                       delete
 
Diagnostics:
 
pharos/k8s/alpha (pulumi:pulumi:Stack)
error: Running program 'D:\Devel\Mps\devops-gemini-pulumi\Gemini\bin\Debug\gemini.dll' failed with an unhandled exception:
Scriban.Syntax.ScriptRuntimeException: InternalGateway.yaml(5,17) : error : The variable or function `envName` was not found
   at void Scriban.TemplateContext.CheckVariableFound(ScriptVariable variable, bool found)
   at object Scriban.TemplateContext.GetValue(ScriptVariableGlobal variable)
   at object Scriban.Syntax.ScriptVariableGlobal.GetValue(TemplateContext context)
   at async ValueTask<object> Scriban.TemplateContext.GetOrSetValueAsync(ScriptExpression targetExpression, object valueToSet, bool setter)
   at async ValueTask<object> Scriban.TemplateContext.GetValueAsync(ScriptExpression target)
   at async ValueTask<object> Scriban.Syntax.ScriptVariable.EvaluateAsync(TemplateContext context)
   at async ValueTask<object> Scriban.TemplateContext.EvaluateAsync(ScriptNode scriptNode, bool aliasReturnedFunction) x 2
   at async ValueTask<object> Scriban.Syntax.ScriptExpressionStatement.EvaluateAsync(TemplateContext context)
   at async ValueTask<object> Scriban.TemplateContext.EvaluateAsync(ScriptNode scriptNode, bool aliasReturnedFunction) x 2
   at async ValueTask<object> Scriban.Syntax.ScriptBlockStatement.EvaluateAsync(TemplateContext context)
   at async ValueTask<object> Scriban.TemplateContext.EvaluateAsync(ScriptNode scriptNode, bool aliasReturnedFunction) x 2
   at async ValueTask<object> Scriban.Syntax.ScriptPage.EvaluateAsync(TemplateContext context)
   at async ValueTask<object> Scriban.TemplateContext.EvaluateAsync(ScriptNode scriptNode, bool aliasReturnedFunction) x 2
   at async ValueTask<object> Scriban.Template.EvaluateAndRenderAsync(TemplateContext context, bool render)
   at async ValueTask<string> Scriban.Template.RenderAsync(TemplateContext context)
   at async void Pulumi.Deployment+Runner+<>c__DisplayClass10_0.<WhileRunningAsync>g__HandleCompletion|0(?)+HandleCompletion(?) in /_/sdk/dotnet/Pulumi/Deployment/Deployment.Runner.cs:line 137
   at async Task<int> Pulumi.Deployment+Runner.WhileRunningAsync() in /_/sdk/dotnet/Pulumi/Deployment/Deployment.Runner.cs:line 177
 
Resources:
    - delete 61
    28 unchanged
 
Duration: 13s
Preview completed AFAICT
It still shows a bunch of deletes queued up as a result. Would the same happen if I ran up instead of preview? I guess I'll have to try an actual update and see what happens, once I can work out why it now hangs.

bored-oyster-3147

06/16/2021, 7:20 PM
are you using a pre-release version?

worried-city-86458

06/16/2021, 7:21 PM
Yes, latest alpha.
Basically, I ran preview with the forced error, then ran preview again, and that second run has hung. It's still spinning 25m later.

bored-oyster-3147

06/16/2021, 7:24 PM
I would like to verify that the latest alpha includes the changeset that I made
It says it went out 2 hours ago but I don't know what commit it was built from

worried-city-86458

06/16/2021, 7:26 PM
I can navigate to source and see your changes

bored-oyster-3147

06/16/2021, 7:29 PM
well I'm not sure why it would be hanging, especially on preview
was this an existing stack? Did you by chance already have pending deletes in it?

worried-city-86458

06/16/2021, 7:31 PM
It was an existing stack that was fully baked. All I did was tweak it to force an error.

bored-oyster-3147

06/16/2021, 7:32 PM
well if you can get a repro let me know

worried-city-86458

06/16/2021, 7:32 PM
Still hanging. Strange it hasn't timed out.
Had to kill the debug session:
Changes:
 
    Type                         Name                 Operation
>   pulumi:pulumi:StackReference pharos/aws-eks/alpha read
 
Diagnostics:
 
pharos/k8s/alpha (pulumi:pulumi:Stack)
error: transport is closing
 
Resources:
    21 unchanged
 
Duration: 35m27s
@bored-oyster-3147 debugging into it, it seems the exception is never thrown
i.e. it returns a preview result

bored-oyster-3147

06/16/2021, 7:50 PM
what does LanguageRuntimeService look like?
that implies there was no CommandException

worried-city-86458

06/16/2021, 8:06 PM
@bored-oyster-3147 _callerContext.ExceptionDispatchInfo is null
Want to do a quick interactive session to debug it while I share my screen in slack?

bored-oyster-3147

06/16/2021, 8:10 PM
I'm busy at the moment - is that EDI instance null? that would cause an exception to not be thrown
do you by chance have anything in your inline program that would cause the exception to not bubble out of it?

worried-city-86458

06/16/2021, 8:18 PM
No, not catching at that scope.

bored-oyster-3147

06/16/2021, 8:21 PM
but the EDI instance is still null?

worried-city-86458

06/16/2021, 8:30 PM
Deployment.Runner.WhileRunningAsync.HandleCompletion sees the exception, which is Scriban.Syntax.ScriptRuntimeException
Calls HandleExceptionAsync which logs it and returns 32
Deployment.RunInlineAsync then has null exceptionDispatchInfo and returns 1 in lambda, returns null at end
LanguageRuntimeService.Run then returns new RunResponse()
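A minimal sketch of the pattern traced above, using only the .NET base class library. It is not the Pulumi SDK code: `SwallowSketch`, `RunAndSwallowAsync`, and `HostAsync` are hypothetical stand-ins, and the exit code 32 is borrowed from the trace purely for illustration. It shows how a runner that catches, logs, and returns an exit code leaves the host with a null ExceptionDispatchInfo.

```csharp
#nullable enable
using System;
using System.Runtime.ExceptionServices;
using System.Threading.Tasks;

// Hypothetical stand-ins; these are NOT the Pulumi SDK types.
public static class SwallowSketch
{
    // Mimics a runner that logs a failure and reports it only via an exit code.
    public static async Task<int> RunAndSwallowAsync(Func<Task> program)
    {
        try
        {
            await program();
            return 0;
        }
        catch (Exception ex)
        {
            Console.Error.WriteLine($"logged: {ex.Message}");
            return 32; // the exception itself stops here
        }
    }

    // Mimics a host that only reports a failure when an exception reaches it.
    public static async Task<ExceptionDispatchInfo?> HostAsync(Func<Task> program)
    {
        ExceptionDispatchInfo? captured = null;
        try
        {
            // The exit code is ignored here, which is the gap being illustrated.
            await RunAndSwallowAsync(program);
        }
        catch (Exception ex)
        {
            // Never reached, because the runner already handled the exception.
            captured = ExceptionDispatchInfo.Capture(ex);
        }
        return captured;
    }

    public static async Task Main()
    {
        var edi = await HostAsync(
            () => throw new InvalidOperationException("variable `envName` was not found"));
        // Prints "host saw no exception".
        Console.WriteLine(edi is null ? "host saw no exception" : "host saw an exception");
    }
}
```

In this sketch the failure is visible only in the log line and in the discarded return value, so a caller that checks just the captured exception treats the run as successful.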

bored-oyster-3147

06/16/2021, 8:32 PM
what is different about this exception
is it thrown in an apply or something?
w

worried-city-86458

06/16/2021, 8:35 PM
It does happen inside Output.Create

bored-oyster-3147

06/16/2021, 8:36 PM
and are you using an inline program delegate or the generic TStack?

worried-city-86458

06/16/2021, 8:36 PM
// gateways
new ConfigGroup("internal-gateway",
    new ConfigGroupArgs { Yaml = RenderTemplate("InternalGateway.yaml", ReadResource, new { Aws = AwsConfig }) },
    new ComponentResourceOptions { Provider = k8sProvider });
new ConfigGroup("internet-gateway",
    new ConfigGroupArgs { Yaml = RenderTemplate("InternetGateway.yaml", ReadResource, new { Aws = AwsConfig }) },
    new ComponentResourceOptions { Provider = k8sProvider });
I'm deriving from Pulumi.Stack (not directly)
ConfigGroupArgs.Yaml is InputList<string>
RenderTemplate returns Output<string>

bored-oyster-3147

06/16/2021, 8:38 PM
so you are using PulumiFn.Create<TStack>?

worried-city-86458

06/16/2021, 8:39 PM
Yes...
PulumiFn Create(IServiceProvider serviceProvider, Type stackType)
var stackName = $"{Config.Pulumi.Organization.Name}/{settings.Environment.ToLower()}";
var stackArgs = new InlineProgramArgs(info.ProjectName, stackName, PulumiFn.Create(ServiceProvider, info.StackType))
{
    Logger = LoggerFactory.CreateLogger<Pulumi.Deployment>()
};
var stack = await LocalWorkspace.CreateOrSelectStackAsync(stackArgs);

bored-oyster-3147

06/16/2021, 8:40 PM
so not PulumiFn.Create<TStack>
ok just making sure I'm looking in the right place

worried-city-86458

06/16/2021, 8:41 PM
No, the stack type is selected based on the "resources" to deploy, so not using the generic method.

bored-oyster-3147

06/16/2021, 9:02 PM
OK I have a failing test
👍 1

worried-city-86458

06/16/2021, 9:45 PM
My gut feel is that Pulumi.Deployment.Runner.RunAsync and WhileRunningAsync are swallowing the exception when they should not: https://github.com/pulumi/pulumi/blob/258fb00bc2ecbd489af6d694a2204468cc7ca729/sdk/dotnet/Pulumi/Deployment/Deployment.Runner.cs#L61-L64 https://github.com/pulumi/pulumi/blob/258fb00bc2ecbd489af6d694a2204468cc7ca729/sdk/dotnet/Pulumi/Deployment/Deployment.Runner.cs#L179-L183
i.e. the try/catch should be removed or should rethrow. (They do log the exception, so maybe not removed.)
Then any immediate exceptions in the stack ctor, or deferred via outputs, should propagate and be captured.

bored-oyster-3147

06/16/2021, 9:51 PM
yes exceptions in the in-flight tasks are being swallowed
And finally LanguageRuntimeService.Run, where it would no longer be null and so would return the "bail" response: https://github.com/pulumi/pulumi/blob/258fb00bc2ecbd489af6d694a2204468cc7ca729/sdk/dotnet/Pulumi.Automation/Runtime/LanguageRuntimeService.cs#L61
The bail response only sets the exception message, so the inner exception handler should still be called to log the full exception and rethrow.

bored-oyster-3147

06/16/2021, 10:17 PM
I went about it a little differently. Namely because IRunner is used by local programs too, and rethrowing in IRunner would cause other issues there

worried-city-86458

06/16/2021, 10:18 PM
Sounds good. I think we understand it either way.

bored-oyster-3147

06/16/2021, 10:21 PM
Also because the main issue here was the exit code not being threaded through for inline programs. At the very least we should've seen a CommandException without an inline host exception, which was the first thing I fixed. Then I needed to do some work to capture an aggregate of in-flight exceptions so that the explicit exception could bubble up
what a pain in the ass though. Me not touching Deployment_Runner sooner coming back to bite me
Glad you caught that!
🍺 1
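The two parts of the fix described above (threading the exit code through, and keeping an aggregate of in-flight exceptions) could be sketched roughly as follows. This is an illustration under stated assumptions, not the merged change: `AggregateSketch`, `RunAsync`, and `Surface` are hypothetical names, the exit code 32 is illustrative, and `InvalidOperationException` merely stands in for whatever exception a real host would raise on a non-zero exit code.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-ins; these are NOT the Pulumi SDK types.
public static class AggregateSketch
{
    // Runs every piece of in-flight work, logging failures but also keeping them.
    public static async Task<(int ExitCode, AggregateException? Failure)> RunAsync(
        IEnumerable<Func<Task>> inFlight)
    {
        var failures = new List<Exception>();
        foreach (var work in inFlight)
        {
            try
            {
                await work();
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine($"logged: {ex.Message}");
                failures.Add(ex); // remembered instead of dropped
            }
        }

        if (failures.Count == 0)
        {
            return (0, null);
        }
        return (32, new AggregateException(failures));
    }

    // Host side: prefer the explicit exception, fall back to the exit code.
    public static void Surface((int ExitCode, AggregateException? Failure) result)
    {
        if (result.Failure != null)
        {
            throw result.Failure;
        }
        if (result.ExitCode != 0)
        {
            throw new InvalidOperationException(
                $"program exited with code {result.ExitCode}");
        }
    }

    public static async Task Main()
    {
        var work = new List<Func<Task>>
        {
            () => Task.CompletedTask,
            () => Task.FromException(
                new InvalidOperationException("variable `envName` was not found")),
        };

        var result = await RunAsync(work);
        try
        {
            Surface(result);
        }
        catch (AggregateException ex)
        {
            // Prints "host saw 1 exception(s), exit code 32".
            Console.WriteLine(
                $"host saw {ex.InnerExceptions.Count} exception(s), exit code {result.ExitCode}");
        }
    }
}
```

Returning the failures next to the exit code keeps the runner itself from rethrowing, which matches the concern that rethrowing inside IRunner would affect local programs too.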

worried-city-86458

06/16/2021, 10:45 PM
FWIW, I like your fix. LGTM.
🙌 1

bored-oyster-3147

06/22/2021, 6:27 PM
That PR to fix the swallowed exceptions was merged btw!

worried-city-86458

06/22/2021, 11:35 PM
Yeah, I'm already using the latest alpha. Much better now!