<@UQ8K13T7X> <@UB39JFCKC> <@U018CPPTTAQ> alright t...
# contribute
b
@lemon-agent-27707 @tall-librarian-49374 @tall-needle-56640 alright the stack lifecycle operations are now implemented here . Would like to see some tests written but I'm not sure how you guys run tests for pulumi stacks? Anything else you'd like to see in this PR before publishing it for review? I believe it has everything except the: • workspace cancel/import/export • remote program support
l
b
alright ill work on fleshing out some tests before publishing
OK I have local workspace tests written. All are passing with the exception of the inline program test - the issue I am running into is with the HTTPS handshake. I am generating a self signed cert but it is still failing. Any idea what pulumi CLI is expecting from the cert? Any specific certificate extensions or values?
I'm checking out the
grpc-dotnet
repo to see if I can find an equivalent to the
grpc.credentials.createInsecure()
call we are making in the type script implementation but I don't think there is an equivalent.
Alright nevermind it is running with HTTP. Idk if that is desired but can return to it if not. @tall-needle-56640 there are unit tests now if you'd like to pull and take another look. Currently the only test that is failing is the inline program test, called
LocalWorkspaceTests.StackLifecycleInlineProgram
. It is failing I think because of something in
Deployment_Inline.cs
. The exception is as follows:
Note that I ran into issues running these unit tests locally on my workstation because I had some lingering pulumi credentials pointing to a custom backend. In order to replicate what the pulumi team is doing in their CI/CD you will want to make sure that the only credentials you have setup are credentials that are pointing to a pulumi cloud account.
The issue is that
Deployment.RunAsync
actually creates another instance of
Deployment
using a different constructor than
Deployment_Inline
and it is expecting these environment variables. We could just set those but that doesn't seem like the right way to go about it... something is very off to me that another
Deployment
needs to be constructed
OK I was able to get around that by slightly modifying our Deployment entrypoint. Currently we have the inline engine running but for some reason when the
PulumiFn
runs and instantiates an instance of
Pulumi.Config
that config object does not contain values that we had just saved to config. So the inline program is returning 1 output instead of 3. I'm pausing here I have to work on something else today.
l
Runtime options need to be set manually within the language host gRPC server when a request comes in. You'll likely need something similar to this: https://github.com/pulumi/pulumi/blob/master/sdk/nodejs/x/automation/server.ts#L87-L94
t
Also,
Deployment.RunAsync
runs with a singleton Deployment object. The new constructor in
Deplotment_Inline
is to move away from singletons.
b
@tall-needle-56640 well that didn't work, because the nested
CreateRunner()
method creates a new
Deployment
instance and that wasn't being passed the
RuntimeSettings
. It is now working because the first instance of
Deployment
that we create is the instance that is used for the Runner. @lemon-agent-27707 I think we are getting the runtime options where they need to be. But the file you linked does do something with
runtime.setAllConfig
that I think is our missing piece. Not sure what that looks like in dotnet
l
Adding some internal code to be able to set
AllConfig
would be a start.
b
OK all tests are passing right now with the singleton setup. Obviously that has concurrency issues and wouldn't work with a web API. The biggest issue right now with moving away from the singleton is this:
t
My code handled that. Why did you remove it?
That seems to also be what broke the config.
b
if that is the case, I am sorry. but I don't think that is the case. I did not remove anything that you had working. If we look at your merge: https://github.com/pulumi/pulumi/pull/5761/files/32a13a027e96967954de67b54d098a99b8ef0cf9..902ba1d8f1f51d6961bc234cd6e8d4b44973cf97 we are not doing any config settings here. we are also instantiating a Deployment and then calling RunInstanceAsync which calls the static implementation of RunAsync, which then called CreateRunner, which created a new Deployment (that didn't receive runtime settings), which created a new DeploymentInstance, which continued to use the static singleton.
t
I'm still trying to understand all the changes, but in this code you can see that we had an instance of
Deployment
to work with, but it was replaced by a static call that uses the singleton Deployment instance.
b
Copy code
private static IRunner CreateInlineRunner(RuntimeSettings? settings)
        {
            // Serilog.Log.Logger = new LoggerConfiguration().MinimumLevel.Debug().WriteTo.Console().CreateLogger();

            Serilog.Log.Debug("Deployment.RunInline called.");
            lock (_instanceLock)
            {
                if (_instance != null)
                    throw new NotSupportedException("Deployment.Run can only be called a single time.");

                Serilog.Log.Debug("Creating new inline Deployment.");
                var deployment = new Deployment(settings);
                Instance = new DeploymentInstance(deployment);
                return deployment._runner;
            }
        }
this code is code that I added when I was experimenting with passing the runtime settings through to the deployment that is used to create the isntnace. that is not code that was part of your PR
originally we were just using the same code that was always used to create the Runner. which used the static instance
t
Right, the referenced changes aren't from my PR, they show the changes from my PR.
The new code does not create a Runner, it uses the same runner from the same deployment instance.
I have pushed up changes of it working now - all tests passing - but it is using the singleton functionality. So from here we can work on trying to figure out how to deal with that Singleton. The problem is that the objects that access the static singleton
DeploymentInstance
in the
PulumiFn
- namely
ComponentResources
and
Resources
, are exposed types that are instantiated by the consumer. So they can't very well pass in a
DeploymentInstance
themselves. So we need to figure out a way for those types to access their instance.
t
But that's exactly what I'm talking about. Here's my code that avoids the singleton issue:
Copy code
var deployment = new Deployment(runInfo);
await deployment.RunInstanceAsync(() => _program());
And here's the new code that reintroduces it:
Copy code
await Deployment.RunInlineAsync(settings, this._program);
b
my whole point was that this code:
Copy code
await deployment.RunInstanceAsync(() => _program());
never avoided the singleton issue.
the singleton is not the Deployment itself.
t
How did it not?
b
the singleton that is a problem, is the static property
Instance
on
Deployment
both of those methods you post two comments up still end up referencing
Deployment.Instance
throughout. The singleton problem remains. Me changing it to
RunInlineAsync
was just a way for me to pass
RuntimeSettings
to the
Deployment
constructor.
Did you see how this code:
Copy code
var deployment = new Deployment(runInfo);
await deployment.RunInstanceAsync(() => _program());
Eventually reaches this code: https://github.com/orionstudt/pulumi/blob/040bb710df2b2f5be4557582d993270e46937be8/sdk/dotnet/Pulumi/Deployment/Deployment_Run.cs#L150 Does that make sense?
The problem with that was that we needed this line: https://github.com/orionstudt/pulumi/blob/040bb710df2b2f5be4557582d993270e46937be8/sdk/dotnet/Pulumi/Deployment/Deployment_Run.cs#L161 to be the one that passed
RuntimeSettings
through to
Deployment
. So all I did with my
RunInlineAsync
change was defer the construction of
Deployment
so that we were passing
RuntimeSettings
through at the right place.
unfortunately calls to
get Deployment.Instance
are littered throughout the existing exposed types so I think it will take a bit more work to move away from the singleton. We've made good progress though and your language runtime code was helpful. I'm sorry that was so confusing I'm sure I could've been more concise
t
Yeah, I thought I worked around that issue. I'm seeing the problem.
l
@tall-librarian-49374 can you comment on this. Working around global state would be ideal to enable parallel execution. For nodejs and python, the changes required to enable this are too invasive for the time being so we only allow one update per process for those SDKs. What would need to change in dotnet to enable multiple concurrent updates?
t
@bored-oyster-3147 Though doesn't
CreateInlineRunner
suffer from the same problem as
CreateRunner
? They still end up sharing a
Deployment
and a
Runner
. I think we're going to need to move away from the static methods and use something like this:
Copy code
internal Task<int> RunInstanceAsync(Action action)
    => RunAsync(() =>
    {
        action();
        return ImmutableDictionary<string, object?>.Empty;
    });

internal Task<int> RunInstanceAsync(Func<IDictionary<string, object?>> func)
    => RunInstanceAsync(() => Task.FromResult(func()));

internal Task<int> RunInstanceAsync(Func<Task<IDictionary<string, object?>>> func, StackOptions? options = null)
    => _runner.RunAsync(func, options);
b
We essentially need (in order to keep existing APIs) to have some kind of concurrent dictionary of deployment instances that we gate access to that the consumer constructed types can hit to get their instance. But even then we still have the problem of how do those types receiv whatever identifier indicates which instance is their own..
t
What would need to change in dotnet to enable multiple concurrent updates?
There’s a lot of reliance on shared state, currently. I haven’t looked at this in details since last April but it’s not trivial.
I guess 3.0 could be a good moment to take a stab on that
t
Even without parallel execution, we still have issues to resolve since A)`Deployment.RunAsync` get its settings from the environment, and B) Using a single instance of
Deployment
currently means that all deployments would share all settings. So maybe we don't address the global state right away, but we still need to make sure that there can be multiple
Deployment
instances, but that they can only run one at a time. Does that sound right?
b
Hmmm Dan I don’t think it is getting its settings from environment with the current language host implementation. Also currently the Deployment instance only lives as long as a single call to LanguageRuntimeService.Run. Can you maybe clarify what you mean with those 2 problems you listed? The only issue I see currently is the parallel execution
If we’re OK with a single executing deployment per process for now than I think this PR can be published
alright @lemon-agent-27707 & @tall-librarian-49374 I have marked this PR ready for review. It's just going into
pulumi/auto/dotnet
. I'm not entirely sure what is going on in the workflows that is causing it to fail, it seems unrelated to what I've done with the exception of the "update CHANGELOG.md" requirement. I can update that as part of this PR if you would like but I didn't know if I should since this PR isn't going into
master
.
t
Ah, I didn't see the second line:
Copy code
await Deployment.RunInlineAsync(settings, this._program);
Deployment.Instance = null!;
But doesn't that also introduce a race condition in
CreateRunner()
? See below (with logging removed):
Copy code
private static IRunner CreateRunner(Func<Deployment> deploymentFactory)
{
	lock (_instanceLock)
	{
		if (_instance != null)
			throw new NotSupportedException("Deployment.Run can only be called a single time.");

		var deployment = deploymentFactory();
		Instance = new DeploymentInstance(deployment);
		return deployment._runner;
	}
}
1. Thead A calls CreateRunner and gets the lock 2. Thread B calls CreateRunner and is blocked at the lock 3. Thread A releases the lock 4. Before thread A nulls out the instance variable, thread B gets the lock 5. Thread B throws an exception
b
hmmm well we also have the semaphore we are using in
LanguageRuntimeService
. I don't think the 2nd thread will be waiting in
CreateRunner
, it should be waiting in
LanguageRuntimeService
right?
t
There is locking in the
LanguageRuntimeService
, but this is a separate lock, so it should work correctly on its own.
b
The only entrypoint for Automation API is through
LanguageRuntimeService
. That lock does work correctly on its own when accessed from outside of the Automation API.
t
Yeah, I suppose that is true... It still smells, but I believe you're right.
b
I don't think there's a better way to do it as long as we have that static singleton - short of setting it to null inside the
lock
, but we can't
await
inside a
lock
block so I don't think we can get away with that. We need to set it to null at some point so that Up can be called again. I think the semaphore will suffice. At the moment though you made me realize it is a problem that
LanguageRuntimeService
is not a singleton. We need to instantiate it statically on the
XStack
object so that we are passing the same instance of it through to every web host we spin up, otherwise a subsequent call will get its own instance of
InlineLanguageHost
which will new up its own instance of
LanguageRuntimeService
.
or we just make the semaphore static, probably better
t
Oh OK. I has assumed there would only be one web host. Agreed on static semaphore
b
Wonder if it is worth experimenting with AsyncLocal or ThreadLocal or one of those types on the Deployment.Instance in order to achieve parallel execution..
Well I was right. I got parallel execution working using
AsyncLocal<T>
.
AsyncLocal<T>
is, as far as I can tell, the perfect use-case for us because we are setting
Deployment.Instance
once at the very top of asynchronous execution of
Deployment.RunAsync
and then never setting it again. I wrote a test for it and I believe this made the semaphore in
LanguageRuntimeService
obsolete so I removed that as well. The test is a bit verbose but what can ya do.
t
Doesn't
lock (_instanceLock)
still prevent parallel execution?
b
only for those 5 lines. Which is fine I think so we still have a single place where we are setting
Deployment.Instance
the point is the next thread won't throw on
_instance != null
because it will be null for the next thread
functionality remains the same for console apps
could probably just strip the instance locking out with no loss of functionality since our entrypoints into
Deployment
are all async so they should all get their own instance
verified that the instance locking is no longer necessary. Can check that in but I'm trying to pause commits so they can review when they get the chance
l
Sorry for being quiet on my end. Have had a busy week and am planning on reviewing the PR on Friday.
b
All good I am currently working on merging with current master and updating the PR to point to master so that we got the new workflows & so that it is g2g once you are able to review. I think I just got that working.
👍 1
l
Sorry, I did not have time to get through the full review today. This is at the top of my list for Tuesday morning (holiday weekend here)
👌 1
b
Same here. Enjoy the long weekend!