# general
c
We'd like to use the official PostgreSQL provider targeting a PG instance within a Kubernetes cluster -- an instance which is itself managed with the Pulumi Kubernetes provider. We'd like to use on-the-fly Kubernetes port forwarding for such a scenario. How would one go about doing this within the same stack?
• We're exploring creating a custom provider "based on" the existing one. In simplistic OOP terms we'd "override"/wrap provider methods to set up the port forwarding before the "base class" call and tear it down afterwards. We're using TypeScript. However, it's unclear if there is any SDK to target another resource provider directly, or if the relevant gRPC interface is meant to be used by any client other than the Pulumi engine.
• We could use different stacks and leverage the Pulumi automation API (we're already using it for many other scenarios). However, in this case the stack boundary doesn't feel right/natural in our architecture. Do you think the scenario is common enough to be addressed in the way we want (one stack), or should we work harder to prove ourselves wrong on this point (justify multiple stacks)?
• We could open up direct/public access to the PG instance. It's likely possible to do it sufficiently safely, but again it doesn't feel like we'd be doing it for the right reasons. Could we be wrong about that?
Thanks for reading, any input/opinions would be greatly appreciated.
l
I'm not sure what language you're in, but this sounds similar to a use case I have, though I'm using AWS SSM instead of K8s port forwarding. But I think the solution is the same.
Which is essentially to bring up the tunnel you need "mid Pulumi script".
For SSM this is quite ugly -- basically spawn a process, control the port it picks, and then use that hostname for subsequent connections. But for K8s you might be able to keep it in-process if the SDK lets you port-forward directly. If that makes sense?
Snippet showing the general idea for SSM, in TypeScript/Node:
// spawn is from Node's child_process.
import { spawn } from "child_process";

// Normal Pulumi code setting up the bastion (in your case the K8s cluster)
// ...

// Spawn a tunnel using the details from the setup. bastion is an EC2
// instance with the SSM agent in my case. I return the same resource to
// appease the type checker/have something I can dependsOn later if need be.
const bastionTunnel = bastion.id.apply(id => {
  spawn(
    "aws",
    [
      "--profile",
      config.aws.profile,
      "--region",
      config.aws.region,
      "ssm",
      "start-session",
      "--target",
      id,
      "--document-name",
      "AWS-StartPortForwardingSessionToRemoteHost",
      "--parameters",
      `{"host":["my-database-host"],"portNumber":["5432"],"localPortNumber":["9876"]}`,
    ],
  );

  return bastion;
});

// Pulumi code that's going to set up the database. Pass in localhost:9876.
// Of course you would likely not hardcode the port, but find a free one,
// then pass the URL in as a variable or something.
new pg.Database("foo", { host: "localhost:9876", ... });
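For the K8s in-process variant I mentioned, something like this might work with the @kubernetes/client-node package (a sketch only -- untested, and the namespace/pod name are placeholders):

import * as k8s from "@kubernetes/client-node";
import * as net from "net";

const kc = new k8s.KubeConfig();
kc.loadFromDefault();
const forward = new k8s.PortForward(kc);

// Accept local connections on 9876 and pipe each one through to port 5432
// on the target pod -- the in-process equivalent of `kubectl port-forward`.
const server = net.createServer(socket => {
  forward.portForward("default", "postgres-0", [5432], socket, null, socket);
});
server.listen(9876, "127.0.0.1");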
c
@lively-crayon-44649
I'm not sure what language you're in
(TypeScript)
Which is essentially to bring up the tunnel you need "mid Pulumi script".
Indeed that sounds very much like what we need -- what extension points/hooks did you use for that? (We couldn't think of anything other than the provider API, but we're not married to that idea for sure.) For reference, could you expand on what you mean by "Pulumi script"? I first imagined you meant the "up" process, but maybe you mean the declarative program execution? (Sounds like you indeed mean the Pulumi program execution from your last message -- digesting.)
l
Also I think local command's run will be better, but I've not actually deployed this to prod yet -- just something I was playing around with that seems to work
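Roughly what I have in mind with the @pulumi/command package (a sketch; the kubectl invocation and names are assumptions, and the backgrounding relies on a POSIX shell):

import { local } from "@pulumi/command";

// A Command resource runs its `create` during `up`; other resources can
// dependsOn it, which orders the tunnel before anything that needs it.
// (local.run, the invoke form, executes during program evaluation instead.)
// The forward is backgrounded so the create command itself can return.
const tunnel = new local.Command("pg-tunnel", {
  create: "kubectl port-forward svc/postgres 9876:5432 >/dev/null 2>&1 & sleep 2",
});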
c
OK this is kind of amazing to me (thanks for taking the time to dig up your code BTW). Is there really such a guarantee that the provider will run synchronously "during" instantiation of Pulumi resources? I initially thought the full resource graph was created and only then passed in one go for execution of the providers.
l
Well what's going to happen (n00b understanding) is that Pulumi runs the code to collect the resource definitions. But by using apply or a local command you are encoding a dependency between later resources and the tunnel you set up.
So it will establish the tunnel, then go on to traverse later resources (that may use it). That said, I don't think this code is perfect by a long shot -- for instance, in my case I don't think I can "know" the tunnel is set up before continuing, because a synchronous spawn will block until process exit (which won't happen, since the tunnel is long-running) and an asynchronous one will return immediately. So see how you get on, I guess.
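One way to shore that up might be to spawn asynchronously and then poll the local port until it accepts a connection before resolving the apply -- a sketch (the port number matches the snippet above):

import * as net from "net";

// Resolve once something is listening on the local port; retry every 500ms
// until a timeout. Call this after spawning the tunnel process.
function waitForPort(port: number, timeoutMs = 30000): Promise<void> {
  return new Promise((resolve, reject) => {
    const deadline = Date.now() + timeoutMs;
    const attempt = () => {
      const socket = net.connect(port, "127.0.0.1", () => {
        socket.end();
        resolve();
      });
      socket.on("error", () => {
        socket.destroy();
        if (Date.now() > deadline) reject(new Error(`port ${port} not ready`));
        else setTimeout(attempt, 500);
      });
    };
    attempt();
  });
}

// Usage inside the apply (which can be async):
//   spawn("aws", [...]);      // asynchronous, returns immediately
//   await waitForPort(9876);  // wait until the tunnel is usable
//   return bastion;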
c
OK I think I gotcha yes -- probably not an explicit dependency but rather a temporal one due to the fact that it blocks the process running the Pulumi program (which in turn ensures that it will be up by the time the "up" process runs). This begs the question: have you thought about how to tear the tunnel down yet? I suspect that if you tear it down after instantiating the relevant resources in the Pulumi program, the approach would fail (i.e. it will actually be created and torn down before Pulumi even attempts to create the resources -- I might be very wrong and sort of hope that I am.)
l
I think you are wrong, since Pulumi runs the code twice (I think -- again, not an expert by a country mile)
Once for preview, once for up
But if that's false -- I would cheat
Have the tunnel be torn down by virtue of process exit
(Or at least, I'd try that 😂 It might not work)
c
Thanks for the link to the run package BTW, I did not know about it.
l
That's not me actually -- I asked my question a couple of weeks ago and was lucky to receive help from Josh https://pulumi-community.slack.com/archives/C84L4E3N1/p1662470025640619
(FWIW I don't think you want remote commands, but they are also potentially useful to know about)
c
Cheers for the ref -- and hey spreading the knowledge is very appreciated in and of itself!
(FWIW I don't think you want remote commands, but they are also potentially useful to know about)
I do think the same yeah; best used with moderation.
I'm processing ideas in https://github.com/pulumi/pulumi-kubernetes/issues/1857 BTW. It does sound like OP didn't have a solution for tearing down the connection/tunnel either.
The tunnel is still open after pulumi up, and it will start a new one during the Previewing update stage.
Representing the connection/tunnel as a resource which other resources can depend on is an interesting approach too, but I can't imagine that there would be a way to easily destroy it when necessary. I'll also consider your idea of cheating a little bit (e.g. try to register a hook with the runtime, nodejs in our case, to be called on process exit, which could just work). If you ever try out an approach feel free to ping in this thread (even a long time from now), I'd be happy to hear about it. And if anyone has any input for a... erh, "_clean_" solution to setup+teardown, please chime in.
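For reference, the exit-hook idea could be as small as this (a sketch -- "exit" handlers must be synchronous, and they won't fire on SIGKILL):

import { spawn, ChildProcess } from "child_process";

const tunnels: ChildProcess[] = [];

function openTunnel(cmd: string, args: string[]): ChildProcess {
  const child = spawn(cmd, args, { stdio: "ignore" });
  tunnels.push(child);
  return child;
}

// Kill any tunnels still running when the Node process hosting the Pulumi
// program exits.
process.on("exit", () => {
  for (const t of tunnels) t.kill();
});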
l
Adding such custom tasks to the Pulumi resource lifecycle is something we still have in our backlog: https://github.com/pulumi/pulumi/issues/9397 Do upvote if you are interested in having this.
c
@limited-rainbow-51650 Thanks for that reference, indeed that sounds very useful. We're using the Pulumi automation API in the present scenario though, so we actually already do have good control over what to run before or after the up call. I think our tricky issue comes from the fact that we can only establish this "port forward" connection to the target resource (Postgres pod) after it is created -- it does not exist yet before the up call. As such, I don't know that the hooks proposed there would solve the scenario. (I suspect an XY problem https://en.wikipedia.org/wiki/XY_problem, probably related to our desire to keep both the creation of the Postgres service and its databases in the same stack, as alluded to in my original message, but I am having a hard time convincing myself that this is such a bad idea. Implicit resource dependency management via Pulumi inputs and outputs is awesome. In contrast, orchestrating the setup of multiple stacks in the correct order feels like going in the wrong direction. It's a recurring issue we have with multiple internal services which need bootstrapping once they're up, so we're paying reasonable attention to how we solve this, given we'll probably use the same strategy many times.)
l
If you are using Pulumi Automation @colossal-caravan-76991 then you might find https://github.com/clstokes/pulumi-automation-sdk-ssh-tunnel useful
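The same shape with a K8s port-forward instead of SSH might look like this (a sketch; stack/service names are placeholders, and a readiness wait like the waitForPort sketch above would help before calling up):

import { LocalWorkspace } from "@pulumi/pulumi/automation";
import { spawn } from "child_process";

async function deployDatabases(): Promise<void> {
  const stack = await LocalWorkspace.selectStack({
    stackName: "dev",
    workDir: "./databases",
  });

  // Forward the already-running Postgres service for the duration of the
  // update, then tear the tunnel down regardless of the outcome.
  const tunnel = spawn("kubectl", ["port-forward", "svc/postgres", "9876:5432"]);
  try {
    await stack.up({ onOutput: console.log });
  } finally {
    tunnel.kill();
  }
}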
c
@lively-crayon-44649 Thanks for that sample! It's indeed an approach we're considering, but given the scenario it would require us to split into multiple oddly-delimited stacks (the conundrum explained above). We might just do that in the end if we have no other choice. Meanwhile we've decided to explore using resources like the Command resource you've pointed us to in order to establish and close the port forwards/tunnels before and after the target resources, using Pulumi dependencies to order that. Seeing such procedural resources in a declarative model doesn't feel quite right, but the Command resource gives us some confidence that this may currently be unavoidable in certain edge cases. We hope to maybe think up a feature proposal for Pulumi (perhaps around language-agnostic resource provider extension/inter-calls) or, should anything like that be considered out-of-scope for the project, then perhaps see if the resource providers would adopt some conventions around generic ways to proxy connections when relevant (perhaps basic SOCKS support or something else). Either of those would yield back a fully declarative model, which seems better in the long run.
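For anyone finding this later, the Command-based shape we're exploring looks roughly like this (a sketch under several assumptions: a POSIX shell, a writable /tmp, and placeholder names/credentials):

import * as command from "@pulumi/command";
import * as postgresql from "@pulumi/postgresql";

// Open the forward in the background and record its PID for teardown.
const openTunnel = new command.local.Command("open-pg-tunnel", {
  create: "kubectl port-forward svc/postgres 9876:5432 >/dev/null 2>&1 & echo $! > /tmp/pg-tunnel.pid; sleep 2",
});

// Point the PostgreSQL provider at the local end of the forward.
const provider = new postgresql.Provider("pg", {
  host: "localhost",
  port: 9876,
  username: "postgres",
  password: "placeholder", // placeholder -- source from config/secrets
}, { dependsOn: [openTunnel] });

const db = new postgresql.Database("app", {}, { provider });

// A second Command, ordered after the database resources, closes the
// forward within the same `up`.
new command.local.Command("close-pg-tunnel", {
  create: "kill $(cat /tmp/pg-tunnel.pid) || true",
}, { dependsOn: [db] });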