:wave: Hello pulumi! I have landed here after bei...
# general
c
👋 Hello pulumi! I have landed here after being bruised and brutalized by cloudformation, terraform, and terragrunt… I’ve been trying to come up with an elegant way to deploy our fairly simple infrastructure across peer regions, and I’m hoping pulumi can help 😬
👍 3
w
👋 Welcome! I hope we are able to help! Are you encountering any issues so far or have specific questions? Some questions on what you are doing: What kind of infra are you trying to deploy? What is your multi-region strategy? Hot/hot, hot/cold, just an easy ability to redeploy to a new region as DR?
Thanks for giving Pulumi a try!
c
I don’t know what formal names my strategy might fall under, but I’m building out a service that cannot have a single point of failure (e.g. a region being degraded or down). So that means my regions are PEERS with no primary region (at this point, most of the internet has left the room). Deployments will take advantage of some canary stuff — i.e. drain traffic from one region, push out new code there, bring back traffic slowly and if the errors don’t blow the threshold push code to the other region. So far things are progressing… MRAPs got created ok, but some snags with global DDB tables and the encryption settings that I’m working through… the copilot suggestions don’t seem to reflect what’s going on 100%… but I’ll see if I can do a destroy/up cycle and get things back on track.
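The rollout order described above (drain one region, deploy, ramp traffic back, move on only if errors stay under the threshold) might be sketched roughly like this. Everything here is hypothetical: the `RolloutHooks` interface, the 1% threshold, and the ramp steps are illustrative assumptions, not Pulumi or AWS APIs, and draining the region again on failure is also an assumption rather than something stated in the thread.

```typescript
// Hypothetical hooks; a real setup would back these with DNS weights,
// a deploy pipeline, and a metrics query. None of this is a Pulumi/AWS API.
type Region = string;

interface RolloutHooks {
  setTrafficWeight(region: Region, percent: number): Promise<void>;
  deploy(region: Region, version: string): Promise<void>;
  errorRate(region: Region): Promise<number>; // fraction of failed requests, 0..1
}

// Roll a version out one peer region at a time.
// Returns true if every region was updated, false if the rollout stopped
// because a region exceeded the error threshold while ramping up.
async function canaryRollout(
  regions: Region[],
  version: string,
  hooks: RolloutHooks,
  errorThreshold: number = 0.01,          // assumed threshold
  rampSteps: number[] = [10, 50, 100],    // assumed ramp schedule (percent)
): Promise<boolean> {
  for (const region of regions) {
    await hooks.setTrafficWeight(region, 0); // drain traffic from this region
    await hooks.deploy(region, version);     // push out the new code there
    for (const percent of rampSteps) {
      await hooks.setTrafficWeight(region, percent); // bring traffic back slowly
      if ((await hooks.errorRate(region)) > errorThreshold) {
        // Assumption: drain the region again and stop; the untouched
        // peer regions keep serving the old version.
        await hooks.setTrafficWeight(region, 0);
        return false;
      }
    }
  }
  return true;
}
```

Because the regions are peers, the loop treats them identically; the array order only decides which region acts as the canary.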
w
Do you happen to know what your RPO and RTO are? If you have an RTO within minutes then I would imagine a hot/hot deployment is needed. Then you would be pretty easily able to load balance traffic between instances (in different regions or otherwise) during failovers or blue/green deployments. It sounds like you are already going down this path, which is good. As for how Pulumi helps: as IaC built on general-purpose languages rather than a DSL, we are quite a bit more flexible, which makes it easier to integrate dynamically configured deployments with easy-to-read code (yay real conditionals!). I believe we have some docs/material on blue/green style setups. I can try to find them if that would be helpful.
To really make the regions “peers”, you could configure your LB to also split traffic to them during normal operations if you wish. However, I’m not sure if there are advantages to that so long as automated failover works and there could be some disadvantages from a performance standpoint if users aren’t routed correctly.
c
I’ve been hashing this out for a while now and I think my approach is sound and correct — it might take longer than is practical to explain the reasons and caveats, but my POC worked perfectly under load. But getting a multi-region infra into an IaC solution has proven far more challenging than I expected (I’m a developer and not normally tasked with devops/SRE stuff). I don’t even know what RPO and RTO are.
w
Totally hear you, it's crazy how difficult it is! RPO and RTO are terms that help put requirements in place for your disaster recovery process (why you would want to be multi-region outside of location and scale reasons). RPO = recovery point objective, which is "how much data loss are we willing to accept", and it drives how often you need to do things like snapshot the DB. RTO = recovery time objective, which is "how long are we allowed to be down for", which drives how you set up your failover processes. It's cheaper to just run one instance, so if your RTO is 1+ days then it's probably fine to just do that, as you can get someone to manually fix things in that window. If your RTO is 1 min, then you need a hot/hot multi-region setup because you can't even wait for the other region to get deployed to or to spin up a new instance. Figuring out how to actually set up the infra to meet those requirements is already difficult, with a lot of concerns (data replication, networking, session stickiness, etc). Addressing how to then get it into IaC adds some more up-front complexity but pays a lot of dividends when you want to continue iterating on that infra without breaking things.
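As a rough illustration of those two rules of thumb, here is a small sketch. The function and type names are hypothetical, and the "case-by-case" band for RTOs between one minute and one day is an assumption: the thread only gives the two ends of the range.

```typescript
type Strategy = "single-instance" | "case-by-case" | "hot-hot";

const MINUTE = 60;              // seconds
const DAY = 24 * 60 * MINUTE;   // seconds

// Map a recovery time objective (in seconds) to a deployment strategy.
function strategyForRto(rtoSeconds: number): Strategy {
  if (rtoSeconds >= DAY) return "single-instance"; // someone can fix it by hand in that window
  if (rtoSeconds <= MINUTE) return "hot-hot";      // no time to deploy or spin anything up
  return "case-by-case";                           // assumption: the thread does not cover this band
}

// With periodic snapshots, the worst-case data loss is roughly one snapshot
// interval, so the interval must not exceed the recovery point objective.
function snapshotIntervalMeetsRpo(intervalSeconds: number, rpoSeconds: number): boolean {
  return intervalSeconds <= rpoSeconds;
}
```

For example, an hourly snapshot does not satisfy a 15-minute RPO, while a 5-minute snapshot does.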
Is there a specific part of getting your solution into IaC you are struggling with?
c
Oh man… I could uncork pretty hard on how bad terraform is at abstracting regions. Terragrunt offered some simplifications but it really totally failed to solve what were the biggest pain points IMO. As a developer, I still cannot believe that those are The Tools™. Egregiously bad experience trying to use them.
w
and now you know why we created Pulumi!
TF and other DSL-based tools are great, but they keep needing to add all these odd "tricks" (like using count = isTrue ? 1 : 0) to handle more complex setups like multi-region. Starting with a general-purpose language means you have a lot more flexibility at your hands.
I find it somewhat similar to how things evolved on the FE. Beforehand there was all this special syntax to do things such as produce an array of HTML elements from an array of data objects. What ppl really wanted was to just use `map` and `forEach` and other functions they already know from writing JS. Now most of the popular FE frameworks work that way, but it didn't start that way (looking at you Angular).
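To make the contrast with the `count = isTrue ? 1 : 0` trick concrete, here is a minimal sketch in plain TypeScript. The `BucketSpec` shape, the bucket names, and the `enableAccessLogs` flag are hypothetical stand-ins for real resources; this models the idea with plain data and is not Pulumi's API.

```typescript
// Hypothetical plain-data stand-in for a resource declaration.
interface BucketSpec {
  name: string;
  region: string;
  replicateTo: string[]; // the peer regions this bucket replicates to
}

function planBuckets(regions: string[], enableAccessLogs: boolean): BucketSpec[] {
  // One data bucket per region, each replicating to every peer: just Array.map.
  const specs: BucketSpec[] = regions.map((region) => ({
    name: `app-data-${region}`,
    region,
    replicateTo: regions.filter((peer) => peer !== region),
  }));

  // An optional resource is an ordinary `if`, not a count of 0 or 1.
  if (enableAccessLogs) {
    regions.forEach((region) => {
      specs.push({ name: `access-logs-${region}`, region, replicateTo: [] });
    });
  }
  return specs;
}

// Example: two peer regions (assumed names), no access-log buckets.
const plan = planBuckets(["us-east-1", "us-west-2"], false);
```

Adding a third peer region is then a one-element change to the input array rather than another copied block of configuration.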
c
Yeah, I can REALLY see why you created this. And THANK YOU. So far this morning I’ve gotten further in 3 hours than I did in a week and a half of terragrunt…
🙌 1
🚀 1