found an odd issue with AKS k8s clusters on azure....
# general
b
found an odd issue with AKS k8s clusters on azure.. posted an issue in pulumi-azure, but not sure if it's pulumi or the TF provider: https://github.com/pulumi/pulumi-azure/issues/182
c
@brave-angle-33257 a cluster that’s replaced is not updated in place.
It’s replaced in the same way that you’d replace a battery, say.
You replace it with a new one.
b
i think there are a couple things going on.. first, i dont think a tag change should do any replacing/creating/deleting, no ?
just change the tag
and if anything is "replaced", it should be there before and after the operation, right?
after the operation, it's not there anymore
so that's a delete in my opinion.. not a replacement
(but for a tag it should just do an update)
c
In that case, you’d have one resource definition that corresponds to two resources in azure, though, right?
That’s weird. Each `new Aks` is meant to correspond to a single cluster.
If you drill into `details`, it will show you precisely which operations are happening.
It will create replacement, replace, then delete the old thing.
b
ok.. so 2 things
#1 - should a tag change be doing a replacement cluster
c
Yeah, so for this, it might have to, yes.
b
#2 - the issue with it being deleted is probably because i'm setting the `name` property to be static, so it's not adding any random characters
c
replacement always results in a deletion of the old version.
setting `name` statically will just determine in which order this happens. If you set `name`, it gets deleted first; if you let pulumi name the object, the replacement is created first, and then the old cluster is deleted.
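(A minimal sketch of the ordering described above, as a toy model rather than Pulumi's engine code. The `replacementSteps` function and its `autoNamed` flag are hypothetical names made up for illustration, and the position of the `replace` step in the delete-first case is an assumption. Either way, the steady state should be exactly one cluster.)

```typescript
// Toy model of replacement ordering; not Pulumi's actual implementation.
type Step = "create-replacement" | "replace" | "delete-replaced";

// `autoNamed` is a hypothetical flag: true when the tool appends random
// characters to the physical name, false when `name` is set statically.
function replacementSteps(autoNamed: boolean): Step[] {
  if (autoNamed) {
    // Old and new resources have different physical names, so they can
    // coexist: create the replacement first, then delete the old one.
    return ["create-replacement", "replace", "delete-replaced"];
  }
  // With a static name the new resource would collide with the old one,
  // so the old one is deleted first and the new one is created afterwards.
  return ["delete-replaced", "replace", "create-replacement"];
}

// Print both orderings for comparison.
console.log(replacementSteps(true).join(" -> "));
console.log(replacementSteps(false).join(" -> "));
```

In both orderings a `create-replacement` step is present, which is why a replace that ends with no resource at all is unexpected.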
b
on my example you can see that my statically named cluster is no longer present after the "replacement"
c
Yes. That’s by design, right?
b
? if i replace a battery, as you mentioned
when im done, is there a battery ?
yes..
c
you throw it away lol
b
i think maybe we have different thoughts on what replace means
replace means to me, take out old one, throw it away
c
Ok, well, in any event, replace here does imply a deletion.
b
take new one, put it back into place of what you just threw away
replace yes implies deletion, but would also imply a "put back" right?
c
`new Aks(...)` should refer to one cluster in the steady state.
put back where?
b
Type                                         Name                                  Plan        Info
         pulumi:pulumi:Stack                          aks-backend-aks-backend-x-stage-eus2
     +-  ├─ azure:containerservice:KubernetesCluster  Backend-X-Stage-Eus2-Aks              replace     [diff: ~tags]
     +   └─ azure:role:Assignment                     b18a5cc9-3007-0006-0000-000000000000  create
after that runs, what would you expect the state of `Backend-X-Stage-Eus2-Aks` to be?
c
I would expect it to be a single AKS cluster.
b
so would i
c
A new one, built from the ground up.
b
its non-existent
there is no `Backend-X-Stage-Eus2-Aks` resource
c
Oh.
b
haha, yes
c
That’s not good.
Sorry, I totally misread this question.
my b.
b
no prob
i can see that maybe new tags require new resource, sure
but id still expect there to be a resource!
c
let me look more closely so I can make sure to answer this correctly…
Yep this is a bug.
b
ok no prob, I updated my issue on github with a new comment, i can see how it's not completely clear
c
we will fix very soon.
b
cool thanks! Maybe I'll ask azure wth a cluster needs to be torn down to change tags.. that seems weird also.. but yea i dont see an 'update' command on the az cli either for tags
but, seems like 2 diff issues
appreciate the help 👍
c
It could be many things: pulumi engine scheduling delete erroneously, TF doing something weird, or Azure doing something weird.
We’ve had bugs before that are similar, and ended up being in Azure. Like at some point Azure ASGs refused to boot up more than one Linux machine, but reported success…
b
yea.. azure seems to be full of weird gremlins
c
The tag thing, I’ll have to look into. TF usually does a pretty good job of saying when something has to be replaced.
b
let me just go to console and try to add a tag and see what happens
c
I was just about to do that lol
b
before:

https://s3-us-west-2.amazonaws.com/billeci-screenshots/Tags_-_Microsoft_Azure_2019-02-14_14-23-59.png

c
docs seem to indicate this is a `PATCH` and that would be odd if it was not in-place
alright I’ll add that context to the bug.
b

https://s3-us-west-2.amazonaws.com/billeci-screenshots/Tags_-_Microsoft_Azure_2019-02-14_14-24-54.png

doesn't seem to need a replace/delete etc
cool
i've hit a couple of these.. once I was told to go to TF github
another time was MS API github lol
c
Yeah, I would be very surprised if this required replacement, just going by the API spec.
`PATCH` request == replacement? I think not.
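(A rough sketch of the distinction being discussed: a provider decides between an in-place update and a replacement based on which properties changed. `planAction` and the `forceNew` set are hypothetical, and the property names in the example are placeholders, not the real schema of the AKS resource.)

```typescript
// Illustrative only: how a changed property might map to a planned action.
type Action = "same" | "update" | "replace";

// `forceNew` is a hypothetical set of properties that cannot be changed
// in place; any change to one of them forces a replacement.
function planAction(changed: string[], forceNew: ReadonlySet<string>): Action {
  if (changed.length === 0) {
    return "same";
  }
  return changed.some((prop) => forceNew.has(prop)) ? "replace" : "update";
}

// Placeholder property names, assumed for the example.
const forceNew = new Set<string>(["location", "resourceGroupName"]);

// If tags are patchable in place, a tag-only diff should plan an update.
console.log(planAction(["tags"], forceNew));
console.log(planAction(["tags", "location"], forceNew));
```

Under this model, a `[diff: ~tags]` that plans a `replace` would mean `tags` had been marked as forcing a new resource somewhere in the stack, which is what the `PATCH` docs argue against.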
b
and now one for pulumi! finally achieved quorum 🙂
yea
sorta would defeat purpose of a "tag" if it was required to rebuild the resource
c
indeed.
b
if it ends up being a TF thing, let me know if you'd like me to post it there
c
Depending how deep it is, we tend to try to just fix them and upstream them. But we’ll see.
We do this with helm charts too. If someone has a bad experience we just fix + upstream if we can.
b
cool.. ive been doing aws for years with cloudformation and some custom stuff that was very similar to pulumi
then moved to azure (new job) and used pulumi for a full rebuild
now starting into our k8s adventure on aks
we'll probably end up using pulumi for everything, infra/cluster deploys and job deployment if i have my way
c
That will be an adventure, I’m sure.
b
yes 😕 not a huge fan of azure so far
c
I started our kube stuff, I’ll be interested to hear what you think.
b
"undifferentiated heavy lifting" should be their motto
c
They know who they’re selling to. They know you can’t compete in the cloud unless you have every feature and it’s better to have every feature than a few good features.
b
cool, yea im the infrastructure guy, we have a devops person who is more k8s savvy, im working on the identity/bootstrap stuff right now
which is sort of a special hell in azure world compared to the elegance of iam at aws
now that i have an existing resource in place (destroy/up).. i shall get back to it
c
we have a secret page called Kubernetes the Prod Way: https://pulumi.io/quickstart/k8s-the-prod-way/architecture.html
it’s mostly focused on GCP and AWS for now, but eventually it will become a “real thing” and include Azure.
b
awesome, i'll have a look
thank you for that!
w
FYI - I just dropped a note on https://github.com/pulumi/pulumi-azure/issues/182 about the underlying problem here as well as a workaround.
b
cool thank you, I got the email and checked it out already 👍
I just remembered I owe you guys an issue post for the 16.9 keyvault azure
i'll do that now