I am setting up neo4j cluster using helm chart on ...
# general
s
I am setting up a neo4j cluster using a Helm chart on GKE with Pulumi. The Helm values set a node selector pointing neo4j at pool1, but the deployment errors out complaining that no node exists with those labels. I checked that the node selector label values are not the problem, since other deployments in Pulumi work with the same values, and the neo4j Helm setup without Pulumi also accepts the node selector. So I am not sure what is happening with Pulumi when it should behave the same. Here is the code gist and the error trace below.
error: Error: invocation of kubernetes:helm:template returned an error: failed to generate YAML for specified Helm chart: failed to create chart from template: execution error at (neo4j-cluster-core/templates/_helpers.tpl:73:16): No node exists in the cluster which has all the below labels (.Values.nodeSelector)
     %smap[cloud.google.com/gke-nodepool:primarynodes-f059c87]
        at Object.callback (workspace/code_play/typescript_projects/dev/gke-cluster/k8s/node_modules/@pulumi/runtime/invoke.ts:172:33)
        at Object.onReceiveStatus (/workspace/code_play/typescript_projects/dev/gke-cluster/k8s/node_modules/@grpc/grpc-js/src/client.ts:338:26)
        at Object.onReceiveStatus (/workspace/code_play/typescript_projects/dev/gke-cluster/k8s/node_modules/@grpc/grpc-js/src/client-interceptors.ts:426:34)
        at Object.onReceiveStatus (/workspace/code_play/typescript_projects/dev/gke-cluster/k8s/node_modules/@grpc/grpc-js/src/client-interceptors.ts:389:48)
        at /workspace/code_play/typescript_projects/dev/gke-cluster/k8s/node_modules/@grpc/grpc-js/src/call-stream.ts:276:24
        at processTicksAndRejections (node:internal/process/task_queues:77:11)
I have also tried transforms, which did not help either.
b
Where is the code for your cluster? You don’t seem to have a node pool with the right label on it
s
Here is the code for the GKE cluster provisioning; it is pretty standard and provisions two node pools.
I am able to set the node selector in the Helm values without Pulumi, and the node pool label works with a Pulumi deployment, so it is only the Helm setup above that gets the node pool error.
b
You’re not adding labels to the nodes though
So when deploying with helm and not Pulumi the pod probably never gets scheduled
Pulumi is just telling you about it
s
Labels get assigned by GKE when the node pool is created, and they are visible when we run kubectl get nodes --show-labels. When deploying with Helm and not Pulumi, the chart deploys with the node selector and no errors.
Is there any reason why this should error out with Pulumi even though the labels are assigned to the node? I can also use similar labels for a deployment with Pulumi, and that works.
b
You need to check the event logs. Pulumi is just calling the API
s
Will check the event logs. Also, if the Helm chart sets up a secret that is not uniquely named, how do I avoid the duplicate resource issue? Should I use transforms or some other option?
d
The chart needs Tiller, which pulumi stubs. Use helm.Release instead
You should also define your own labels instead of relying on the pool name. I found pulumi to be pretty aggressive when it comes to changing settings, in that it causes a pool recreation more than it should (I use Google-native to work around this)
As for duplicate secret. It sounds like you don't have anything you care about losing yet, you should do a helm uninstall if you have lingering resources from your experiments
s
Will try helm.Release and setting a custom label. As for the duplicate resource, it is the neo4j core (3 cores): the chart uses the same name for the secret, while the other resources are labeled resource-1, resource-2, etc. Should I apply a transform in this case, or how do you get past the duplicate resource issue?
Also, if I go with a transform, would that work with helm.Release?
d
It's worth reading their guidance on how to deploy multiple nodes. I find reading the default Values.yml to be helpful when working with helm
I'm out anyway. No sleep last night, and pretty much 5 hours over the weekend. Have a good one!
s
Will do that, thanks for your help and do get some rest 🙂
d
Did you get to the bottom of this?
s
Yes, the new label also did not help. I ended up setting a taint on the node pool and then passing the matching toleration in helm.Chart; tolerations weren’t recognized in helm.Release.
This was not accepted in the values for helm.Release, would be good to have that work
podSpec: {
  "tolerations": {
    key: "dedicated",
    operator: "Equal",
    value: "neo",
    effect: "PreferNoSchedule",
  }
},
d
Is this with the node pool already existing, or are they being created at the same time in the same up? The label suggestion was to make future changes to the nodepool less painful, as pools can end up being recreated, which changes the name. Glad you got it working though
s
This was with the node pool already existing, but I'm glad to have a workaround. It would be good for Pulumi to be more reliable in some of the above.
d
It's a documented limitation, which is why both Chart and Release exist. Chart is replicated helm with bits missing. Release actually runs helm
To clarify, you set neo4j.podspec.tolerations with Release, and it didn't work?
s
yes that is correct
I understand there are limitations, other than that using Pulumi has been great.
b
Chart is replicated helm with bits missing
This isn’t correct. Chart uses helm template to render charts into Kubernetes manifests, so it doesn’t support things like helm hooks but allows transformations. Helm release uses the helm release API
This was not accepted in the values for helm.Release, would be good to have that work
Can you share your full helm.Release code please?
this sort of stuff is usually a nested values error
d
The docs say "equivalent to helm template". Does this need updating?
b
no that’s correct
d
Kk, thanks for clarifying
s
Here is the code with tolerations on helm.Release and taints on the node pool, hopefully you catch something I am doing wrong.
d
Their Values.yml has it as a list of objects, instead of just an object. Can't see anything else out of that
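For readers following along, here is a minimal sketch of the list-of-objects shape mentioned above, written as plain TypeScript with no Pulumi imports. The `Toleration` interface and the `asTolerationList` helper are hypothetical names introduced for illustration; the field values come from the snippet earlier in the thread. The `podSpec.tolerations` nesting is an assumption taken from that snippet, so check the chart's own values file for the exact path.

```typescript
// Shape of a Kubernetes toleration (field names follow the core/v1 API).
interface Toleration {
  key?: string;
  operator?: "Equal" | "Exists";
  value?: string;
  effect?: "NoSchedule" | "PreferNoSchedule" | "NoExecute";
  tolerationSeconds?: number;
}

// Hypothetical helper: accept a single toleration or a list and always
// return a list, because a pod spec's `tolerations` field is an array.
function asTolerationList(t: Toleration | Toleration[]): Toleration[] {
  return Array.isArray(t) ? t : [t];
}

// Values copied from the snippet posted earlier in this thread.
const neoToleration: Toleration = {
  key: "dedicated",
  operator: "Equal",
  value: "neo",
  effect: "PreferNoSchedule",
};

// Assumption: the chart reads tolerations from `podSpec.tolerations`.
// Verify the nesting against the chart's values file before relying on it.
const helmValues = {
  podSpec: {
    tolerations: asTolerationList(neoToleration),
  },
};
```

An object like `helmValues` would then be passed as the `values` input of the Helm resource. Whether helm.Release accepts it still depends on the chart version and nesting, as discussed in the thread.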
s
I tried array option but that was not accepted in helm.Release for tolerations
will try different options and see, like upgrading the helm with that
Got this working, and I am now trying out the ingress patch after enabling SSA on the provider. The steps are: create an ingress, then an issuer (cert-manager), then patch the ingress with tls and annotations that reference the created issuer. While running pulumi I get an error saying that either defaultBackend or rules must be specified. Do we need to specify the rules again if they were defined when creating the ingress? Also, are there any examples? I have set the metadata and tls values.
resource default/api-ingress was not successfully created by the Kubernetes API server : Ingress.extensions "api-ingress-dev" is invalid: spec: Invalid value: []networking.IngressRule(nil): either `defaultBackend` or `rules` must be specified
d
Can we see both the creation and the patch, please? Original Ingress creation from the other thread: https://gist.github.com/seeker815/1abb0f94adfbf342a39c3534b350b4c8
Is the error happening on the Ingress resource, or the IngressPatch?
s
It is happening on the patch. After adding the rules, it times out waiting for a reply. Here is the ingress and patch code.
d
Oh... Yeah, the error is actually spot on. You need to specify the namespace
s
Aah my bad. Other than that, what else should I add in the patch? Do we need to re-add the rules?
Also, I think skipAwait should be added in case it times out?
d
You should be able to drop the rules on your patch after setting the namespace
You're missing clusterProvider on your Ingress btw
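A rough sketch of the patch arguments described above, as plain TypeScript without Pulumi imports: the metadata identifies the existing Ingress by name and namespace, and the spec carries only the tls section, with no rules. `IngressPatchArgs` here is a hypothetical local type and `buildIngressPatch` a hypothetical helper. The namespace, issuer name, host, and secret name are placeholders, not values from the thread. It assumes a namespaced cert-manager Issuer referenced through the `cert-manager.io/issuer` annotation.

```typescript
// Minimal local shapes for the fields used here (a small subset of an Ingress).
interface IngressTLS {
  hosts: string[];
  secretName: string;
}

interface IngressPatchArgs {
  metadata: {
    name: string;
    namespace: string;
    annotations: Record<string, string>;
  };
  spec: { tls: IngressTLS[] };
}

// Hypothetical helper: build the arguments for patching an existing Ingress.
// The namespace is set explicitly so the patch targets the right object
// instead of falling back to the default namespace.
function buildIngressPatch(
  name: string,
  namespace: string,
  issuer: string,
  host: string,
  secretName: string,
): IngressPatchArgs {
  return {
    metadata: {
      name,
      namespace,
      // Annotation read by cert-manager to pick a namespaced Issuer.
      annotations: { "cert-manager.io/issuer": issuer },
    },
    // Only the fields being patched are listed; rules are left to the
    // original Ingress, per the advice in this thread.
    spec: { tls: [{ hosts: [host], secretName }] },
  };
}

// Placeholder values, except the Ingress name taken from the error message.
const patchArgs = buildIngressPatch(
  "api-ingress-dev",
  "neo4j-dev",
  "letsencrypt-dev",
  "api.example.com",
  "api-ingress-dev-tls",
);
```

An object like `patchArgs` would be passed to the IngressPatch resource together with the explicit provider. Whether the API server accepts a patch without rules depends on server-side apply being enabled on that provider, as noted earlier in the thread.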
s
alright will try that, thanks again for spotting the mistake
d
I set this on all my stacks now, to prevent accidents from happening, like it using my local kubeconfig instead of the one specified in my infra stack https://www.pulumi.com/blog/disable-default-providers/#how-to-disable-default-providers
You really don't need to specify dependsOn as well. If you use outputs from a resource as an input to another resource, pulumi knows there's a dependency. The exception here would be your issuer resource, which still needs an explicit dependsOn on certmanager. Given you have skipAwait defined there, I'm unsure what the behavior would be, as the CRDs may or may not be loaded in time
Though it's alright to be explicit. Comes down to personal/team preference in the end
I'm super interested in if this code works from scratch. If it does, you should definitely post it as an example, as it demonstrates multi-stage resource creation in pulumi. Not sure I've seen it in the wild yet
s
yes this makes a good example and I wasn’t sure it would work in the first run even though it did in a different namespace and timed out.
my only concern is if the GCP LB is delayed and that doesn't map
d
I believe you can extend the timeout. K8s is built for eventual consistency, and the GCP LB can be slow to create/update sometimes. I've yet to test the new Gateway controller though. Perhaps that fares better, but cert-manager support for it is still alpha/beta status
s
Ingress + patch worked, creating the certificate and ingress. However, making any change to the patch or the ingress gives me the error below, even though the rules are specified in both the ingress and the patch. Trying to remove the resource also doesn’t work and gives the same error:
Ingress.extensions "api-ingress-dev" is invalid: spec: Invalid value: []networking.IngressRule(nil): either `defaultBackend` or `rules` must be specified
There could be an issue with this being immutable after adding server-side apply. I have also added the flag below in case multiple controllers are causing this. How do we find the URN of the resource so that I can force-delete it from the state, or is there a better option?
"pulumi.com/patchForce": "true",
pulumi stack -u lists the URN. I'm still thinking about the best way to fix the above error rather than deleting.