# kubernetes
This isn't related to Pulumi but maybe there are some Kubernetes experts here who can help. I need a custom load balancing strategy beyond what IPVS offers. Each request is a long-lived task and I need to direct it to an available worker. I already have a way to check for an available worker, but I can't figure out how to integrate that logic with Kubernetes. I could try to use health checks to hide the busy pods, but I'd like a cleaner solution. Any ideas?
This isn't best handled at the k8s layer either. Your workers should be in a separate deployment from the one your clients are directly communicating with, and they should take jobs off of a queue. Your service (let's call it "slowboy") should, upon receiving a request, put a job on the queue, and the job spec should include a taskId. When the work is done, the worker should hit an endpoint on slowboy with the taskId in addition to the result payload; slowboy can then finally respond to the original request with the task result.

This isn't the only way to do it, and as long as you intend to keep long-lived requests as the interface it's going to be suboptimal, but it's about the minimal transformation to what you have that I think gets you in a better place. You could use redis lists for your queue if you don't care about persisting jobs for long periods of time and can tolerate them getting lost. The heavyweight backend for your job queue semantics would be something like kafka, but I wouldn't introduce that for just this. If you do want to introduce kafka to your stack, check out upstash serverless kafka.
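To make the flow concrete, here's a minimal in-process sketch of the pattern. The names (`slowboy_handle_request`, the callback endpoint, the worker loop) are all hypothetical, and a `queue.Queue` plus a dict of `threading.Event`s stand in for what would really be a redis list (`LPUSH`/`BRPOP`) and an HTTP callback between pods:

```python
import queue
import threading
import uuid

# Stand-ins for illustration only: in the real setup the job queue would be a
# redis list shared by the worker deployment, and results would arrive over HTTP.
job_queue = queue.Queue()   # the job queue (really a redis list)
pending = {}                # taskId -> {"event": Event, "result": ...}

def slowboy_handle_request(payload):
    """Front service: enqueue a job, then block until a worker reports back."""
    task_id = str(uuid.uuid4())
    done = threading.Event()
    pending[task_id] = {"event": done, "result": None}
    job_queue.put({"taskId": task_id, "payload": payload})
    done.wait()  # the original long-lived request stays open here
    return pending.pop(task_id)["result"]

def slowboy_result_endpoint(task_id, result):
    """The endpoint a worker hits when its job is finished."""
    entry = pending[task_id]
    entry["result"] = result
    entry["event"].set()

def worker_loop():
    """Worker deployment: pull one job at a time, do the work, report back."""
    while True:
        job = job_queue.get()
        result = job["payload"].upper()  # placeholder for the long-lived task
        slowboy_result_endpoint(job["taskId"], result)

threading.Thread(target=worker_loop, daemon=True).start()
print(slowboy_handle_request("hello"))  # prints HELLO
```

Since each worker only pulls a job when it's free, the "route to an available worker" problem disappears: availability is expressed by who's reading the queue, not by anything the load balancer has to know.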