# aws
l
Hi. I’m trying to destroy a stack, but there is a security group that has dependents, and this causes the destroy to stall for a very long time. Can’t I put a timeout on this? And how can I resolve the issue?
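On the timeout question, here is a minimal sketch of Pulumi’s `customTimeouts` resource option, which caps how long a delete is allowed to block (the resource name and the one-minute value are arbitrary examples). It only bounds the wait; the group still can’t be deleted until its dependents (ENIs, rules in other groups that reference it) are removed.

```typescript
import * as aws from "@pulumi/aws";

// Example only: cap how long `pulumi destroy` waits for this security group.
// The timeout bounds the wait; the dependents still have to be removed or
// detached before AWS will actually delete the group.
const appSg = new aws.ec2.SecurityGroup(
  "app-sg",
  { description: "example security group" },
  { customTimeouts: { delete: "1m" } }
);
```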
There is also a problem when starting an ECS service with, for example, the wrong Docker image: if it can’t find the image, it will keep retrying for probably 15 minutes.
And if I cancel the deployment midway using `pulumi cancel`, everything gets messed up, and the next time I run `pulumi up` it tries to create a whole new cluster, service, and everything.
g
Keep in mind that `pulumi cancel` does not stop the ECS deployment. You should use the deployment circuit breaker, but it has a catch (you need at least 2 containers) and other issues: https://github.com/aws/containers-roadmap/issues/1247
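For reference, a minimal sketch of what the circuit breaker looks like on `aws.ecs.Service`; the cluster and task definition names are placeholders for resources assumed to already exist.

```typescript
import * as aws from "@pulumi/aws";

// Sketch: enable the deployment circuit breaker so a rollout that cannot
// reach steady state is stopped and rolled back instead of retrying for ages.
const service = new aws.ecs.Service("svc", {
  cluster: "my-cluster",        // placeholder: existing ECS cluster name
  taskDefinition: "my-task:1",  // placeholder: existing task definition (family:revision)
  desiredCount: 2,              // the catch mentioned above: it needs more than one task
  deploymentCircuitBreaker: {
    enable: true,
    rollback: true,             // roll back to the last deployment that worked
  },
});
```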
As much as awsx looks like a shortcut, it is very opinionated and can break a lot of things; I prefer writing my own components. At the moment I am migrating a huge portion of awsx code to AWS Classic.
l
Damn. That’s bad news. I might be better off just manually creating the security groups, load balancers, etc. The goal was to just get ECS hosting with deployments from GitHub Actions working ASAP.
This health check fails for some reason, and yet the Pulumi deployment keeps waiting. Surely there is a solution to this problem…
image.png
g
ECS and ASAP do not mix well. Are you sure it is the container health check failing (the one in your screenshot) and not the LB health check? The service has to respond to the load balancer so that it does not get deregistered from the target group.
l
I don’t see any reason why the health check would fail; I was quite stumped. It makes more sense if it’s the LB. Is there any other way you know of that would result in an unhealthy status? I don’t know why it wouldn’t respond to the LB, though.
Really, the biggest issue is not being able to have a reasonable timeout. It makes it impossible to iterate on the system design.
I may just give up on Pulumi for ECS deployments altogether, but if the AWS Classic provider can be configured with timeouts (seconds, not minutes), then I might still consider it a viable option.
g
Depends on what your goal is. Generally any ECS deployment takes a good 5 minutes due to rolling updates and the like. It is possible to make it slightly faster, but you do not want to rely on ECS to surface application-level problems, because it is painfully slow. You can get somewhat faster results with the settings below, but to really tweak the deployment you have to understand all the moving parts between ECS and the LB, and the time it takes your application to become responsive to its first request.
```typescript
// aws.lb.TargetGroup args: tighten the health-check timing and the
// deregistration delay so failures surface sooner (values come from the caller)
healthCheck: {
  path: args.healthCheckPath,
  timeout: args.healthCheckTimeoutSeconds,
  interval: args.healthCheckIntervalSeconds,
},
deregistrationDelay: args.deregistrationDelay,
```
Also, you could drop the wait for steady state or configure the timeouts, but I do not know how off-hand; they were added here: https://github.com/hashicorp/terraform-provider-aws/pull/25641 I am not sure what you are working on, but you may get a faster dev cycle with App Runner or GCP Cloud Run.
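A rough sketch of those two knobs, assuming a plain `aws.ecs.Service`: `waitForSteadyState` controls whether the provider blocks on the rollout at all, and Pulumi’s `customTimeouts` resource option should correspond to the timeouts added in that PR. The cluster, task definition, and five-minute values are placeholders.

```typescript
import * as aws from "@pulumi/aws";

// Sketch: either stop blocking on the rollout entirely, or keep waiting
// but with an explicit cap instead of the provider default.
const service = new aws.ecs.Service(
  "svc",
  {
    cluster: "my-cluster",        // placeholder: existing cluster
    taskDefinition: "my-task:1",  // placeholder: existing task definition
    desiredCount: 2,
    waitForSteadyState: false,    // return as soon as AWS accepts the deployment
  },
  {
    // If you do wait, cap how long create/update may block (example values).
    customTimeouts: { create: "5m", update: "5m" },
  }
);
```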
l
App Runner looks good for this kind of work, actually. I will consider it. Thanks a lot for the help!
g
App Runner is pretty straightforward, however it has a serious limitation: you can only have a single container, not a task/pod. Then it looks something like this:
```typescript
// Excerpt from a component; `pars`, `args.imageTag`, `instanceRole`,
// `connector`, and `secrets` are defined elsewhere in that component.
const appRunner = new aws.apprunner.Service(
  "svc",
  {
    serviceName: "svc",
    sourceConfiguration: {
      autoDeploymentsEnabled: false,
      imageRepository: {
        imageRepositoryType: "ECR_PUBLIC",
        imageConfiguration: {
          runtimeEnvironmentSecrets: { ...pars },
          runtimeEnvironmentVariables: {
            SYSTEM_REQUIREMENT_CHECK_ENABLED: "false",
            ALPINE_DATABASE_MODE: "external",
            ALPINE_DATABASE_DRIVER: "org.postgresql.Driver",
            LOGGING_LEVEL: "INFO",
          },
          port: "8080",
        },
        imageIdentifier: args.imageTag,
      },
    },
    instanceConfiguration: {
      cpu: "4 vCPU",
      memory: "8 GB",
      instanceRoleArn: instanceRole.arn,
    },
    networkConfiguration: {
      egressConfiguration: {
        egressType: "VPC",
        vpcConnectorArn: connector.arn,
      },
    },
  },
  { dependsOn: [...secrets] }
);
```
l
It seems to have decent autoscaling capabilities though (https://docs.aws.amazon.com/apprunner/latest/dg/manage-autoscaling.html), and it can be scaled across multiple availability zones (https://aws.amazon.com/blogs/containers/architecting-for-resiliency-on-aws-app-runner/). If there aren’t 2 or more tasks, how does it then handle failed deployments? Surely an old instance will keep running and accepting traffic until a new instance becomes healthy?
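For the autoscaling part, a minimal sketch (the name and the limits are example values) of a custom `aws.apprunner.AutoScalingConfigurationVersion` that a service like the one above could reference via `autoScalingConfigurationArn`:

```typescript
import * as aws from "@pulumi/aws";

// Example autoscaling configuration: App Runner keeps at least `minSize`
// instances provisioned and scales out when concurrent requests per
// instance exceed `maxConcurrency`.
const scaling = new aws.apprunner.AutoScalingConfigurationVersion("svc-scaling", {
  autoScalingConfigurationName: "svc-scaling",
  minSize: 2,
  maxSize: 10,
  maxConcurrency: 100,
});

// Wired into the Service args from the earlier snippet:
//   autoScalingConfigurationArn: scaling.arn,
```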