Implement graceful ASG shutdowns #376

Open
jpswinski opened this issue Feb 12, 2024 · 1 comment

Comments

@jpswinski
Member

When a cluster scales in, the terminating nodes are shut down very quickly, without any consideration for requests they may still be processing. In-flight requests simply error out, and for proxied requests the results are lost.

The desired solution is a graceful shutdown built on ASG lifecycle hooks, specifically the terminate hook. The terminating node should consume the notification, stop registering with the orchestrator, and then wait 10 minutes before letting termination proceed. Ideally, if the node can determine whether any requests are still running on it, it would instead wait only until all of them have completed before sending the continue command to the lifecycle hook, which would avoid the fixed 10-minute wait on every node.
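To make the proposal concrete, here is a minimal sketch of the drain-then-continue step in Python. It is an illustration under assumptions, not the project's implementation: `stop_registering` and `active_request_count` are hypothetical callables the node would have to provide, `asg_client` is assumed to behave like `boto3.client("autoscaling")`, and how the node receives the terminate notification is left out entirely.

```python
import time


def drain_then_continue(asg_client, hook_name, asg_name, instance_id,
                        stop_registering, active_request_count,
                        max_wait_s=600, poll_s=5,
                        sleep=time.sleep, clock=time.monotonic):
    """Drain this node, then let the ASG finish terminating it.

    asg_client is expected to behave like boto3.client("autoscaling").
    stop_registering and active_request_count are hypothetical callables
    supplied by the node; they are not part of any existing API.
    Returns the number of seconds spent waiting.
    """
    # Stop advertising this node to the orchestrator so no new requests arrive.
    stop_registering()

    start = clock()
    # Wait until in-flight requests finish, capped at max_wait_s (10 minutes).
    while active_request_count() > 0 and clock() - start < max_wait_s:
        sleep(poll_s)
    waited = clock() - start

    # Tell the lifecycle hook that termination may proceed.
    asg_client.complete_lifecycle_action(
        LifecycleHookName=hook_name,
        AutoScalingGroupName=asg_name,
        LifecycleActionResult="CONTINUE",
        InstanceId=instance_id,
    )
    return waited
```

If the node cannot report its active requests, passing a callable that always returns a positive count reduces this to the plain 10-minute wait. Either way, the hook's heartbeat timeout would need to be at least as long as `max_wait_s`, otherwise the ASG applies the hook's default result before the node finishes draining.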
