Implement graceful ASG shutdowns #376
When a cluster scales in, the terminating nodes are shut down almost immediately, with no regard for requests they may still be processing. In-flight requests simply error out, and for proxied requests the results are lost.
The desired solution is a graceful shutdown built on ASG lifecycle hooks, specifically the terminate hook. The terminating node should consume the notification, stop registering with the orchestrator, and then wait 10 minutes. Ideally, if the node can determine whether any requests are still running on it, it could instead wait only until all of them have completed before sending the continue command to the lifecycle hook, which would avoid the blanket 10-minute wait.
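To make the proposed flow concrete, here is a rough sketch of the drain logic the terminating node might run once it has received the terminate notification. It is only an illustration under assumptions, not this project's implementation: the function name `drain_then_continue` and the three callbacks (`deregister`, `in_flight_requests`, `complete_lifecycle`) are hypothetical, and how the notification reaches the node is left out. On AWS, `complete_lifecycle` would most likely wrap the Auto Scaling `CompleteLifecycleAction` call with the result `CONTINUE`.

```python
import time
from typing import Callable


def drain_then_continue(
    deregister: Callable[[], None],
    in_flight_requests: Callable[[], int],
    complete_lifecycle: Callable[[str], None],
    max_wait_seconds: float = 600.0,  # the 10 minute cap from the issue
    poll_seconds: float = 5.0,        # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Drain this node, then release the lifecycle hook.

    Returns True if all requests finished, False if the wait timed out.
    All three callbacks are hypothetical hooks supplied by the caller.
    """
    # Stop registering with the orchestrator so no new work is routed here.
    deregister()

    deadline = clock() + max_wait_seconds
    drained = in_flight_requests() == 0

    # Poll until the node is idle or the cap is reached, whichever is first.
    while not drained and clock() < deadline:
        sleep(poll_seconds)
        drained = in_flight_requests() == 0

    # Tell the ASG it may proceed with termination, drained or not.
    complete_lifecycle("CONTINUE")
    return drained
```

If the node cannot report its in-flight requests, passing a callback that always returns a non-zero count makes this fall back to the fixed wait described above. The lifecycle hook's own timeout would need to be at least as long as `max_wait_seconds`, otherwise the ASG may proceed before the node finishes draining.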