Alleviate race conditions in roll restart reconciler #694

Refactor the code to seperate each steps clearly. In the first loop there was 2 operations: - Check if there is a pending update - Check if all pods are ready However, we can break in some conditions and skip the health checking. To simplify the maintability of the scope, it is now split in 2 separate loops: - first check if restarts are pending - then check that all pools are healthy Signed-off-by: Geoffrey Beausire <g.beausire@criteo.com>

Refactor the order or restart to simplify the reconcile function and make it more deterministic. Before we could maybe restart stuck pods then perform roll restart. Now, stuck pods are always restarted first. Also, stuck pods are restarted before checking the number of ready replicas, because if we check before we will never reach this point. Finally, we don't process a rolling restart if any pods stuck was deleted, to avoid performing any dangerous actions. Signed-off-by: Geoffrey Beausire <g.beausire@criteo.com>

When a Roll Restart is triggered with at least 2 pools, a race condition can trigger the roll restart of a pods in each pools. This can lead to a red cluster. Normally to prevent this from happening, there are 3 checks: - check the status.ReadyReplicas of all sts before moving forward. - for each nodePool, check that that all replicas are ready by listing pods directly - before deleting a pod, a check is made on the OpenSearch to see if the cluster is healthy In practice, it is not enough. Considering the rollRestart of 2 nodePool: - data - masterData The following sequence can happen: - a rollRestart is triggered - reconcile function is called - data and masterData have all their pods ready - data pod is deleted; pods is terminating (NOT terminated yet) - reconcile function is recalled - data and masterData have all their pods ready from the status.ReadyReplicas point of view (because it didn't see the change yet) - data is seen as unhealthy thanks to CountRunningPodsForNodePool - masterData is seen as healthy because all its pods are ready - Opensearch is still healthy, because the deleted pod is not terminated yet - A pod in masterData is restarted - Cluster is red! This commit make sure we check readiness of all nodePool using CountRunningPodsForNodePool before trying to restart any pools. Signed-off-by: Geoffrey Beausire <g.beausire@criteo.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Alleviate race conditions in roll restart reconciler #694

Alleviate race conditions in roll restart reconciler #694

Commits on Apr 8, 2024

Alleviate race conditions in roll restart reconciler #694

Are you sure you want to change the base?

Alleviate race conditions in roll restart reconciler #694

Commits on Apr 8, 2024