-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add unit tests for CLUSTER NODES
parsing
#112
Conversation
Signed-off-by: Björn Svensson <bjorn.a.svensson@est.tech>
Fix memory issues when the parsing of replicas fails. Signed-off-by: Björn Svensson <bjorn.a.svensson@est.tech>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -1042,8 +1041,19 @@ static dict *parse_cluster_nodes(valkeyClusterContext *cc, valkeyReply *reply) { | |||
error: | |||
sdsfreesplitres(part, count_part); | |||
sdsfreesplitres(slot_start_end, count_slot_start_end); | |||
if (nodes_name != NULL) { | |||
/* Only free parsed replicas since the `nodes` dict owns primary nodes. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little odd and maybe fragile.
What about the corner case primaries without any slots? Does the nodes
dict own them too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a bit odd.
The nodes
dict owns all valkeyClusterNodes
after the parsing is complete, these are nodes with-or-without slots. This dict holds key=address, value=primary node, and each primary owns a list of replicas.
Deleting the nodes
dict will delete all its objects, primaries and their known replicas, when there is an error while parsing.
But then there is the temporary nodes_name
dict which holds references to all currently parsed cluster nodes.
The found issue is that this dict is the actual owner of an replica node, that has a primary, which is not yet parsed.
So, these are the replica nodes that needs to be deleted while the parsing fails.
I have a coming PR where the cluster nodes parsing is refactored.
@@ -170,14 +171,14 @@ void test_alloc_failure_handling(void) { | |||
|
|||
// Connect | |||
{ | |||
for (int i = 0; i < 128; ++i) { | |||
for (int i = 0; i < 148; ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the stack-allocated iterator uses fewer allocations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the config to also parse replicas in this testcase, This required a couple of more allocations to succeed.
Includes OOM testing of replica parsing and fixes for memory issues when parsing fails.
Fixes #33