Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add replace member changes. Perist membership details. #216

Merged
merged 1 commit into from
Oct 23, 2024

Conversation

sanebay
Copy link
Contributor

@sanebay sanebay commented Oct 1, 2024

No description provided.

src/lib/homestore_backend/hs_pg_manager.cpp Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Oct 16, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 21.81818% with 43 lines in your changes missing coverage. Please review.

Project coverage is 65.18%. Comparing base (acb04e8) to head (68eb07d).
Report is 23 commits behind head on main.

Files with missing lines Patch % Lines
src/lib/homestore_backend/hs_pg_manager.cpp 7.31% 37 Missing and 1 partial ⚠️
...ib/homestore_backend/replication_state_machine.cpp 0.00% 2 Missing ⚠️
src/include/homeobject/pg_manager.hpp 50.00% 1 Missing ⚠️
src/lib/homestore_backend/hs_blob_manager.cpp 0.00% 1 Missing ⚠️
src/lib/memory_backend/mem_pg_manager.cpp 83.33% 0 Missing and 1 partial ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #216      +/-   ##
==========================================
- Coverage   68.69%   65.18%   -3.52%     
==========================================
  Files          30       32       +2     
  Lines        1581     1772     +191     
  Branches      163      191      +28     
==========================================
+ Hits         1086     1155      +69     
- Misses        408      522     +114     
- Partials       87       95       +8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@xiaoxichen xiaoxichen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 from my side and waiting for the testing result

@xiaoxichen
Copy link
Collaborator

As discussed in the meeting, we need to re-think the flow with the learner changes. My feeling is we need to wait till the learner flip to follower (voter) then do the step 3

// Step 3. Append log entry to mark the old member is out and new member is added.

and setp 4

            // Step 4. Remove the old member. Even if the old member is temporarily
            // down and recovers, nuraft mesg see member remove from cluster log
            // entry and call exit_group() and leave().

Likely we need to hook up to the

void RaftStateMachine::commit_config

to get notification on cluster member change especially learner flipped to follower.

Copy link
Collaborator

@xiaoxichen xiaoxichen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please if you can rebase.

src/lib/homestore_backend/hs_pg_manager.cpp Outdated Show resolved Hide resolved
src/lib/homestore_backend/hs_pg_manager.cpp Show resolved Hide resolved

uint32_t i{0};
for (auto const& m : pg->pg_info_.members) {
hs_pg->pg_sb_->members[i].id = m.id;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HomeObject struct pg_members is identical to homestore::replica_member_info as of now.

As homestore::replica_member_info is the only information that follower can get regarding a PG, I doubt there is a possibility HomeObject::pg_members can diverse from homestore::replica_member_info

Thinking about consolidate them together, not blocking, more for discussion

@sanebay
Copy link
Contributor Author

sanebay commented Oct 22, 2024

As discussed in the meeting, we need to re-think the flow with the learner changes. My feeling is we need to wait till the learner flip to follower (voter) then do the step 3
why should we wait ? why not treat as normal follower ?

@xiaoxichen
Copy link
Collaborator

why should we wait ? why not treat as normal follower ?

I think it is same when the out_member has down. But if the out_member is healthy , i.e PG move for balancing/tech refresh etc, keeping the redundancy has advantage. If we dont wait till new_member catch up before deleting out_member, we made us degrade to 2 replica.

Copy link
Collaborator

@xiaoxichen xiaoxichen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@sanebay
Copy link
Contributor Author

sanebay commented Oct 23, 2024

why should we wait ? why not treat as normal follower ?

I think it is same when the out_member has down. But if the out_member is healthy , i.e PG move for balancing/tech refresh etc, keeping the redundancy has advantage. If we dont wait till new_member catch up before deleting out_member, we made us degrade to 2 replica.

Created a story for this. eBay/HomeStore#573 This has to be optional depending on use case. Sometimes, its better to evict immediately to free up space.

@xiaoxichen
Copy link
Collaborator

feel free to merge and as you said that can track by other ticket.

@sanebay sanebay merged commit ff926bb into eBay:main Oct 23, 2024
25 checks passed
@sanebay sanebay deleted the replace_member branch October 23, 2024 17:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants