[fix](load) Fix the issue of high-concurrency single-replica load getting stuck (#42297)

In high-concurrency single-replica load, the tablet_writer_add_block RPC
may occupy the _heavy_work_pool completely, causing the
response_slave_tablet_pull_rowset RPC to have no available threads for
processing. As a result, tablet_writer_add_block waits indefinitely for
a response from the slave tablet, leading to the import getting stuck
until it times out.

response_slave_tablet_pull_rowset is relatively lightweight, so it can
be handled by the _light_work_pool.
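
The starvation described above can be reproduced in miniature. The following is a minimal sketch, not Doris's actual pool implementation: `WorkPool` and `writer_gets_response` are hypothetical names, the pool has a single worker thread so that "completely occupied" is easy to reproduce, and the 200 ms timeout is an arbitrary stand-in for the load timeout. The writer task stands in for tablet_writer_add_block, and the response task stands in for response_slave_tablet_pull_rowset.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical fixed-size pool with a bounded FIFO queue (illustration only).
class WorkPool {
public:
    WorkPool(std::size_t threads, std::size_t max_queue) : _max_queue(max_queue) {
        for (std::size_t i = 0; i < threads; ++i) {
            _workers.emplace_back([this] { _run(); });
        }
    }
    // Lets workers drain any queued tasks, then joins them.
    ~WorkPool() {
        {
            std::lock_guard<std::mutex> lock(_mu);
            _stop = true;
        }
        _cv.notify_all();
        for (auto& t : _workers) t.join();
    }
    // Enqueues the task; returns false if the queue is full.
    bool try_offer(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(_mu);
            if (_stop || _tasks.size() >= _max_queue) return false;
            _tasks.push(std::move(task));
        }
        _cv.notify_one();
        return true;
    }

private:
    void _run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(_mu);
                _cv.wait(lock, [this] { return _stop || !_tasks.empty(); });
                if (_tasks.empty()) return;  // stop requested and queue drained
                task = std::move(_tasks.front());
                _tasks.pop();
            }
            task();
        }
    }
    std::mutex _mu;
    std::condition_variable _cv;
    std::queue<std::function<void()>> _tasks;
    std::size_t _max_queue;
    bool _stop = false;
    std::vector<std::thread> _workers;  // declared last: threads start after the other members
};

// Returns true if the writer saw the slave's response before its timeout expired.
bool writer_gets_response(bool respond_on_light_pool) {
    std::promise<void> slave_done;
    std::future<void> slave_done_future = slave_done.get_future();
    std::atomic<bool> writer_started{false};
    std::atomic<bool> acknowledged{false};
    {
        WorkPool heavy_pool(1, 8);
        WorkPool light_pool(1, 8);

        // Writer: holds the only heavy thread while it waits for the slave.
        bool offered = heavy_pool.try_offer([&] {
            writer_started = true;
            auto status = slave_done_future.wait_for(std::chrono::milliseconds(200));
            acknowledged = (status == std::future_status::ready);
        });
        assert(offered);
        while (!writer_started) std::this_thread::yield();

        // Response: on the heavy pool it queues behind the writer it should wake.
        WorkPool& response_pool = respond_on_light_pool ? light_pool : heavy_pool;
        offered = response_pool.try_offer([&] { slave_done.set_value(); });
        assert(offered);
    }  // pool destructors join here, so the writer has finished
    return acknowledged;
}
```

In this sketch, `writer_gets_response(false)` routes the response through the saturated heavy pool, so the writer times out before the response task ever runs. `writer_gets_response(true)` routes it through the separate light pool, and the writer is acknowledged promptly, which is the effect this commit aims for.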
liaoxin01 authored Oct 23, 2024
1 parent f9ea8f8 commit e338e1d
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion be/src/service/internal_service.cpp
@@ -1966,7 +1966,7 @@ void PInternalServiceImpl::_response_pull_slave_rowset(const std::string& remote
 void PInternalServiceImpl::response_slave_tablet_pull_rowset(
         google::protobuf::RpcController* controller, const PTabletWriteSlaveDoneRequest* request,
         PTabletWriteSlaveDoneResult* response, google::protobuf::Closure* done) {
-    bool ret = _heavy_work_pool.try_offer([txn_mgr = _engine.txn_manager(), request, response,
+    bool ret = _light_work_pool.try_offer([txn_mgr = _engine.txn_manager(), request, response,
                                            done]() {
         brpc::ClosureGuard closure_guard(done);
         VLOG_CRITICAL << "receive the result of slave replica pull rowset from slave replica. "