
Replacing collapsing merge tree #7

Open
wants to merge 3 commits into master
Conversation

youennL-cs

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Implementation of a new MergeTree engine: ReplacingCollapsingMergeTree.

It is similar to ReplacingMergeTree, but has an extra sign column (possible values to discuss: 1 / -1, or 1 / 0 if we can avoid returning rows with sign -1).
No matter the operation on the data, the version must be increased. If two inserted rows have the same version number, the last one inserted is kept.

More details:
It allows us to easily "update" a row by inserting a row with an (arbitrarily) greater version (the "replacing" merge algorithm collapses all rows with the same key into the one row with the greatest version).
It allows us to easily "delete" a row by inserting a row with an (arbitrarily) greater version and sign -1 (the "replacing" merge will leave only one row with sign = -1, which is then filtered out).
It allows us to "update" key columns by deleting a row (as described above) and inserting a new one.
The "replacing" merge algorithm removes duplicates as well, so it's safe to insert multiple rows with the same version. A minimal usage sketch follows below.

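A minimal usage sketch of the flow described above (the table, column names and values are illustrative, not part of this PR's test; the behavior assumes the semantics proposed here):

```sql
-- Illustrative table using the proposed engine (sign column + version column).
CREATE TABLE visits
(
    uid String,
    value UInt64,
    sign Int8,
    version UInt32
)
ENGINE = ReplacingCollapsingMergeTree(sign, version)
ORDER BY uid;

-- Initial row.
INSERT INTO visits VALUES ('u1', 10, 1, 1);

-- "Update": re-insert the key with a greater version; the merge keeps only
-- the row with the greatest version.
INSERT INTO visits VALUES ('u1', 42, 1, 2);

-- "Delete": re-insert the key with a greater version and sign = -1; after the
-- merge only the sign = -1 row remains, to be filtered out afterwards.
INSERT INTO visits VALUES ('u1', 42, -1, 3);
```
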
To discuss:
How to filter out data:
First solution: we can easily do it in the query (see the sketch below).
Second solution: set sign to 0, so that when we sum we don't count these rows and the KPIs stay correct.
Last solution: find how to filter out rows in the ClickHouse source code and return a ReplacingCollapsingMergeTree row only if sign != -1.
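
A minimal sketch of the first solution, using the `test` schema from the test below (assuming FINAL is supported by the new engine and behaves as it does for ReplacingMergeTree, i.e. it forces the merge at read time):

```sql
-- First solution: filter "deleted" rows out at query time.
SELECT *
FROM test FINAL
WHERE sign != -1;

-- Alternative without FINAL: collapse to the latest version per key in the
-- query itself, then drop keys whose latest state is a deletion.
SELECT
    uid,
    argMax(sign, version) AS last_sign,
    max(version) AS last_version
FROM test
GROUP BY uid
HAVING last_sign != -1;
```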

Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/

@@ -0,0 +1,18 @@
d1 1 5


Why do we have 3 times the same d1?
Why do we have 3 times d2, with different values? etc

Author


This "reference" file is the expected output of the .sql file with the same name.
If we look at the SQL file, we will see 3 calls to SELECT * FROM test; that's why. The three outputs follow one another without a blank line in between.
So you should see the following twice:

d1	1	5
d2	1	1
d3	1	1
d4	1	3
d5	1	1
d6	-1	2

and the last insert adds data that overlaps with new versions, so the last output is a little different:

d1	1	5
d2	1	3
d3	1	3
d4	1	3
d5	1	1
d6	-1	2

New versions of d2 and d3 come in this last batch.
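
For context, a sketch of the general pattern such a test follows (illustrative only; the actual .sql file in this PR is only partially visible in the diff below):

```sql
-- Illustrative pattern only, not the actual test file: each SELECT produces
-- one block of the .reference output, hence three blocks in total.
CREATE TABLE test (uid String, sign Int8, version UInt32)
    ENGINE = ReplacingCollapsingMergeTree(sign, version) ORDER BY (uid);

INSERT INTO test VALUES ('d1', 1, 1), ('d1', -1, 2), ('d1', 1, 3);

OPTIMIZE TABLE test FINAL;
SELECT * FROM test;   -- first output block

SELECT * FROM test;   -- second output block (same data, same result)

INSERT INTO test VALUES ('d1', 1, 5);   -- overlapping data with newer versions
OPTIMIZE TABLE test FINAL;
SELECT * FROM test;   -- third block reflects the new versions
```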


oh I somehow missed that there're 3 SELECT calls 👍

@@ -0,0 +1,12 @@
CREATE TABLE test (uid String, sign Int8, version UInt32) ENGINE = ReplacingCollapsingMergeTree(sign, version) Order by (uid);

INSERT INTO test (*) VALUES ('d1', 1, 1), ('d2', 1, 1), ('d6', 1, 1), ('d4', 1, 1), ('d6', -1, 2), ('d3', 1, 1), ('d1', -1, 2), ('d5', 1, 1), ('d4', -1, 2), ('d1', 1, 3), ('d1', -1, 4), ('d4', 1, 3), ('d1', 1, 5);

@gontarzpawel Aug 17, 2022


can we shuffle the input values so that we have ('d1', 1, 3), ('d1', -1, 2), ('d1', 1, 1)?

Author
@youennL-cs Aug 17, 2022


We can; I just want to check that if we insert values in "random" order, it doesn't affect the output.


It can still be random, I was just curious about sorting (but it has to work, otherwise the next tests wouldn't pass).
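
A quick sketch of that check (illustrative; with the proposed semantics the shuffled d1 rows should still collapse to the row with the greatest version):

```sql
-- Shuffled insert order for the same key; after the merge only the row with
-- the greatest version, ('d1', 1, 5), should survive.
INSERT INTO test VALUES ('d1', 1, 3), ('d1', -1, 2), ('d1', 1, 1), ('d1', -1, 4), ('d1', 1, 5);
OPTIMIZE TABLE test FINAL;
SELECT * FROM test WHERE uid = 'd1';
```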

youennL-cs pushed a commit that referenced this pull request Sep 22, 2022
…yFunc()

clang-15 reports [1]:

<details>

<summary>ASan report</summary>

```
    ==1==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f1d04c4eb20 at pc 0x000031c4803c bp 0x7f1d05e19a00 sp 0x7f1d05e199f8
    READ of size 8 at 0x7f1d04c4eb20 thread T200 (QueryPullPipeEx)
        #0 0x31c4803b in DB::GetPriorityForLoadBalancing::getPriorityFunc(DB::LoadBalancing, unsigned long, unsigned long) const::$_3::operator()(unsigned long) const build_docker/../src/Common/GetPriorityForLoadBalancing.cpp:42:40
        #1 0x31c4803b in decltype(static_cast<DB::GetPriorityForLoadBalancing::getPriorityFunc(DB::LoadBalancing, unsigned long, unsigned long) const::$_3&>(fp)(static_cast<unsigned long>(fp0))) std::__1::__invoke<DB::GetPriorityForLoadBalancing::getPriorityFunc(DB::LoadBalancing, unsigned long, unsigned long) const::$_3&, unsigned long>(DB::GetPriorityForLoadBalancing::getPriorityFunc(DB::LoadBalancing, unsigned long, unsigned long) const::$_3&, unsigned long&&) build_docker/../contrib/libcxx/include/type_traits:3640:23
        #2 0x31c4803b in unsigned long std::__1::__invoke_void_return_wrapper<unsigned long, false>::__call<DB::GetPriorityForLoadBalancing::getPriorityFunc(DB::LoadBalancing, unsigned long, unsigned long) const::$_3&, unsigned long>(DB::GetPriorityForLoadBalancing::getPriorityFunc(DB::LoadBalancing, unsigned long, unsigned long) const::$_3&, unsigned long&&) build_docker/../contrib/libcxx/include/__functional/invoke.h:30:16
        #3 0x31c4803b in std::__1::__function::__default_alloc_func<DB::GetPriorityForLoadBalancing::getPriorityFunc(DB::LoadBalancing, unsigned long, unsigned long) const::$_3, unsigned long (unsigned long)>::operator()(unsigned long&&) build_docker/../contrib/libcxx/include/__functional/function.h:230:12
        #4 0x31c4803b in unsigned long std::__1::__function::__policy_invoker<unsigned long (unsigned long)>::__call_impl<std::__1::__function::__default_alloc_func<DB::GetPriorityForLoadBalancing::getPriorityFunc(DB::LoadBalancing, unsigned long, unsigned long) const::$_3, unsigned long (unsigned long)>>(std::__1::__function::__policy_storage const*, unsigned long) build_docker/../contrib/libcxx/include/__functional/function.h:711:16
        #5 0x31c38b07 in std::__1::__function::__policy_func<unsigned long (unsigned long)>::operator()(unsigned long&&) const build_docker/../contrib/libcxx/include/__functional/function.h:843:16
        #6 0x31c38b07 in std::__1::function<unsigned long (unsigned long)>::operator()(unsigned long) const build_docker/../contrib/libcxx/include/__functional/function.h:1184:12
        #7 0x31c38b07 in PoolWithFailoverBase<DB::IConnectionPool>::getShuffledPools(unsigned long, std::__1::function<unsigned long (unsigned long)> const&) build_docker/../src/Common/PoolWithFailoverBase.h:174:39

      This frame has 2 object(s):
        [32, 40) 'pool_size.addr' <== Memory access at offset 32 is inside this variable
        [64, 88) 'ref.tmp' (line 18)
```

</details>

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/41046/adea92f847373d1fcfd733d8979c63024f9b80bf/integration_tests__asan__[1/3].html

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
youennL-cs pushed a commit that referenced this pull request Sep 22, 2022
…s and parallel KILL

Right now it is possible to call QueryStatus::addPipelineExecutor() when
the executors_mutex is already acquired; this can happen when the query was
cancelled via KILL QUERY.

Here are some debugger traces from a real example where lots of
ProcessList::insert() calls got deadlocked.

Let's look at the lock owner for one of the threads that was deadlocked
in ProcessList::insert():

    (gdb) p *mutex
    $2 = {
      __data = {
        __owner = 46899,
      },
    }

And now let's see the stack trace of thread 46899:

    #0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
    #1  0x00007fb65569b714 in __GI___pthread_mutex_lock (mutex=0x7fb4a9d15298) at ../nptl/pthread_mutex_lock.c:80
    #2  0x000000001b6edd91 in pthread_mutex_lock (arg=0x7fb4a9d15298) at ../src/Common/ThreadFuzzer.cpp:317
    #3  std::__1::__libcpp_mutex_lock (__m=0x7fb4a9d15298) at ../contrib/libcxx/include/__threading_support:303
    #4  std::__1::mutex::lock (this=0x7fb4a9d15298) at ../contrib/libcxx/src/mutex.cpp:33
    #5  0x0000000014c7ae63 in std::__1::lock_guard<std::__1::mutex>::lock_guard (__m=..., this=<optimized out>) at ../contrib/libcxx/include/__mutex_base:91
    #6  DB::QueryStatus::addPipelineExecutor (this=0x7fb4a9d14f90, e=0x80) at ../src/Interpreters/ProcessList.cpp:372
    #7  0x0000000015bee4a7 in DB::PipelineExecutor::PipelineExecutor (this=0x7fb4b1e53618, processors=..., elem=<optimized out>) at ../src/Processors/Executors/PipelineExecutor.cpp:54
    #12 std::__1::make_shared<DB::PipelineExecutor, std::__1::vector<std::__1::shared_ptr<DB::IProcessor>, std::__1::allocator<std::__1::shared_ptr<DB::IProcessor> > >&, DB::QueryStatus*&, void> (__args=@0x7fb63095b9b0: 0x7fb4a9d14f90, __args=@0x7fb63095b9b0: 0x7fb4a9d14f90) at ../contrib/libcxx/include/__memory/shared_ptr.h:963
    #13 DB::QueryPipelineBuilder::execute (this=0x7fb63095b8b0) at ../src/QueryPipeline/QueryPipelineBuilder.cpp:552
    #14 0x00000000158c6c27 in DB::Connection::sendExternalTablesData (this=0x7fb6545e9d98, data=...) at ../src/Client/Connection.cpp:797
    #27 0x0000000014043a81 in DB::RemoteQueryExecutorRoutine::operator() (this=0x7fb63095bf20, sink=...) at ../src/QueryPipeline/RemoteQueryExecutorReadContext.cpp:46
    #32 0x000000000a16dd4f in make_fcontext () at ../contrib/boost/libs/context/src/asm/make_x86_64_sysv_elf_gas.S:71

And also in the logs you can see very strange things for this thread:

    2022.09.13 14:14:51.228979 [ 51145 ] {1712D4E914EC7C99} <Debug> Connection (localhost:9000): Sent data for 1 external tables, total 11 rows in 0.00046389 sec., 23688 rows/sec., 3.84 KiB (8.07 MiB/sec.), compressed 1.1070121092649958 times to 3.47 KiB (7.29 MiB/sec.)
    ...
    2022.09.13 14:14:51.719402 [ 46899 ] {7c90ffa4-1dc8-42fd-938c-4e307c244394} <Debug> executeQuery: (from 10.101.15.181:42478) KILL QUERY WHERE query_id = '1712D4E914EC7C99' (stage: Complete)
    2022.09.13 14:14:51.719488 [ 46899 ] {7c90ffa4-1dc8-42fd-938c-4e307c244394} <Debug> executeQuery: (internal) SELECT query_id, user, query FROM system.processes WHERE query_id = '1712D4E914EC7C99' (stage: Complete)
    2022.09.13 14:14:51.719754 [ 46899 ] {7c90ffa4-1dc8-42fd-938c-4e307c244394} <Trace> ContextAccess (default): Access granted: SELECT(user, query_id, query) ON system.processes
    2022.09.13 14:14:51.720544 [ 46899 ] {7c90ffa4-1dc8-42fd-938c-4e307c244394} <Trace> InterpreterSelectQuery: FetchColumns -> Complete
    2022.09.13 14:14:53.228964 [ 46899 ] {7c90ffa4-1dc8-42fd-938c-4e307c244394} <Debug> Connection (localhost:9000): Sent data for 2 scalars, total 2 rows in 2.6838e-05 sec., 73461 rows/sec., 68.00 B (2.38 MiB/sec.), compressed 0.4594594594594595 times to 148.00 B (5.16 MiB/sec.)

How is this possible? The answer is fibers and the query cancellation
routine. During cancellation of async queries it goes into the fibers again
and tries to do this gracefully. However, because of this, while cancelling
the query it may call QueryStatus::addPipelineExecutor() from
QueryStatus::cancelQuery().

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
youennL-cs pushed a commit that referenced this pull request Oct 25, 2022
…verflow

UBSAN report:

    $ UBSAN_SYMBOLIZER_PATH=/usr/bin/llvm-symbolizer-14 UBSAN_OPTIONS=print_stacktrace=1 ./unit_tests_dbms --gtest_filter=*Gorilla*
    ../src/Compression/tests/gtest_compressionCodec.cpp:1216:47: runtime error: signed integer overflow: 23 * 100000000 cannot be represented in type 'int'
        #0 0x14f67fd1 in auto (anonymous namespace)::$_6::operator()(int) const::'lambda'(auto)::operator()<int>(auto) const build_docker/../src/Compression/tests/gtest_compressionCodec.cpp:1216:47
        #1 0x14f67fd1 in (anonymous namespace)::CodecTestSequence (anonymous namespace)::generateSeq<long, (anonymous namespace)::$_6::operator()(int) const::'lambda'(auto), int, int>((anonymous namespace)::$_6::operator()(int) const::'lambda'(auto), char const*, int, int) build_docker/../src/Compression/tests/gtest_compressionCodec.cpp:394:36
        #2 0x14f67fd1 in auto (anonymous namespace)::GCompatibilityTestSequence<long>() build_docker/../src/Compression/tests/gtest_compressionCodec.cpp:1224:12
        #3 0x14f3c7f3 in (anonymous namespace)::gtest_GorillaCodecTestCompatibility_EvalGenerator_() build_docker/../src/Compression/tests/gtest_compressionCodec.cpp:1227:1
        #4 0x14f6bdb5 in testing::internal::ParameterizedTestSuiteInfo<(anonymous namespace)::CodecTestCompatibility>::RegisterTests() build_docker/../contrib/googletest/googletest/include/gtest/internal/gtest-param-util.h:553:45
        #5 0x27d87988 in testing::internal::ParameterizedTestSuiteRegistry::RegisterTests() build_docker/../contrib/googletest/googletest/include/gtest/internal/gtest-param-util.h:726:24
        #6 0x27d87988 in testing::internal::UnitTestImpl::RegisterParameterizedTests() build_docker/../contrib/googletest/googletest/src/gtest.cc:2805:34
        #7 0x27d87988 in testing::internal::UnitTestImpl::PostFlagParsingInit() build_docker/../contrib/googletest/googletest/src/gtest.cc:5492:5
        #8 0x27d9d002 in void testing::internal::InitGoogleTestImpl<char>(int*, char**) build_docker/../contrib/googletest/googletest/src/gtest.cc:6499:22
        #9 0x14fd5495 in main build_docker/../src/Coordination/tests/gtest_coordination.cpp:2189:5
        #10 0x7f8c29005209  (/lib/x86_64-linux-gnu/libc.so.6+0x29209) (BuildId: 71a7c7b97bc0b3e349a3d8640252655552082bf5)
        #11 0x7f8c290052bb in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x292bb) (BuildId: 71a7c7b97bc0b3e349a3d8640252655552082bf5)
        #12 0x14ce356d in _start (/work1/azat/tmp/42190/unit_tests_dbms+0x14ce356d) (BuildId: 482550e3f8d45f06e8c7f8147f427ee798c1f645)

    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../src/Compression/tests/gtest_compressionCodec.cpp:1216:47 in
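
(For reference: 23 * 100000000 = 2300000000, which exceeds the maximum value of a 32-bit signed int, 2147483647, hence the signed overflow.)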

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
youennL-cs pushed a commit that referenced this pull request Nov 29, 2022
…reated

ASAN report:

    Code: 586. DB::ErrnoException: Cannot create file: /src/.clickhouse_history, errno: 2, strerror: No such file or directory. (CANNOT_CREATE_FILE)
    =================================================================
    ==1==ERROR: AddressSanitizer: heap-use-after-free on address 0x6240000208f0 at pc 0x000030d22ade bp 0x7ffff2ff3f70 sp 0x7ffff2ff3f68
    READ of size 8 at 0x6240000208f0 thread T2
        #0 0x30d22add in DB::ProcessList::insert() build_docker/../src/Interpreters/ProcessList.cpp:89:36
        #1 0x31411018 in DB::executeQueryImpl() build_docker/../src/Interpreters/executeQuery.cpp:516:60
        #2 0x3140e1ab in DB::executeQuery() build_docker/../src/Interpreters/executeQuery.cpp:1083:30
        #3 0x3364391e in DB::LocalConnection::sendQuery() build_docker/../src/Client/LocalConnection.cpp:119:21
        #4 0x3367bab0 in DB::Suggest::fetch() build_docker/../src/Client/Suggest.cpp:141:16
        #5 0x336820eb in void DB::Suggest::load<DB::LocalConnection>()::'lambda'()::operator()() const build_docker/../src/Client/Suggest.cpp:118:17

    0x6240000208f0 is located 2032 bytes inside of 7056-byte region [0x624000020100,0x624000021c90)
    freed by thread T0 here:

        #0 0xe381ef2 in operator delete(void*, unsigned long) (/wrk/clickhouse-asan+0xe381ef2) (BuildId: 6ea6d1a5d2d5a164f60f0fd8230936305bc8d9d0)
        #1 0x335509fe in DB::ClientBase::~ClientBase() build_docker/../src/Client/ClientBase.cpp:293:25
        #2 0x1f809bd5 in mainEntryClickHouseLocal(int, char**) build_docker/../programs/local/LocalServer.cpp:804:5
        #3 0xe3856ad in main build_docker/../programs/main.cpp:482:12
        #4 0x7ffff7dc0082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:308:16

    previously allocated by thread T0 here:
        #0 0xe38128d in operator new(unsigned long) (/wrk/clickhouse-asan+0xe38128d) (BuildId: 6ea6d1a5d2d5a164f60f0fd8230936305bc8d9d0)
        #1 0x2f34a7f3 in std::__1::__unique_if<DB::ContextSharedPart>::__unique_single std::__1::make_unique[abi:v15003]<DB::ContextSharedPart>() build_docker/../contrib/libcxx/include/__memory/unique_ptr.h:714:28
        #2 0x2f34a7f3 in DB::Context::createShared() build_docker/../src/Interpreters/Context.cpp:603:32
        #3 0x1f7f901d in DB::LocalServer::processConfig() build_docker/../programs/local/LocalServer.cpp:535:22
        #4 0x1f7f4d92 in DB::LocalServer::main() build_docker/../programs/local/LocalServer.cpp:419:5
        #5 0x3af24ffe in Poco::Util::Application::run() build_docker/../contrib/poco/Util/src/Application.cpp:334:8
        #6 0x1f809bca in mainEntryClickHouseLocal(int, char**) build_docker/../programs/local/LocalServer.cpp:803:20
        #7 0xe3856ad in main build_docker/../programs/main.cpp:482:12
        #8 0x7ffff7dc0082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:308:16

    Thread T2 created by T0 here:
        #0 0xe32fedc in pthread_create (/wrk/clickhouse-asan+0xe32fedc) (BuildId: 6ea6d1a5d2d5a164f60f0fd8230936305bc8d9d0)
        #1 0x336806df in std::__1::__libcpp_thread_create[abi:v15003](unsigned long*, void* (*)(void*), void*) build_docker/../contrib/libcxx/include/__threading_support:376:10
        #2 0x336806df in std::__1::thread::thread<void DB::Suggest::load<DB::LocalConnection>()::'lambda'(), void>() build_docker/../contrib/libcxx/include/thread:311:16
        #3 0x3367ff5b in void DB::Suggest::load<DB::LocalConnection>(std::__1::shared_ptr<DB::Context const>, DB::ConnectionParameters const&, int) build_docker/../src/Client/Suggest.cpp:110:22
        #4 0x3357fee9 in DB::ClientBase::runInteractive() build_docker/../src/Client/ClientBase.cpp:2066:22
        #5 0x1f7f5264 in DB::LocalServer::main() build_docker/../programs/local/LocalServer.cpp
        #6 0x3af24ffe in Poco::Util::Application::run() build_docker/../contrib/poco/Util/src/Application.cpp:334:8
        #7 0x1f809bca in mainEntryClickHouseLocal(int, char**) build_docker/../programs/local/LocalServer.cpp:803:20
        #8 0xe3856ad in main build_docker/../programs/main.cpp:482:12
        #9 0x7ffff7dc0082 in __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:308:16

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Labels: Development

2 participants