QoS / Kafka Topic Prioritization w/ ClickHouse Kafka Connect Sink #337
-
Our use case puts Kafka in front of ClickHouse as an intermediate step for our events, to provide backpressure and avoid causing performance issues on the ClickHouse cluster for other transactions. We are currently running self-managed ClickHouse and Kafka clusters. We have no issues publishing our events to Kafka, but we are curious whether we can use the CH sink (https://www.confluent.io/hub/clickhouse/clickhouse-kafka-connect) to move these over to CH. We have one tricky requirement: certain events/topics need to take priority over others, sort of like quality of service. Do you know how we could approach using the sink to satisfy this requirement? We were thinking of maybe running multiple instances of the sink with different configurations to cover the different priorities/QoS across the events/topics. As a fallback, we could certainly write our own custom Kafka consumer, but we are trying to avoid that complexity if possible. Any thoughts or ideas?
Replies: 10 comments 12 replies
-
Hmm - in the sense that high priority messages in the queue need to be inserted before any others? Or something else?
-
Basically, would it be possible to process certain topics at a higher throughput than the lower-priority topics? Our thought was configuring multiple CH Kafka Connect sinks and giving the higher-priority topics more partitions/tasks/threads for greater throughput to CH. Is this possible, and if so, can you share how to configure the CH Kafka Connect sinks to accomplish it?
-
We have about 5-6 TB/day of traffic in production. So, what might be a good number of workers to use? We can start with two connectors, high and low priority.
The following is our configuration, so I am guessing tasks.max is at least one of the settings we need to change:

```properties
connector.class=com.clickhouse.kafka.connect.ClickHouseSinkConnector
tasks.max=1
topics=<topic_name>
ssl=true
jdbcConnectionProperties=?sslmode=STRICT
security.protocol=SSL
hostname=<hostname>
database=<database_name>
password=<password>
ssl.truststore.location=/tmp/kafka.client.truststore.jks
port=8443
value.converter.schemas.enable=false
value.converter=org.apache.kafka.connect.json.JsonConverter
exactlyOnce=true
username=default
schemas.enable=false
```
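As a rough sanity check on the sizing question, the daily volume converts to a per-second rate like this. The ~1 KiB average message size is purely an assumed figure for illustration; the real value depends on the events:

```python
# Back-of-envelope sizing for the 5-6 TB/day figure above.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def daily_tb_to_mb_per_sec(tb_per_day: float) -> float:
    """Convert a daily volume in TB (decimal) to an average rate in MB/s."""
    return tb_per_day * 1_000_000 / SECONDS_PER_DAY

rate_mb_s = daily_tb_to_mb_per_sec(6)  # upper end of the 5-6 TB/day range
# Assumed ~1 KiB average message size -- replace with a measured value.
msgs_per_sec = rate_mb_s * 1_000_000 / 1024

print(f"~{rate_mb_s:.1f} MB/s, ~{msgs_per_sec:,.0f} messages/s")
```

At 6 TB/day this works out to roughly 69 MB/s sustained, which is where the "~70MB/second" figure later in the thread comes from.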
On Sat, Mar 2, 2024 at 1:01 AM Paultagoras wrote:

> Well there's kind of a few different ways to do this, I think. You can have multiple instances of the connector running at once, covering multiple topics, so the simplest way would be to have:
>
> - High Priority Connector instance (for all of the High Priority Topics) w/ some high number of workers (it's customizable)
> - Low Priority Connector instance (for everything else) w/ some lower number of workers
>
> It kind of depends on how much data are you thinking about - do you have an estimate in mind?
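To make the high/low priority split concrete, here is a minimal sketch of the two connector instances as separate config files. The names, topic lists, and task counts are hypothetical placeholders, not recommendations; the remaining keys (hostname, credentials, converters, etc.) would mirror the configuration shown earlier in this thread:

```properties
# High priority connector instance (topic names are hypothetical)
name=clickhouse-sink-high-priority
connector.class=com.clickhouse.kafka.connect.ClickHouseSinkConnector
topics=orders,payments
# roughly 1:1 with the total partition count of the high priority topics
tasks.max=12
```

```properties
# Low priority connector instance
name=clickhouse-sink-low-priority
connector.class=com.clickhouse.kafka.connect.ClickHouseSinkConnector
topics=clickstream,audit
tasks.max=3
```

Because each connector gets its own consumer group and task pool, the high-priority instance simply has more parallel capacity into ClickHouse.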
-
Good question on # of messages. I have an email out to the ops team on this, but we might have to gather it as part of load testing.
Thanks for sharing these docs. I will take a look at those, and you are right, we will need to figure out the Confluent/Kafka config first (e.g., partitions), which will inform things like tasks.max.
On Sun, Mar 3, 2024 at 4:16 AM Paultagoras wrote:

> Interesting, so roughly 70MB/second? Do you know how many messages that would be, roughly?
> tasks.max is definitely one - the general recommendation is 1:1 with Partitions:Tasks. Another is to set consumer.override.max.poll.records=5000 to better batch records (the default is 500).
> We actually happen to have a few different docs around this sort of thing - https://clickhouse.com/blog/measure-visaualize-minimize-kafka-latency-clickhouse and https://clickhouse.com/docs/en/integrations/kafka/clickhouse-kafka-connect-sink#tuning-performance come to mind, but I'm sure there are others as well. Might be worth taking a peek, though; they also have links to other docs (like from Confluent) about Kafka Connect and optimizing connectors that are worthwhile to read.
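Putting those two suggestions into the connector config might look like the following sketch. Note that the consumer.override.* prefix only takes effect if the Connect worker allows client overrides (connector.client.config.override.policy=All in the worker config); the numbers are starting points to load-test, not tested recommendations:

```properties
# one task per topic partition, per the 1:1 recommendation
tasks.max=12
# fetch larger batches per poll than the 500-record default
consumer.override.max.poll.records=5000
```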
-
Got it, so we will need to configure multiple connectors if we want some with exactly-once (EO) delivery and some with at-least-once (AO).
On Wed, Mar 6, 2024 at 12:46 PM Paultagoras wrote:

> EO is pretty use-case specific, since in most cases "at-least-once" is more than enough. That said, enabling it happens at the connector level and applies to all topics that connector handles.
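Since exactlyOnce is a connector-level flag, mixing delivery guarantees means one connector per guarantee. A sketch, with hypothetical topic names:

```properties
# Connector A: exactly-once for the critical topics
topics=payments
exactlyOnce=true
```

```properties
# Connector B: at-least-once (the connector default) for everything else
topics=clickstream
exactlyOnce=false
```

This composes naturally with the high/low priority split discussed above, since both already require separate connector instances.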
-
I really appreciate all your help. We have been researching guidance on partitions and tasks on the Confluent/Kafka side, which is lining up with what you have suggested.
The whole point of fronting CH with Kafka is to implement backpressure in case we get a really high number of events and don't want to overwhelm CH. Is there some way to get the current load or number of connections/transactions on CH? If there is, we could use that value to throttle the messages getting processed in Kafka before sending to CH.
I know this can't be managed in the sink connector, but we are planning on writing our own streaming service to work with the sink.
-
Makes sense. Can the Kafka Connect sink support batch or asynchronous inserts on the CH side?
On Mon, Mar 11, 2024 at 5:44 AM Paultagoras wrote:

> CH itself can handle a large volume of data; it's generally more a question of batches - large groups, rather than frequent small inserts. That's why I was curious about how many messages you expected 🙂
> As far as connections/transactions - I believe if you go to the instance details page, it has the various metrics around that (or leads to more, depending).
-
Actually, after further thought, it seems batch inserts are most likely what we are looking for versus async. That is, if we can control when the Kafka Connect sink will flush records to CH, based on either # of events or time. Is that possible?
On Thu, Mar 14, 2024 at 9:22 AM Paultagoras wrote:

> You could either enable it on the user, or pass it as a ClickHouse setting - see https://clickhouse.com/docs/en/optimize/asynchronous-inserts#enabling-asynchronous-inserts
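For the "pass it as a ClickHouse setting" route, one option (assuming your connector version exposes the clickhouseSettings option; verify against the sink docs for your release) is to forward the async-insert flags with the inserts:

```properties
# Assumption: clickhouseSettings takes comma-separated key=value pairs
# that are applied to each insert issued by the sink.
clickhouseSettings=async_insert=1,wait_for_async_insert=1
```

The alternative Paul mentions, enabling async_insert on the ClickHouse user itself, needs no connector changes at all.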
-
Nice, fetch-min-bytes <https://docs.confluent.io/platform/current/installation/configuration/consumer-configs.html#fetch-min-bytes> looks to be what we need on # of events. I am not sure about fetch-max-wait-ms <https://docs.confluent.io/platform/current/installation/configuration/consumer-configs.html#fetch-max-wait-ms>, as these are still really low values. Certain events/topics won't see as many messages per second as others, so we could have scenarios where a fairly small batch is sitting there and we would ideally want to wait for more records to accumulate for that topic, but not past a max time, maybe 3-5 minutes. This is obviously much longer than the ms values of fetch-max-wait-ms, so wondering how we might be able to support that?
On Fri, Mar 15, 2024 at 8:48 AM Paultagoras wrote:

> So fetch-max-wait-ms defaults to 500ms (half a second), so you would want to set it to maybe 1000ms or 2000ms (you can play around with the settings; I would just be careful of heartbeat.interval.ms <https://docs.confluent.io/platform/current/installation/configuration/consumer-configs.html#heartbeat-interval-ms> because fetches that take too long might cause issues).
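Expressed as consumer overrides on the connector, the two settings discussed above would look something like this. The values are illustrative; as cautioned above, pushing the wait time much higher risks interacting badly with the consumer's heartbeat/session timeouts, which is why multi-minute accumulation windows point toward a custom consumer instead:

```properties
# wait until at least ~1 MB of records has accumulated per fetch...
consumer.override.fetch.min.bytes=1048576
# ...but never wait longer than 2 seconds (the broker default is 500 ms)
consumer.override.fetch.max.wait.ms=2000
```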
-
You're the man, Paul, and you have been a huge help! We will test this out and let you know how it goes. Unfortunately, we got a lot of snow out here in Denver, so I am going to do some shoveling and may not get to this until next week. If that is the case, have a great weekend!
On Fri, Mar 15, 2024 at 9:15 AM Paultagoras wrote:

> Correcting my answer slightly - I believe heartbeat and poll are separate threads. I think for your use case, it would be worth setting fetch-min-bytes and fetch-max-wait-ms to a higher value and seeing if that suits 🙂