StorageKafka: extended configuration, parallel consumers, offset tracking by vavrusa · Pull Request #1654 · ClickHouse/ClickHouse · GitHub
Skip to content

StorageKafka: extended configuration, parallel consumers, offset tracking#1654

Merged
alexey-milovidov merged 7 commits into
ClickHouse:masterfrom
vavrusa:master
Dec 20, 2017
Merged

StorageKafka: extended configuration, parallel consumers, offset tracking#1654
alexey-milovidov merged 7 commits into
ClickHouse:masterfrom
vavrusa:master

Conversation

@vavrusa

@vavrusa vavrusa commented Dec 16, 2017

Copy link
Copy Markdown
Contributor

This contains many fixes and corrections for the Kafka engine.
Most notably it now supports extended configuration similarly to GraphiteMergeTree.
Now it also allows specification of consumer count to parallelize consumption of
multiple partitions both in materialized views and in SELECT queries.

The offsets are now committed in the insertSuffix() method after all rows
are successfully read. If an exception is thrown during reading, affected consumer
unsubscribes from all assignments and rejoins the consumer group to rewind offsets.
This means that the consumer won't lose messages in case of write failures.

The 07f8502 fixes build with clang 5.0 on macOS, however it also needs confluentinc/librdkafka#1592 to be merged first and then vendored.

This should fix #1402 and #1625

@vavrusa

vavrusa commented Dec 18, 2017

Copy link
Copy Markdown
Contributor Author

@alexey-milovidov

Copy link
Copy Markdown
Member

Cool!
I will glance through the modifications right now...

Comment thread dbms/src/Storages/StorageKafka.h Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have support for multiple consumers right now?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I ended up adding it. Let me update the comment.

Comment thread dbms/src/Storages/StorageKafka.cpp Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strange code.
It's not guaranteed, that capacity is equal to the value that was requested in reserve.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, let me fix that.

Comment thread dbms/src/Storages/StorageKafka.cpp Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1 (with intention to implicitly convert to max long) looks weird.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree 👍

Comment thread dbms/src/Storages/StorageKafka.cpp Outdated

@alexey-milovidov alexey-milovidov Dec 19, 2017

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it guaranteed that there is at least one available consumer at this point?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be guaranteed by the semaphore. There are as many semaphore slots as there are consumers. If the code reaches this line, it means that semaphore reserved at least one slot for this caller (otherwise it would throw an exception or returned nullptr).

@alexey-milovidov alexey-milovidov left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

- Make sure to use dynamic linking on macOS to avoid OpenSSL static linking bug
- Use -std=c++17 in CFLAGS for files in contrib
- Avoid bad support for thread_local on macOS with clang altogether
Previously the dependencies were updated only on DROP TABLE, so
detaching a materialized view and inserting to source table thrown an exception.
…king

This contains many fixes and corrections for the Kafka engine.
Most notably it now supports extended configuration similarly to GraphiteMergeTree.
Now it also allows specification of consumer count to parallelize consumption of
multiple partitions both in materialized views and in SELECT queries.

The offsets are now committed in the insertSuffix() method after all rows
are successfully read. If an exception is thrown during reading, affected consumer
unsubscribes from all assignments and rejoins the consumer group to rewind offsets.
This means that the consumer won't lose messages in case of write failures.
@vavrusa

vavrusa commented Dec 20, 2017

Copy link
Copy Markdown
Contributor Author

@alexey-milovidov alexey-milovidov merged commit fd260c3 into ClickHouse:master Dec 20, 2017
Ergus pushed a commit to Ergus/ClickHouse that referenced this pull request Jun 29, 2026
…es" (ClickHouse#105591)

This reverts the merge of PR ClickHouse#105591 (commit d562138), reversing changes
made to 6519f67.

ClickHouse#105591 broke 02354_vector_search_rescoring on master across all build configs
with a deterministic Code 44 ILLEGAL_COLUMN ("The _distance column is an
internal virtual column of vector search and cannot be referenced directly").

Root cause is a base-skew semantic conflict between two PRs merged 65 minutes
apart, neither of which saw the other in CI:
- ClickHouse#107985 (merged 10:05 UTC) added a hard ILLEGAL_COLUMN guard in both passes
  of useVectorSearch.cpp, making _distance internal-only. It closed customer
  incident clickhouse-core-incidents#1654 and added test
  02354_vector_search_incident1654 asserting the guard.
- ClickHouse#105591 (merged 11:10 UTC) added a feature that deliberately supports
  referencing _distance directly, with test queries and reference output that
  expect it to succeed.

The guard from ClickHouse#107985 pre-empts ClickHouse#105591's new code path, so ClickHouse#105591's own
tests fail on master. Reverting ClickHouse#105591 (the later, feature PR) restores master
to green while preserving the customer-incident ClickHouse#1654 guard. The product
decision of whether _distance should be user-referenceable is left to the
vector-search owners to reconcile and re-land.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SSL configuration for Kafka engine

2 participants