Docs: Add `asciiCJK` tokenizer to text index DDL example by rschu1ze · Pull Request #101993 · ClickHouse/ClickHouse · GitHub

Docs: Add asciiCJK tokenizer to text index DDL example #101993

Merged
rschu1ze merged 4 commits into master from docs-asciicjk on Apr 8, 2026

Conversation

@rschu1ze

@rschu1ze rschu1ze commented Apr 7, 2026

Member

Changelog category (leave one):

  • Documentation (changelog entry is not required)

Version info

  • Merged into: 26.4.1.667

rschu1ze changed the title from "Docs: Add asciiCJI to text index DDL example" to "Docs: Add asciiCJI tokenizer to text index DDL example" Apr 7, 2026
@clickhouse-gh

clickhouse-gh Bot commented Apr 7, 2026

Contributor

tokenizer = splitByNonAlpha
| splitByString[(S)]
| ngrams[(N)]
| asciiCJK

Member Author

This and the next line are the relevant changes (everything else is cosmetics).
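
For context, a text index that selects the new tokenizer might be declared roughly as follows. This is a minimal sketch, assuming the `text(tokenizer = ...)` index syntax that the snippet above is taken from; the table, column, and index names are hypothetical, and depending on the server version text indexes may first need to be enabled.

```sql
-- Hypothetical table, column, and index names, for illustration only.
CREATE TABLE docs_example
(
    id UInt64,
    body String,
    -- asciiCJK is written without arguments, matching the grammar in the snippet above.
    INDEX body_idx(body) TYPE text(tokenizer = asciiCJK)
)
ENGINE = MergeTree
ORDER BY id;
```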

clickhouse-gh Bot added the pr-documentation label (Documentation PRs for the specific code PR) Apr 7, 2026
alexey-milovidov changed the title from "Docs: Add asciiCJI tokenizer to text index DDL example" to "Docs: Add asciiCJK tokenizer to text index DDL example" Apr 7, 2026
@alexey-milovidov

This comment was marked as resolved.

@alexey-milovidov

This comment was marked as resolved.

@rschu1ze

rschu1ze commented Apr 8, 2026

Member Author

Merging from master to get rid of unrelated DocsCheck errors.

@rschu1ze

rschu1ze commented Apr 8, 2026

Member Author

@Blargian Do you have an idea why the Docs Check failed in this PR?

[2026-04-08 11:02:57]     | [ERROR] Error: Unable to build website for locale en.
[2026-04-08 11:02:57]     |     at tryToBuildLocale (/opt/clickhouse-docs/node_modules/@docusaurus/core/lib/commands/build/build.js:78:15)
[2026-04-08 11:02:57]     |     at async /opt/clickhouse-docs/node_modules/@docusaurus/core/lib/commands/build/build.js:34:9
[2026-04-08 11:02:57]     |     ... 4 lines matching cause stack trace ...
[2026-04-08 11:02:57]     |     at async file:///opt/clickhouse-docs/node_modules/@docusaurus/core/bin/docusaurus.mjs:44:3 {
[2026-04-08 11:02:57]     |   [cause]: Error: Docusaurus found broken anchors!
[2026-04-08 11:02:57]     |   
[2026-04-08 11:02:57]     |   Please check the pages of your site in the list below, and make sure you don't reference any anchor that does not exist.
[2026-04-08 11:02:57]     |   Note: it's possible to ignore broken anchors with the 'onBrokenAnchors' Docusaurus configuration, and let the build pass.
[2026-04-08 11:02:57]     |   
[2026-04-08 11:02:57]     |   Exhaustive list of all broken anchors found:
[2026-04-08 11:02:57]     |   - Broken anchor on source page path = /docs/materialized-view/refreshable-materialized-view:
[2026-04-08 11:02:57]     |      -> linking to /docs/sql-reference/statements/system#refreshable-materialized-views
[2026-04-08 11:02:57]     |   
[2026-04-08 11:02:57]     |       at throwError (/opt/clickhouse-docs/node_modules/@docusaurus/logger/lib/logger.js:80:11)
[2026-04-08 11:02:57]     |       at reportBrokenLinks (/opt/clickhouse-docs/node_modules/@docusaurus/core/lib/server/brokenLinks.js:254:49)
[2026-04-08 11:02:57]     |       at handleBrokenLinks (/opt/clickhouse-docs/node_modules/@docusaurus/core/lib/server/brokenLinks.js:282:5)

The failure looks unrelated to the PR but it is also persistent (I merged from master already).

@Blargian

Blargian commented Apr 8, 2026

Member

I needed to turn off the link checker so I can get around the chicken-or-egg problem here: #101611, but I'm not quite sure why it's now complaining about this link specifically if neither of those docs were modified. Fixing here in any case: ClickHouse/clickhouse-docs#5968

@Blargian

Blargian commented Apr 8, 2026

Member

@rschu1ze rerun of the docs check should do the trick

@clickhouse-gh

clickhouse-gh Bot commented Apr 8, 2026

Contributor

LLVM Coverage Report

Metric     Baseline  Current  Δ
Lines      83.90%    83.90%   +0.00%
Functions  90.90%    90.90%   +0.00%
Branches   76.40%    76.40%   +0.00%

Changed lines: 100.00% (38/38)

@rschu1ze rschu1ze added this pull request to the merge queue Apr 8, 2026
Merged via the queue into master with commit 4aa55a2 Apr 8, 2026
161 of 163 checks passed
@rschu1ze rschu1ze deleted the docs-asciicjk branch April 8, 2026 16:46
robot-ch-test-poll4 added the pr-synced-to-cloud label (The PR is synced to the cloud repo) Apr 8, 2026

Labels

pr-documentation: Documentation PRs for the specific code PR
pr-synced-to-cloud: The PR is synced to the cloud repo

6 participants