Add support for <url_prefix> in http handler url by vitlibar · Pull Request #107492 · ClickHouse/ClickHouse · GitHub
Skip to content

Add support for <url_prefix> in http handler url#107492

Merged
vitlibar merged 11 commits into
ClickHouse:masterfrom
vitlibar:prefix-in-http-handler-url
Jun 20, 2026
Merged

Add support for <url_prefix> in http handler url#107492
vitlibar merged 11 commits into
ClickHouse:masterfrom
vitlibar:prefix-in-http-handler-url

Conversation

@vitlibar

@vitlibar vitlibar commented Jun 15, 2026

Copy link
Copy Markdown
Member

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Add support for new settings <url_prefix>, <full_url_prefix> in http handler url:

For example

<clickhouse>
    <http_handlers>
        <rule>
            <url_prefix>/test_prefix</url_prefix>
            ...

Here <url_prefix>/test_prefix</url_prefix> means all paths starting with /test_prefix are served.

The functionality is required for example for prometheus http handlers, where a whole bunch of paths starting with some base path is usually served.

Also for consistency added syntax <url_regexp>, <full_url_regexp>, <headers_regexp> (older syntax <url>regex: still works).

Version info

  • Merged into: 26.6.1.1070

@clickhouse-gh

clickhouse-gh Bot commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

@clickhouse-gh clickhouse-gh Bot added the pr-feature Pull request with new product feature label Jun 15, 2026
Comment thread src/Server/HTTPHandlerRequestFilter.h Outdated

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds prefix: matching support for custom http_handlers URL rules, enabling handlers to be selected by a base-path prefix on path-segment (/) boundaries (and ignoring the query string).

Changes:

  • Implement prefix: parsing/matching for url/full_url request filters and move filter implementations out of the header into a .cpp.
  • Add an integration test and per-test server config to validate url prefix matching behavior.
  • Update server config comments and docs to describe the new matching mode.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tests/integration/test_http_handlers_config/test.py Adds integration coverage for url prefix: matching behavior.
tests/integration/test_http_handlers_config/test_url_prefix_handler/config.xml New test config enabling a prefix: url rule with a static handler.
src/Server/ReplicasStatusHandler.cpp Drops an unnecessary include after filter code moved out of the header.
src/Server/KeeperHTTPHandlerFactory.cpp Adjusts includes to avoid relying on transitive includes from the filter header.
src/Server/HTTPHandlerRequestFilter.h Refactors filter utilities into declarations + documentation of prefix: semantics.
src/Server/HTTPHandlerRequestFilter.cpp New implementation file: adds prefix: support and centralizes filter logic.
src/Server/HTTPHandler.cpp Adjusts includes to avoid relying on transitive includes from the filter header.
programs/server/config.xml Documents prefix: support in the example http_handlers configuration comment.
docs/en/operations/server-configuration-parameters/_server_settings_outside_source.md Documents prefix: behavior for http_handlers.

Comment thread programs/server/config.xml Outdated
Comment thread tests/integration/test_http_handlers_config/test.py
@vitlibar vitlibar force-pushed the prefix-in-http-handler-url branch from 813c6cd to 252d374 Compare June 15, 2026 12:12

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment thread programs/server/config.xml Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Comment thread src/Server/HTTPHandlerRequestFilter.h Outdated
Comment thread programs/server/config.xml Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@vitlibar vitlibar requested a review from Copilot June 15, 2026 13:32

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@vitlibar

Copy link
Copy Markdown
Member Author

@azat Can you please take a look?

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Comment thread src/Server/KeeperHTTPHandlerFactory.cpp
@vitlibar vitlibar requested a review from azat June 15, 2026 13:51
@vitlibar

Copy link
Copy Markdown
Member Author

@groeneai Investigate CI failures

@vitlibar

Copy link
Copy Markdown
Member Author

@azat azat self-assigned this Jun 16, 2026
| Sub-tags | Definition |
|----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------|
| `url` | To match the request URL, you can use the 'regex:' prefix to use regex match (optional) |
| `url` | To match the request URL path. The query string is ignored when matching. You can use the 'regex:' prefix for a regex match, or the 'prefix:' prefix to match a base path and everything below it on a path-segment boundary (e.g. 'prefix:/api/v1' matches /api/v1, /api/v1/ and /api/v1/write, but not /api/v1beta) (optional) |

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think better to introduce separate url_prefix, although we already have regex:, and not clear why we went this way.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll change approach here from <url>prefix:/api/v1</url> to <url_prefix>/api/v1</url_prefix>, and the same for <full_url>

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks
Maybe while you are here you can also add a proper "url_regexp" (and mark existing as deperecated in the doc)

"test_prefix/write", # a sub-path
"test_prefix/a/b/c", # a deeper sub-path
"test_prefix?param=value", # query string is ignored
"test_prefix/write?param=value",

@azat azat Jun 16, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should /test_prefixfoo work? I guess no. Make sense to add a test

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, /test_prefixfoo doesn't match here, I'll add a test

Comment thread src/Server/HTTPHandlerRequestFilter.cpp Outdated
@vitlibar vitlibar changed the title Add support for "prefix:" in http handler url Add support for <url_prefix> in http handler url Jun 16, 2026
@vitlibar vitlibar requested a review from Copilot June 16, 2026 18:37

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.

Comment thread src/Server/HTTPHandlerRequestFilter.cpp
Comment thread src/Server/HTTPHandlerRequestFilter.cpp
Comment thread src/Server/HTTPHandlerRequestFilter.h Outdated
Comment thread programs/server/config.xml
Comment thread src/Server/HTTPHandlerRequestFilter.cpp Outdated
@vitlibar vitlibar force-pushed the prefix-in-http-handler-url branch from 3ceb5a5 to cfea92e Compare June 16, 2026 20:27
Comment thread src/Server/HTTPHandlerFactory.h
Comment thread src/Server/HTTPHandlerRequestFilter.cpp Outdated

auto regex = compileRegex(expression);
if (hasAnyOfCapturingGroups(regex, group_names))
result.headers_name_with_regex.emplace(header_name, regex);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

headersFilter matches missing headers against an empty string via request.get(header_name, ""), so a predefined handler with <headers_regex><X>(?P<name>.*)</X></headers_regex> legitimately matches a request with no X header. This line then records that regex for PredefinedQueryHandler::customizeContext, but that path reads the header with request.get(header_name) and raises an exception after the rule has already matched. Please use the same defaulted lookup in the extraction path (or reject empty-matching header regexes) and add a small predefined-handler regression test with an absent header.

@vitlibar vitlibar Jun 17, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't touched headersFilter or PredefinedQueryHandler::customizeContext() in this PR, so I'll leave this for another PR

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix: #108324

@vitlibar

Copy link
Copy Markdown
Member Author

@groeneai Investigate CI failures

@groeneai

Copy link
Copy Markdown
Contributor

@vitlibar All 3 CI failures on 80f53ac are pre-existing, master-wide fuzzer/stress findings. None are caused by this PR. The diff touches only HTTP-handler files, and none of the three crash paths involve HTTP handling.

Stress test (amd_tsan) - Logical error: 'Unexpected exception in refresh scheduling' (STID 2508-3e7b). Crash in BackgroundSchedulePool refresh scheduling (Refreshable MV). Tracked in #106737 (open). The accompanying "Cannot start clickhouse-server" and "Check failed" rows are downstream of that abort.
report

Stress test (arm_debug) - Logical error: Not-ready Set is passed as the second argument for function (STID 0250-4e52), in FilterTransform::transform. Known recurring planner / Set-readiness family. Tracked in #102803 (open).
report

Stress test (arm_asan_ubsan) - Hung check failed, possible deadlock found. Stack is AST-fuzzer query execution (executeASTFuzzerQueries -> TCPHandler::runImpl), no HTTP-handler frames.
report

No SIGSEGV / heap-UAF among them. The red Mergeable Check is only these 3 stress jobs. All three are nondeterministic fuzzer/stress crashes outside this PR's code path, so re-running the stress jobs should clear them.

(session cron:clickhouse-worker-slot-2:20260617-202500)

@vitlibar vitlibar requested a review from azat June 19, 2026 12:58

@azat azat left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from handlers order looks OK

Comment thread src/Server/HTTPHandlerRequestFilter.cpp Outdated
Comment on lines +214 to +218
else if (filter_type == "url_regex")
filters.push_back(urlFilter(config, config_prefix + ".url_regex", HTTPRequestFilterMatchType::Regex));
/// URL path matched as a base path
else if (filter_type == "url_prefix")
filters.push_back(urlFilter(config, config_prefix + ".url_prefix", HTTPRequestFilterMatchType::Prefix));

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the "url_prefix" should goes before "url_regex", since AFAIU this defines order in which handlers will be handled, and prefix should be handled before (AFAIR nginx does the same)

Basically I would say "url_regex" should be the last (also looks like more common term is "regexp" in clickhouse)

@vitlibar vitlibar Jun 19, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rules are checked in the same order as they appear in the config. The order of checking conditions like filter_type == url_prefix or filter_type == url_regex doesn't really matter here. I'll rename url_regex to url_regexp

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I mean imagine we have the following in config:

<http_handlers>
    <rule>
        <methods>GET</methods>
        <url_regex>/test_regex/[0-9]+</url_regex>
        <handler>
            <type>static</type>
            <status>200</status>
            <response_content>regex handler matched</response_content>
        </handler>
    </rule>
    <rule>
        <methods>GET</methods>
        <url_prefix>/test_regex</url_prefix>
        <handler>
            <type>static</type>
            <status>200</status>
            <response_content>handler matched</response_content>
        </handler>
    </rule>
</http_handlers>

In this case for "/test/regex" I would expect that the "url_prefix" will match

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation says:

Rules are checked from top to bottom as defined, and the first match will run the handler.

So in your example the first rule (regex handler matched) will work

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fine with me them.

@clickhouse-gh

clickhouse-gh Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

LLVM Coverage Report

Metric Baseline Current Δ
Lines 85.20% 85.20% +0.00%
Functions 92.50% 92.50% +0.00%
Branches 77.40% 77.40% +0.00%

Changed lines: Changed C/C++ lines covered by tests: 142/153 (92.81%) | Lost baseline coverage (was covered on master, now uncovered in this PR): 1 line(s) · Uncovered code

Full report · Diff report

@vitlibar

vitlibar commented Jun 20, 2026

Copy link
Copy Markdown
Member Author

@vitlibar vitlibar added this pull request to the merge queue Jun 20, 2026
Merged via the queue into ClickHouse:master with commit b65793f Jun 20, 2026
164 of 166 checks passed
@vitlibar vitlibar deleted the prefix-in-http-handler-url branch June 20, 2026 14:14
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jun 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-feature Pull request with new product feature pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants