Add Live Debugger runtime benchmark coverage by watson · Pull Request #429 · DataDog/build-plugins · GitHub

Add Live Debugger runtime benchmark coverage #429

Open: watson wants to merge 1 commit into master from watson/DEBUG-5787/add-bench



watson commented Jun 22, 2026


What and why?

This adds repeatable browser runtime benchmark coverage for dormant Live Debugger instrumentation. The goal is to give Live Debugger browser changes a controlled way to track whether instrumentation changes introduce measurable overhead in real browsers.

How?

Adds a Playwright benchmark under packages/tests/src/bench/liveDebuggerRuntime with Tiny and Hot workload shapes, baseline/control/instrumented variants, and SDK-like dormant probe hooks installed in the page.
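
As a rough illustration of the three variants, the sketch below runs one workload through a baseline step, an identical control step, and an instrumented step that consults a dormant probe hook on every call. All names here (`dormantProbes`, `stepInstrumented`, `runWorkload`) are hypothetical and are not taken from this PR. The real benchmark runs bundled code inside Playwright-driven browsers, and this sketch does not model the Tiny and Hot shapes.

```typescript
// Hypothetical sketch: names and workload are illustrative, not the PR's code.
type ProbeLookup = (functionId: string) => readonly unknown[] | undefined;

// Dormant hook: installed, but no probe is active, so every lookup misses.
const dormantProbes: ProbeLookup = () => undefined;

let probeHits = 0;

// Baseline: the uninstrumented unit of work.
function stepBaseline(acc: number, i: number): number {
  return (acc + i * 31) % 1000003;
}

// Control: same body as baseline. Comparing the two (A/A) estimates the noise floor.
function stepControl(acc: number, i: number): number {
  return (acc + i * 31) % 1000003;
}

// Instrumented: same work, preceded by the per-call dormant probe check.
function stepInstrumented(acc: number, i: number): number {
  const probes = dormantProbes("stepInstrumented");
  if (probes !== undefined) {
    probeHits += probes.length; // not reached while the hook is dormant
  }
  return (acc + i * 31) % 1000003;
}

// Drives one variant for a fixed number of calls and returns a checksum,
// so the result of the loop stays observable.
function runWorkload(step: (acc: number, i: number) => number, calls: number): number {
  let acc = 0;
  for (let i = 0; i < calls; i++) {
    acc = step(acc, i);
  }
  return acc;
}
```

All three variants should produce the same checksum; only their timing is expected to differ, and the baseline-versus-control difference indicates how much of a measured gap is noise.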

Adds a custom reporter that computes conservative per-call overhead bounds, 95% confidence intervals, A/A diagnostics, moving-block bootstrap intervals, autocorrelation diagnostics, browser failures, raw JSON output, and a PR comment body.
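
For readers unfamiliar with two of the diagnostics named above, here is a minimal sketch of a lag-1 autocorrelation estimate and a moving-block bootstrap interval for the mean. It is an assumption-laden illustration, not the reporter's implementation: the function names, the percentile method, the block length, and the seeded generator are all chosen here for the example.

```typescript
// Hypothetical helpers; the PR's reporter may differ in every detail.

// Lag-1 autocorrelation: values near 0 suggest consecutive samples are roughly independent.
function acf1(xs: number[]): number {
  const n = xs.length;
  const mean = xs.reduce((a, b) => a + b, 0) / n;
  let num = 0;
  let den = 0;
  for (let i = 0; i < n; i++) {
    const d = xs[i] - mean;
    den += d * d;
    if (i > 0) num += d * (xs[i - 1] - mean);
  }
  return den === 0 ? 0 : num / den;
}

// Small seeded PRNG (mulberry32) so resampling is reproducible.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Moving-block bootstrap: resample overlapping blocks of consecutive samples
// so short-range correlation is preserved, then take percentile bounds of the means.
function movingBlockBootstrapCI(
  xs: number[],
  blockLen: number,
  resamples: number,
  rand: () => number,
): [number, number] {
  const n = xs.length;
  if (blockLen < 1 || blockLen > n) throw new RangeError("blockLen must be in 1..n");
  const starts = n - blockLen + 1; // number of overlapping blocks
  const means: number[] = [];
  for (let r = 0; r < resamples; r++) {
    let sum = 0;
    let count = 0;
    while (count < n) {
      const s = Math.floor(rand() * starts);
      for (let j = 0; j < blockLen && count < n; j++) {
        sum += xs[s + j];
        count++;
      }
    }
    means.push(sum / n);
  }
  means.sort((a, b) => a - b);
  const lo = means[Math.floor(0.025 * (resamples - 1))];
  const hi = means[Math.ceil(0.975 * (resamples - 1))];
  return [lo, hi];
}
```

When the block interval and the ordinary 95% interval roughly agree and the lag-1 autocorrelation is near zero, the samples are behaving close to independently, which is what a "clean" reading would suggest.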

Adds yarn workspace @dd/tests bench:live-debugger:runtime, a dedicated Playwright config, result artifacts, a non-blocking CI job, a shared Playwright setup action, and Live Debugger contributor docs explaining how to run and interpret the benchmark.

For more details, see the "Runtime benchmark" section in the added CONTRIBUTING.md file in this PR.

Validation

The benchmark runs in CI as a non-blocking job and uploads raw samples plus the generated report artifacts for review.

Benchmark note

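The current benchmark numbers show substantial dormant-instrumentation overhead in Safari (upper bounds of 17.29 ns per call for Hot and 13.74 ns per call for Tiny in the benchmark report on this PR). Most of this is fixed in stacked PR #438.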

watson commented Jun 22, 2026


github-actions Bot commented Jun 22, 2026


Live Debugger Runtime Benchmark

SDK-loaded dormant-probe runtime overhead, measured against an uninstrumented bundle in the same browser session.
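
To make the comparison concrete, here is a minimal sketch of turning baseline and instrumented timings into overhead figures. It assumes a symmetric trimmed mean and a known call count per sample; both are assumptions for illustration, and the function names are hypothetical rather than the reporter's.

```typescript
// Hypothetical helpers; the reporter in this PR may compute these differently.

// Mean after dropping a total fraction `trim` of samples, split evenly across both tails.
function trimmedMean(samples: number[], trim: number): number {
  if (samples.length === 0) throw new RangeError("no samples");
  if (trim < 0 || trim >= 1) throw new RangeError("trim must be in [0, 1)");
  const sorted = samples.slice().sort((a, b) => a - b);
  const k = Math.floor((sorted.length * trim) / 2);
  const kept = sorted.slice(k, sorted.length - k);
  return kept.reduce((a, b) => a + b, 0) / kept.length;
}

// Overhead of the instrumented run relative to the baseline, in percent.
function relativeOverheadPct(baselineMs: number, instrumentedMs: number): number {
  return ((instrumentedMs - baselineMs) / baselineMs) * 100;
}

// Extra cost per instrumented call in nanoseconds (1 ms = 1e6 ns).
function perCallOverheadNs(baselineMs: number, instrumentedMs: number, calls: number): number {
  return ((instrumentedMs - baselineMs) * 1e6) / calls;
}
```

Applied to the chrome Hot row of the full diagnostics (50.832 ms baseline, 53.447 ms instrumented), `relativeOverheadPct` gives about 5.14%, which is consistent with the reported upper bound of 5.21%.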

Browser  Workload  Quality  Per-call overhead upper
-------  --------  -------  -----------------------
chrome   Hot       clean                 <= 2.59 ns
chrome   Tiny      clean                 <= 0.02 ns
firefox  Hot       clean                 <= 9.70 ns
firefox  Tiny      clean                 <= 0.00 ns
safari   Hot       clean                <= 17.29 ns
safari   Tiny      clean                <= 13.74 ns

Full diagnostics
browser  workload  quality  per-call overhead upper  overhead upper           95% CI        A/A diag         block CI  acf(1)   baseline  instrumented                         samples
-------  --------  -------  -----------------------  --------------  ---------------  --------------  ---------------  ------  ---------  ------------  ------------------------------
chrome   Hot       clean                 <= 2.59 ns        <= 5.21%    2.55..2.58 ns  -0.01..0.01 ns    2.55..2.59 ns    0.04  50.832 ms     53.447 ms   102 (trim 20%, outliers 2.9%)
chrome   Tiny      clean                 <= 0.02 ns        <= 0.31%   -0.00..0.02 ns  -0.02..0.02 ns   -0.00..0.02 ns    0.03  86.865 ms     87.043 ms  102 (trim 20%, outliers 15.7%)
firefox  Hot       clean                 <= 9.70 ns       <= 38.68%    9.68..9.70 ns  -0.01..0.00 ns    9.68..9.70 ns    0.00  38.500 ms     53.400 ms   102 (trim 20%, outliers 4.9%)
firefox  Tiny      clean                 <= 0.00 ns        <= 0.00%  -0.07..-0.05 ns  -0.00..0.01 ns  -0.06..-0.05 ns   -0.22  98.240 ms     97.860 ms   102 (trim 20%, outliers 7.8%)
safari   Hot       clean                <= 17.29 ns       <= 69.14%  17.13..17.29 ns  -0.00..0.00 ns  17.13..17.29 ns   -0.10  38.400 ms     64.810 ms   102 (trim 20%, outliers 0.0%)
safari   Tiny      clean                <= 13.74 ns      <= 367.64%  13.66..13.73 ns  -0.00..0.00 ns  13.66..13.74 ns    0.13  11.740 ms     54.730 ms   102 (trim 20%, outliers 3.9%)

Raw samples are in the live-debugger-runtime-bench-results artifact.
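
The "per-call overhead upper" column is consistent with taking the larger upper end of the 95% CI and the block CI, clamped at zero. That rule is inferred from the six rows above, not from the reporter's source, so treat the sketch below as an assumption; `conservativeUpperNs` and `formatUpper` are hypothetical names.

```typescript
// Inferred rule, not the reporter's code: take the larger upper end of the
// 95% CI and the block-bootstrap CI, and never report a negative bound.
type CiRange = { lo: number; hi: number };

function conservativeUpperNs(ci: CiRange, blockCi: CiRange): number {
  return Math.max(0, ci.hi, blockCi.hi);
}

// Formats a bound like the report's "<= 2.59 ns" cells.
function formatUpper(ns: number): string {
  return `<= ${ns.toFixed(2)} ns`;
}

// chrome Hot row: 95% CI 2.55..2.58 ns, block CI 2.55..2.59 ns
const chromeHot = conservativeUpperNs({ lo: 2.55, hi: 2.58 }, { lo: 2.55, hi: 2.59 });

// firefox Tiny row: both intervals are entirely negative, so the bound clamps to zero
const firefoxTiny = conservativeUpperNs({ lo: -0.07, hi: -0.05 }, { lo: -0.06, hi: -0.05 });
```

Clamping at zero explains why the firefox Tiny row reports `<= 0.00 ns` even though its intervals sit slightly below zero.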


datadog-datadog-prod-us1-2 Bot commented Jun 22, 2026


watson force-pushed the watson/DEBUG-5787/add-bench branch 5 times, most recently from 3a9f3d6 to 1296572, on June 24, 2026 at 16:05
watson marked this pull request as ready for review on June 24, 2026 at 16:17
watson requested review from a team as code owners on June 24, 2026 at 16:17
watson force-pushed the watson/DEBUG-5787/add-bench branch 2 times, most recently from d80cc4f to ba9d257, on June 26, 2026 at 11:49

The browser Live Debugger instrumentation needs repeatable runtime overhead
checks before transform changes land.

Add an opt-in Playwright benchmark that compares baseline, control, and
instrumented workloads in the same browser session with dormant probe hooks
installed. Report conservative per-call overhead bounds with confidence
intervals, A/A diagnostics, block bootstrap checks, and PR comment output.

Wire the benchmark into CI as a non-blocking job, share Playwright setup between
jobs, upload raw samples as artifacts, and document how contributors should run
and interpret the benchmark.

watson force-pushed the watson/DEBUG-5787/add-bench branch from ba9d257 to 0168305 on July 1, 2026 at 13:03