Auto-enable op-tracing via per-model op_tracing_targets by xieofxie · Pull Request #1032 · microsoft/winml-cli · GitHub
Skip to content

Auto-enable op-tracing via per-model op_tracing_targets#1032

Merged
xieofxie merged 2 commits into
mainfrom
hualxie/add_tracing_flag
Jul 3, 2026
Merged

Auto-enable op-tracing via per-model op_tracing_targets#1032
xieofxie merged 2 commits into
mainfrom
hualxie/add_tracing_flag

Conversation

@xieofxie

@xieofxie xieofxie commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds per-model auto-enable of op-tracing in the e2e eval runner. When run_eval.py runs without an explicit --op-tracing, a model whose registry entry opts into the current <EP>_<device> target now gets basic op-tracing automatically.

Changes

  • scripts/e2e_eval/utils/registry.py — add op_tracing_targets: list[str] field to ModelEntry and load it in load_registry.
  • scripts/e2e_eval/run_eval.py
    • _op_tracing_target_key(ep, device) builds the match key, e.g. QNNExecutionProvider_npu.
    • _resolve_op_tracing(cli, entry, ep, device) resolves the effective level: disabled forces off (even for opted-in models); basic/detail used as-is; unset defaults to basic when the target key is in the model's op_tracing_targets, otherwise off.
    • Add disabled to --op-tracing choices with expanded help text.
    • Compute effective op_tracing per model in the run loop and pass it to both run_model call sites.
  • scripts/e2e_eval/testsets/models_all.json — add "op_tracing_targets": ["QNNExecutionProvider_npu"] to the 10 profiled models.

Behavior

--op-tracing Model opts in (op_tracing_targets matches) Result
omitted yes basic
omitted no off
basic/detail any as specified
disabled any off

Add an op_tracing_targets list to model registry entries so op-tracing turns on automatically for a given <EP>_<device> target (e.g. QNNExecutionProvider_npu) unless --op-tracing is explicitly set. Add 'disabled' choice to force it off.
@xieofxie xieofxie requested a review from a team as a code owner July 3, 2026 02:46

@DingmaomaoBJTU DingmaomaoBJTU left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the per-model auto-enable op-tracing change. The approach is clean overall: tracing runs as a separate additive pass so it doesn't skew the benchmark, and disabled gives an escape hatch. A few things worth addressing, the main one being the device=auto matching gap. Details inline.

Comment thread scripts/e2e_eval/run_eval.py Outdated
Comment thread scripts/e2e_eval/run_eval.py
Comment thread scripts/e2e_eval/testsets/models_all.json
Comment thread scripts/e2e_eval/utils/registry.py Outdated
… add tests

- Centralize the <EP>_<device> key builder in registry (op_tracing_target_key) and reuse it in run_eval so there is a single source of truth.
- Normalize op_tracing_targets on load so the registry can use the short EP alias (qnn_npu) consistent with the rest of the eval tooling; both alias and full name map to the canonical QNNExecutionProvider_npu.
- Document that auto-enable requires explicit --ep and --device (default --device auto does not match a device-specific target).
- Add unit tests for _resolve_op_tracing (disabled-overrides-optin, explicit levels, auto-device no-match, no-ep) and the key builder/normalizer.

@DingmaomaoBJTU DingmaomaoBJTU left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed the follow-up commit (bbef45c). Thanks for the quick turnaround — all four points are addressed well:

  1. device=auto gap — not silently ignored anymore: documented in the _resolve_op_tracing docstring, the --op-tracing help text, and pinned by test_unset_device_auto_does_not_match. Reasonable to require explicit --ep/--device rather than guessing.
  2. TestsTestResolveOpTracing + TestOpTracingTargetKey cover disabled-overrides-optin, explicit levels, matching/non-matching, auto-device, and no-ep. I ran the file locally: 57 passed, ruff clean.
  3. Naming — centralizing on op_tracing_target_key / _canonical_ep_device_key and accepting both qnn_npu and the full form (JSON now uses the short alias, consistent with the pair catalog) resolves the two-conventions concern. Key builder and normalizer share one code path, so they can't drift.
  4. Validation — normalization handles case/alias variance; one residual gap noted inline (non-blocking).

Looks good to merge once you've had a look at the inline note.

Comment thread scripts/e2e_eval/utils/registry.py

@DingmaomaoBJTU DingmaomaoBJTU left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving. All four review points from the prior round are addressed; op-tracing runs as a separate additive pass so it does not skew the benchmark, disabled gives a clean override, and the key builder/normalizer share one code path. Verified locally: 57 tests pass in tests/unit/eval/test_run_eval_script.py and ruff is clean. The remaining registry-validation note is non-blocking (every current target is qnn_npu and CI pins --device npu).

@xieofxie xieofxie merged commit b95fd48 into main Jul 3, 2026
9 checks passed
@xieofxie xieofxie deleted the hualxie/add_tracing_flag branch July 3, 2026 07:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants