fix(langfuse): resolve trace user from runtime context by d33kayyy · Pull Request #3794 · bytedance/deer-flow · GitHub

fix(langfuse): resolve trace user from runtime context#3794

Open
d33kayyy wants to merge 1 commit into bytedance:main from d33kayyy:fix/langfuse-generation-trace-user-id

Conversation

@d33kayyy


Why

Background / gateway agent runs were recording the wrong user on their Langfuse trace. The trace's langfuse_user_id is built from get_effective_user_id(), which reads the request-scoped _current_user ContextVar. When a run is invoked over an internal token on behalf of an end user, that ContextVar never holds the end user, so every such run's trace was attributed to langfuse_user_id="default".

That makes per-user filtering / cost attribution in Langfuse useless for the entire background-run population: distinct end users all collapse onto one default bucket.
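The collapse described above can be reproduced in isolation. This is a minimal sketch, not the project's code: `get_effective_user_id_sketch` and `old_trace_user_id` are hypothetical stand-ins, and the assumption that the real helper returns "default" when the ContextVar is unset is taken from the behavior described in this PR.

```python
from contextvars import ContextVar
from typing import Optional

# Hypothetical stand-in for the request-scoped ContextVar named in this PR.
_current_user: ContextVar[Optional[str]] = ContextVar("_current_user", default=None)


def get_effective_user_id_sketch() -> str:
    """Stand-in: return the request-scoped user, or "default" when unset."""
    return _current_user.get() or "default"


def old_trace_user_id(run_context: dict) -> str:
    """Pre-fix behavior (assumed): the run request's context is ignored."""
    return get_effective_user_id_sketch()


# Two background runs on behalf of different end users. Nothing ever set
# _current_user to the end user, so both land in the same bucket.
old_ids = [old_trace_user_id({"user_id": uid}) for uid in ("alice", "bob")]
# old_ids == ["default", "default"]
```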

What changed

  • Background/gateway run traces are now attributed to the real end user whenever the run request carries context.user_id — instead of collapsing to default.
  • No-auth and browser-authenticated runs are unchanged (they keep resolving the user from the ContextVar).
  • Caller-supplied langfuse_user_id metadata still wins (unchanged).

This changes only the trace-attribute wiring in the run worker; there are no agent, graph, prompt, or output behavior changes. It brings worker.py in line with the sandbox middleware/tools sites, which already resolve the effective user via resolve_runtime_user_id(runtime), in this order: runtime.context["user_id"], then get_effective_user_id(), then default.

This change was inspired by PR #3729.

Surface area

  • Frontend UI — page / component / setting / interaction under frontend/
  • Backend API — endpoint / SSE event / request-response shape under backend/app
  • Agents / LangGraph — agent node, graph wiring, langgraph.json, or prompt change
  • Sandbox — docker/ or sandboxed execution
  • Skills — change under skills/
  • Dependencies — new/upgraded entry in backend/pyproject.toml or frontend/package.json (say what it buys us)
  • Default behavior change — changes existing behavior without the user opting in (default model, default setting, data shape)
  • Docs / tests / CI only — no runtime behavior change

Screenshots / Recording

N/A — backend observability-metadata change only.

Bug fix verification

Bug is encoded as a failing test that goes red before the fix:

  • Test path: backend/tests/test_worker_langfuse_metadata.py::test_run_agent_uses_context_user_id_over_contextvar
  • The existing test_run_agent_falls_back_to_default_user_when_unset was
    repointed to patch get_effective_user_id at its definition module
    (user_context) — the name the resolver actually calls — and still pins the
    default fallback.

Validation

cd backend
# changed files — clean
uv run ruff check packages/harness/deerflow/runtime/runs/worker.py tests/test_worker_langfuse_metadata.py   # All checks passed!
uv run ruff format --check packages/harness/deerflow/runtime/runs/worker.py tests/test_worker_langfuse_metadata.py  # already formatted
# tracing tests
uv run pytest tests/test_worker_langfuse_metadata.py tests/test_client_langfuse_metadata.py tests/test_tracing_metadata.py   # 17 passed

AI assistance

Tool(s) used: Claude Code

How you used it: Investigated the trace-attribution bug and located the root cause, wrote the failing test first (TDD), then applied the one-line resolver swap and repointed the affected fallback test. I reviewed every line.

  • I've read and understand every line of this change and take responsibility for it — it's not unreviewed AI output.

The worker built langfuse_user_id from get_effective_user_id(), which reads
the request-scoped _current_user ContextVar. For runs invoked over an
internal token on behalf of an end user, that ContextVar is never the end
user, so traces recorded langfuse_user_id="default".

Switch to resolve_runtime_user_id(runtime), matching the sandbox
middleware/tools sites: it reads runtime.context["user_id"] (the owner
carried in the run request's context, which survives background-task
boundaries) and falls back to get_effective_user_id() for no-auth / browser
paths. Caller-supplied metadata still wins via inject_langfuse_metadata's
setdefault.
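The "caller-supplied metadata still wins" rule follows directly from `dict.setdefault`, which only writes a key that is absent. A minimal sketch, assuming a simplified signature: `inject_langfuse_metadata_sketch` is a hypothetical stand-in, and the real inject_langfuse_metadata may take different arguments.

```python
from typing import Any, Dict


def inject_langfuse_metadata_sketch(
    metadata: Dict[str, Any], resolved_user_id: str
) -> Dict[str, Any]:
    """Fill langfuse_user_id only when the caller did not provide one."""
    merged = dict(metadata)  # copy so the caller's dict is not mutated
    merged.setdefault("langfuse_user_id", resolved_user_id)
    return merged


# Caller supplied an explicit value: it is preserved.
explicit = inject_langfuse_metadata_sketch({"langfuse_user_id": "carol"}, "alice")

# Caller supplied nothing: the resolved runtime user is used.
implicit = inject_langfuse_metadata_sketch({}, "alice")
```

With this shape, swapping the resolver only changes the fallback value, so existing callers that set langfuse_user_id themselves see no difference.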
@CLAassistant

CLAassistant commented Jun 25, 2026


github-actions bot added labels on Jun 25, 2026: area:backend (Gateway / runtime / core backend under backend/), risk:medium (Medium risk: regular code changes), size/S (PR changes 20-100 lines)