Schema lineage full-screen: rich node cards, cross-DB transitive lineage & a detail pane (parity with system-audit) · Issue #47 · Altinity/altinity-sql-browser · GitHub
Skip to content

Schema lineage full-screen: rich node cards, cross-DB transitive lineage & a detail pane (parity with system-audit) #47

Description

@BorisTyshkevich

Summary

Our sibling project system-audit (the offline ClickHouse audit report) recently
grew a much richer full-screen schema-lineage view, and we'd like the SQL browser's
full-screen schema graph (openSchemaFullscreen in src/ui/explain-graph.js) to reach
parity. Today the SQL browser draws a clean but minimal graph: each object is a single
labelled box (db.table), kind-coloured border, feeds/writes/… edge labels, and a
click runs SHOW CREATE. That's a great base — this issue is about adding the three
things that make the audit version more useful: rich node cards, a node detail pane,
and cross-database (transitive) lineage with the external objects shown as dimmed
context.

This is a feature request + design note, not a bug report. It deliberately reuses
patterns we already shipped and hardened in system-audit, so the hard parts (the
traversal, the dimming UX, the detail layout, the gotchas) are already worked out.

Current behaviour (observed on github.demo.altinity.cloud/sql)

Dropping a database onto the results pane → app.actions.showSchemaGraph
loadSchemaLineagebuildSchemaGraphbuildSchemaSvg/openSchemaFullscreen.
The result:

  • Nodes are plain <rect> + <text> boxes (renderGraphSvg in explain-graph.js):
    just the db.table label, no engine / row count / size / columns.
  • Single-database scope. loadSchemaLineage(ctx, focus) queries
    system.tables WHERE database = focus.db only. So when a Distributed / MV-TO /
    dictionary edge points at a table in another database, buildSchemaGraph does emit
    a node for it (via node(refId, …)), but:
    • it renders identically to in-DB objects (no "this is elsewhere" signal), and
    • its own lineage is never followed (its rows were never loaded), so it's a dead-end
      leaf even if it's actually the hub of a big pipeline in the other DB.
  • Clicking a node inserts SHOW CREATE into the editor — there's no in-graph way to see
    columns/keys/partitions without leaving the graph.

(Screenshots of the current full-screen view vs. the system-audit pop-out are attached
in the issue thread.)

What system-audit's pop-out does (the target)

Repo: Altinity/system-audit, generator/assets/app.js (openDbGraph / openDbWindow
/ cardHtml / popupDetailHtml) and generator/views/schema_graph.py. Relevant commit
on branch feat/schema-graph-collapse-popout.

  1. Cross-DB transitive lineage. Seeds on the clicked DB's objects, then BFS-walks
    lineage edges in both directions across database boundaries, pulling external
    objects in one hop at a time until the chain ends (transitive closure of the connected
    components touching the DB). A Distributed → shard-local, an MV → target in another DB, or a dictGet to a dictionary elsewhere is followed instead of dead-ending. A
    soft node cap (600) stops a pathologically interconnected cluster from freezing the
    view.

  2. External objects rendered as dimmed/dashed context. Every node whose DB ≠ the
    clicked DB is tagged external and drawn at reduced opacity with a dashed border, so it
    reads as "context from elsewhere", not part of this DB. The cross-DB edges are drawn
    normally.

  3. Rich node cards. Each box is an HTML card: engine, row count, compressed size, and
    the top-N columns with type + key-role flags (PK / SK / partition / sampling) and a
    +N more overflow, plus a skip-index line.

  4. Node detail pane. Tapping a node opens a resizable pane with the full column table
    (type, codec, compressed / uncompressed / ratio / marks), partitions
    (parts/rows/bytes), partition/sorting/primary keys, and the formatted DDL.

Proposed work for altinity-sql-browser

1. Cross-DB transitive lineage (the headline)

buildSchemaGraph is already cross-DB-capable on the assembly side (it joins refs with
their own db and keeps any node that participates in an edge). The missing piece is at the
data-loading layer — and this is the one real architectural difference from
system-audit: the audit report ships all DBs' nodes/edges in one offline bundle, so its
BFS is pure in-memory graph traversal. The SQL browser is live, so the traversal has
to happen at the query level:

  • After the first loadSchemaLineage(focus.db) + buildSchemaGraph, collect edge
    endpoints whose database ∉ {already-loaded DBs}. Load lineage rows for those databases,
    merge, rebuild, and repeat until no new external DB appears or a cap is hit.
  • Cap both the number of external databases pulled and the total node count (mirror the
    600-node soft cap; surface a "lineage truncated" note when it trips). This dovetails
    with Schema lineage: cap the EXPLAIN AST fan-out on large schemas #42 (EXPLAIN-AST fan-out cap) — crossing DBs multiplies the per-view AST calls, so
    the cap and the AST-fan-out budget should be shared/aware of each other.
  • Tag external nodes: n.external = (n.db !== focus.db) and render them dimmed + dashed.
    You already have an external kind (non-CH dictionary sources, ext: ids) — keep that
    distinct; "external" here is an orthogonal flag on a normal CH object, so add a
    .eg-node--ext modifier / external data attr rather than overloading the kind.

A lighter v1 (if full transitive expansion is too heavy for huge clusters): expand one
hop
across DB boundaries (load only the directly-referenced external DBs, don't recurse).
That already kills the dead-end-leaf problem for the common Distributed/MV-target cases.

2. Rich node cards

Replace the single-line label with a card showing engine, rows, size, and top-N columns +
key flags. This needs columns for the graph's tables — currently columns load lazily
per-table in the tree (loadColumns). Either batch-load columns for the graph's node set
in loadSchemaLineage, or lazy-load on first render. See system-audit _attach_cards
(schema_graph.py) for the column/flag/skip-index shape and cardHtml for the layout.

Rendering note: see §4 — multi-line cards don't fit the current <rect>+<text> drawer;
they want either SVG <foreignObject> HTML or a node-html-label layer.

3. Node detail pane

On node click, open a side/bottom pane (Esc / ✕ to close) with the full column table
(type, codec, compressed/uncompressed/ratio/marks), partitions, keys, and DDL — instead of
(or alongside) the current SHOW CREATE-to-editor action. system-audit's popupDetailHtml
is a drop-in reference for the layout and the data it pulls. The columns/partitions queries
(system.columns, system.parts) are the same ones build_table_details runs.

4. Rendering approach — SVG <foreignObject> vs cytoscape (note for the implementer)

system-audit uses cytoscape + cytoscape-dagre + node-html-label to get HTML cards.
The SQL browser uses a pure dagre layout + hand-rolled SVG + panzoom.js stack.
Both already use dagre for layout, so this is purely about the node-rendering layer.
My recommendation is to stay on the SVG stack and add <foreignObject> HTML cards,
not adopt cytoscape — see the trade-off table in the issue thread. In short: cytoscape's
only real win is out-of-the-box rich labels, and in system-audit it cost a heavy dependency
plus two interaction bugs we had to work around (its public pan() was a no-op so we mutate
the private viewport; cross-document popup re-use threw on file://). The SQL browser's
existing attachPanZoom already does drag-pan, wheel-pan, ⌘/Ctrl-wheel zoom, and fit
cleanly — keeping it avoids re-introducing those problems.

5. Minor: dotted-database-name id split

node() in schema-graph.js derives db/name by splitting the id at the first dot
(id.slice(0, dot)). A database named my.db (id my.db.tbl) mis-splits to db=my. In
system-audit we hit the same bug and fixed it by carrying the db separately from the id
rather than re-splitting. Low priority (dotted DB names are rare) but it's latent here and
the cross-DB work above makes correct db-of-node attribution load-bearing.

Acceptance criteria

  • Dropping a DB whose objects have cross-DB lineage shows the external objects
    (transitively, or at least one hop), drawn dimmed/dashed, with the cross-DB edges.
  • A DB with only intra-DB lineage looks unchanged.
  • Node cards show engine + rows + size + top-N columns with key flags.
  • Clicking a node opens a detail pane with the full column table (incl. marks),
    partitions, keys, and DDL.
  • A node/DB cap prevents runaway expansion on large clusters (coordinated with Schema lineage: cap the EXPLAIN AST fan-out on large schemas #42),
    with a visible "truncated" note when it trips.
  • Existing pan/zoom/fit/Esc behaviour preserved.

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions