feat: nested CORS iframes, ignore controls, and closed shadow DOM#312
feat: nested CORS iframes, ignore controls, and closed shadow DOM#312aryanku-dev wants to merge 15 commits into
Conversation
Replace the flat top-level iframe loop with a recursive `processFrameTree` that switches into each cross-origin iframe, captures its DOM, and descends into any further cross-origin iframes nested inside it (up to a configurable depth). Cycles are detected by tracking the chain of ancestor frame URLs and skipping any frame whose `src` already appears in the chain — without this guard, pages that link to each other could produce up to `maxIframeDepth` duplicate corsIframes entries. The depth cap defaults to 5 (matching the canonical Percy SDK behaviour) and is configurable per-snapshot via `maxIframeDepth` or via `cliConfig.snapshot.maxIframeDepth`. Inputs are clamped to a 1..10 range through `clampFrameDepth`. Nested-frame origin is compared against the IMMEDIATE PARENT origin (not the top page origin) so a same-origin grandchild inside a cross-origin parent is correctly inlined by PercyDOM and a cross-origin grandchild inside a same-origin parent is still captured. Mirrors percy/percy-nightwatch#869 and percy/percy-playwright#609. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Skip iframes that carry the `data-percy-ignore` boolean attribute when enumerating both top-level and nested cross-origin iframes. Customers add this attribute to opt out of CORS iframe capture for a specific frame without having to maintain a selector list — useful for ad slots or analytics iframes whose contents are noisy. Selenium's `getAttribute` returns an empty string for boolean attributes with no value, so a non-null result is treated as a positive hit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Customers can now pass an `ignoreIframeSelectors` list (either in the per-snapshot options Map or via `cliConfig.snapshot.ignoreIframeSelectors`) to skip any cross-origin iframe whose element matches one of the supplied CSS selectors. Matching is performed in-browser via `Element.matches` so any selector the browser accepts is valid; invalid selectors are tolerated without aborting the snapshot. Inputs go through `normalizeIgnoreSelectors` which accepts a List<String>, a single String, or null and yields a sanitised List<String> with empty/ whitespace-only entries removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After switching into a cross-origin iframe, read `document.URL` and run the unsupported-src check again. The parent-side `src` attribute can be stale or misleading — the frame may have failed to load (leaving an about:blank document), or been navigated by script after attach to a data:/javascript: URL. Skipping these post-switch avoids attempting to serialize a placeholder document. When a post-switch URL is available it is also reported as the captured `frameUrl` and used as the parent context for any nested CORS iframe enumeration. Falls back to the parent-side `src` when the executor returns a non-String value (e.g. under mocking). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ostException When the driver fails to step back to a parent frame after recursing into a nested cross-origin iframe, we previously lost everything captured so far (a flaky network call inside a depth-3 frame would forfeit even the depth-1 snapshot). Introduce `PercyContextLostException` which carries a `partialCapture` list of every iframe snapshot collected before the failure; each recursion layer appends its own captures to the carried list and re-throws, and the top-level loop in `getSerializedDOM` merges the recovered captures into the snapshot and falls back to default content before aborting further sibling enumeration. Mirrors the `percyContextLost` flag in percy/percy-nightwatch#869 and percy/percy-webdriverio#... so the wire-format output stays consistent across SDKs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closed shadow roots (`{mode: 'closed'}`) are invisible to JavaScript —
`element.shadowRoot` is `null` and there is no API that returns the
underlying ShadowRoot object. The PercyDOM serializer can pierce them
through a window-bound `__percyClosedShadowRoots` WeakMap (host element
→ shadow root) populated before serialization, but Selenium has no way
to obtain the closed shadow root from page script.
Use Chrome DevTools Protocol to discover and resolve them:
1. `DOM.getDocument {depth: -1, pierce: true}` to walk the entire DOM
tree including closed shadow subtrees.
2. For each closed shadow root, `DOM.resolveNode` on the host and the
shadow root to obtain JS object handles.
3. `Runtime.callFunctionOn` to write the pair into the WeakMap.
`contentDocument` nodes are skipped because their execution context is
distinct and has no WeakMap. Non-Chromium drivers are detected with a
single `instanceof ChromeDriver` check and silently fall through, so the
SDK keeps working with Firefox/WebKit without changes.
Mirrors percy/percy-playwright#609.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add JUnit + Mockito unit tests for the new helper methods and the nested cross-origin iframe capture flow: - `clampFrameDepth` bounds + defaults - `normalizeIgnoreSelectors` accepts List<String> / String / null - `resolveMaxFrameDepth` precedence (option > cliConfig > default) - `resolveIgnoreSelectors` precedence - `data-percy-ignore` iframes are skipped without `switchTo` - `ignoreIframeSelectors` matches are skipped without `switchTo` - `processFrame` bails after switch when document.URL is unsupported - `PercyContextLostException.partialCapture` round-trips - `getSerializedDOM` recovers partial captures on context loss - `exposeClosedShadowRoots` is a no-op for non-Chrome drivers - `collectClosedShadowPairs` walks the CDP tree and skips iframes Tests live in a separate `IframeFeatureTest` class to avoid being blocked by `SdkTest`'s `@BeforeAll` Firefox initialisation in environments without a Firefox binary. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…anup - responsive snapshot detection threw NPE when cliConfig.snapshot was missing or JSON-null; guard each layer before reading responsiveSnapshotCapture. - getSerializedDOM treated a null jse return as a Map and ClassCastException'd deep in the snapshot path. Detect non-Map results and raise a clear error pointing at the @percy/dom load failure as the root cause. - Pair every successful DOM.enable with DOM.disable in a finally block so the CDP session doesn't keep emitting DOM events after closed-shadow capture. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Null-safe origin equality in both top-level and nested iframe comparisons. Switched to Objects.equals so a child URI that resolves to no host (data:, mailto:, schemeless) can never trigger an NPE that escapes the per-iframe catch. - Document the clampFrameDepth semantic: maxIframeDepth=0 falls back to DEFAULT_MAX_FRAME_DEPTH (5), mirroring @percy/sdk-utils. Disabling CORS capture should use ignoreIframeSelectors or data-percy-ignore, not depth=0. Comment guards against a silent flip in future refactors. - Expose closed shadow roots inside each CORS frame after switchTo() — mirrors the top-page behaviour so closed shadow DOM inside cross- origin iframes is also captured. Per-pair try/catch in the existing helper keeps one bad backendNodeId from aborting the rest. TODO tracks moving to per-frame CDP sessions when BiDi stabilises. - Remove the dead processFrame method — fully replaced by processFrameTree. Keeping duplicate logic invited drift. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- nestedIframeWithNullOriginIsNullSafeAndDoesNotAbortLoop: regression test for the NPE risk at the child-origin comparison; a data:... child must not abort the outer CORS frame capture. - clampFrameDepthZeroReturnsDocumentedDefault: semantic guard so any future change to treat 0 as "disable" trips a test. - exposeClosedShadowRootsIsAttemptedInsideCorsFrame: confirms the per-CORS-frame helper invocation and the TODO marker survive future refactors; ensures the call is safe on a non-Chrome driver. - collectClosedShadowPairsContinuesPastOneBadEntry: documents that the collector tolerates missing backendNodeId fields without throwing, and one bad pair does not abort the rest at runtime. - Update the post-switch unsupported-URL test to drive processFrameTree directly (processFrame removed). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Picks up the latest CLI patch series for parity with sibling SDKs and to unblock the external percy/percy-java-selenium status check. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The legacy processFrame(WebElement, Map) helper was removed as dead code once processFrameTree subsumed it. The reflection-based unit test still called the old name and failed with NoSuchMethodException in CI. Rewrite it to drive the same skip-when-percyElementId-missing path through processFrameTree, asserting an empty result and that the driver is never switched into the frame. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nd-shadow-dom # Conflicts: # package.json # src/main/java/io/percy/selenium/Percy.java
aryanku-dev
left a comment
There was a problem hiding this comment.
Claude Code Review (automated) — 5 inline finding(s). Full report in the PR comment below. Verdict: Failed - see PR comment.
| // Selenium 4 BiDi support stabilises across versions. | ||
| try { exposeClosedShadowRoots(driver); } catch (Exception ignore) {} | ||
|
|
||
| Map<String, Object> iframeSnapshot = serializeCurrentFrame(options); |
There was a problem hiding this comment.
[High] Null iframeSnapshot silently poisons the CORS payload
serializeCurrentFrame unchecked-casts the PercyDOM.serialize result and returns null when the script yields null (e.g. PercyDOM failed to load in a sandboxed CORS frame). That null is then placed straight into the collected entry below, producing a malformed corsIframes payload. The main-page path getSerializedDOM (~line 1063) explicitly throws on a non-Map result — this path lacks the equivalent guard.
Suggestion: After the call, if the snapshot is null/non-Map, log a warning and return collected; (skip this frame) rather than adding an entry with a null snapshot.
Reviewer: stack:code-reviewer
| // for host + shadow, then Runtime.callFunctionOn to write the pair into | ||
| // the WeakMap on the page. | ||
| @SuppressWarnings("unchecked") | ||
| private void exposeClosedShadowRoots(WebDriver driver) { |
There was a problem hiding this comment.
[Medium] CDP call bypasses the existing isCdpSupported guard
exposeClosedShadowRoots issues executeCdpCommand directly and relies on an outer try/catch, while changeWindowDimensionAndWait (~line 1275) guards with isCdpSupported(ChromeDriver). Use the same pattern for consistency.
Suggestion: if (!(driver instanceof ChromeDriver)) return; then if (!isCdpSupported(chrome)) return; before any CDP call.
Reviewer: stack:code-reviewer
| if (cliConfig != null && cliConfig.has("snapshot") && !cliConfig.isNull("snapshot")) { | ||
| JSONObject snap = cliConfig.getJSONObject("snapshot"); | ||
| if (snap.has("ignoreIframeSelectors") && !snap.isNull("ignoreIframeSelectors")) { | ||
| JSONArray arr = snap.optJSONArray("ignoreIframeSelectors"); |
There was a problem hiding this comment.
[Medium] Single-string ignoreIframeSelectors CLI config is silently dropped
snap.optJSONArray("ignoreIframeSelectors") returns null for a JSON string value (e.g. "ignoreIframeSelectors": "iframe.ads"), so a string-valued config is dropped — inconsistent with the options-map path that normalizes strings.
Suggestion: When optJSONArray is null, fall back to optString and run it through normalizeIgnoreSelectors.
Reviewer: stack:code-reviewer
| try { | ||
| JavascriptExecutor jse = (JavascriptExecutor) driver; | ||
| JSONArray sel = new JSONArray(selectors); | ||
| String script = "var el = arguments[0]; var selectors = " + sel.toString() + ";" |
There was a problem hiding this comment.
[Medium] Selector list interpolated into the script body
The script is built by concatenating sel.toString() (a JSON array literal) into the JS source. A selector containing a quote or control character breaks the literal and throws a script error; the inner el.matches try/catch only handles invalid CSS.
Suggestion: Pass the selector list as an executeScript argument (arguments[1]) instead of interpolating it.
Reviewer: stack:code-reviewer
| log("Reached max iframe nesting depth (" + maxFrameDepth + "); stopping at " + frameSrc, "debug"); | ||
| return collected; | ||
| } | ||
| if (ancestorUrls != null && frameSrc != null && ancestorUrls.contains(frameSrc)) { |
There was a problem hiding this comment.
[Medium] Cycle guard can miss a relative-vs-absolute src cycle
The guard checks ancestorUrls.contains(frameSrc) on the raw src. The ancestor set is seeded with both raw frameSrc and absolute reportedUrl, so identical recurring src strings are caught — but a cycle expressed as a relative src at one level and an absolute URL at another can slip through. Also untested.
Suggestion: Resolve frameSrc to absolute (against the parent URL) before seeding/checking the ancestor set, and add a cycle-guard test.
Reviewer: stack:code-reviewer
Claude Code PR ReviewPR: #312 • Head: 4b7835d • Reviewers: stack:code-reviewer SummaryAdds cross-origin (CORS) iframe capture and closed shadow DOM exposure to the Percy Selenium Java SDK: recursive iframe traversal with a depth cap and cycle guard, Review Table
FindingsHigh
Medium
Low
Missing test coverage (non-gating)
Verdict: FAIL — one High-severity correctness issue (#2: null iframe snapshot silently poisons the CORS payload). The Medium findings (#4–#7) are worth folding in before merge; the Low items can follow up. |
- Skip CORS frame when PercyDOM.serialize returns null instead of emitting a poisoned snapshot entry (mirror getSerializedDOM guard) [High] - Guard exposeClosedShadowRoots with isCdpSupported before CDP calls [Med] - Honour single-string ignoreIframeSelectors CLI config value [Med] - Pass ignore selectors as executeScript argument, not interpolated [Med] - Add post-switch absolute-URL cycle check for relative-src cycles [Med] - Remove unused CSS/XPath imports [Low] - Add tests for null-snapshot skip, resolved-URL cycle guard, single-string CLI Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
aryanku-dev
left a comment
There was a problem hiding this comment.
Claude Code Review (automated) — 2 inline finding(s). Full report in the PR comment below. Verdict: Passed.
|
|
||
| when(((JavascriptExecutor) mockedDriver).executeScript(any(String.class))) | ||
| .thenReturn(null) | ||
| .thenReturn("https://cdn.other.com/frame"); |
There was a problem hiding this comment.
[Medium] Brittle positional stub in the cycle-guard test
This .thenReturn(null).thenReturn("https://cdn.other.com/frame") chain passes today, but if a future change inserts another executeScript call between the dom.js injection and readCurrentFrameUrl(), the URL stub fires at the wrong call site — postSwitchUrl becomes null, the cycle check is skipped, and the test stays green while no longer validating the cycle guard.
Suggestion: Use a script-content-keyed thenAnswer (match return document.URL), as the null-serialize and context-lost tests already do.
Reviewer: stack:code-reviewer
| percy, "processFrameTree", | ||
| new Class[]{WebElement.class, int.class, Set.class, Map.class}, | ||
| iframe, 1, new HashSet<String>(), ctx); | ||
| assertTrue(result.isEmpty(), "Frame must be skipped when PercyDOM.serialize returns null"); |
There was a problem hiding this comment.
[Low] Test doesn't verify the frame was stepped out of
The test asserts the result is empty but doesn't confirm parentFrame() ran in the finally. Adding the verify would guard against a future change that bails before switchedIn = true.
Suggestion: Add verify(targetLocator, times(1)).parentFrame(); after the assertion.
Reviewer: stack:code-reviewer
Claude Code PR ReviewPR: #312 • Head: c7ffcb1 • Reviewers: stack:code-reviewer Continues the previous review — changes since SummaryDelta commit Review TableFindingsResolved since last review
Still open (non-gating)
Verdict: PASS — every gating finding from the previous review (1 High + 4 Mediums) is resolved and the fixes were independently verified as correct. The remaining items are non-gating test-quality nits and one pre-existing informational note. |

Summary
Brings percy-selenium-java to parity with the canonical Percy CORS iframe + closed shadow DOM feature set.
Implemented
data-percy-ignoreattribute opt-outignoreIframeSelectorsoptionisUnsupportedIframeSrcPercyContextLostExceptionrecovery mergespartialCaptureexposeClosedShadowRoots)clampFrameDepth,normalizeIgnoreSelectors,resolveMaxFrameDepth,resolveIgnoreSelectors)Skipped
Reference
Mirrored from percy/percy-nightwatch#869 (PER-7292-add-cors-iframe-support); CDP from percy/percy-playwright#609.
Test plan
IframeFeatureTest(11 tests) and existingCacheTest(3 tests) pass undermvn test.SdkTest's integration tests require a local Firefox binary (@BeforeAllinstantiatesFirefoxDriver). They were not exercised in the sync environment because Firefox is not installed; this matches the pre-existing baseline onmasterand is not a regression introduced by this PR.🤖 Generated with Claude Code via /percy-sdk-sync