{{ message }}
[SPARK-56032][SQL][FOLLOWUP] Defer throw-capable notNullPreds past guard otherPreds in FilterExec CSE codegen#55506
Open
cloud-fan wants to merge 9 commits intoapache:masterfrom
Open
Conversation
…ard otherPreds in FilterExec CSE codegen ### What changes were proposed in this pull request? In `FilterExec.doConsume`'s CSE branch, emit each `IsNotNull(ref)` from `notNullPreds` just before the first `otherPred` that references `ref`, and defer any remaining `notNullPreds` to after all `otherPreds`. This mirrors the non-CSE path's ordering. Previously the CSE branch emitted the full block of `notNullPreds` up front, before any `otherPred`. ### Why are the changes needed? `InferFiltersFromConstraints` can add an `IsNotNull(expr)` whose inner expression can throw on valid non-null inputs -- e.g. `IsNotNull(cast(s as int))` derived from a downstream join. That `IsNotNull` lands in `notNullPreds`. The CSE branch was emitting it up front, so on any row where an earlier-ordered guard `otherPred` (e.g. `kind = 'numeric'`) would have short-circuited, the upfront `IsNotNull` still evaluated `cast(s as int)` and threw `CAST_INVALID_INPUT`. The non-CSE branch interleaves by-reference and defers the leftovers, so the same query succeeds there. Repro (new test): a CTE chain where an inner guard filter bounds the input to a type that the outer projection then casts; `CombineFilters` + `PushPredicateThroughProject` + `InferFiltersFromConstraints` fold everything into a single `Filter(isnotnull(kind) AND kind='numeric' AND isnotnull(cast(s as int)))`. With CSE enabled the cast throws on non-numeric rows; with CSE disabled it succeeds. Same class of bug as (a) -- short-circuit preservation -- but across the `notNullPreds`/`otherPreds` boundary instead of between adjacent `otherPreds`. ### Does this PR introduce _any_ user-facing change? No user-facing API change. Fixes a regression where `FilterExec` CSE codegen could throw on rows an earlier-ordered guard predicate would have rejected. ### How was this patch tested? Added `SPARK-56032: FilterExec CSE emits all notNullPreds before guard otherPred` to `WholeStageCodegenSuite`. Without the fix it throws `SparkNumberFormatException`; with the fix it matches the non-CSE result. ### Was this patch authored or co-authored using generative AI tooling? Generated-by: Claude Opus 4.7
yaooqinn
reviewed
Apr 23, 2026
Comment on lines
+1034
to
+1048
Member
There was a problem hiding this comment.
Consider reducing the comments here, e.g. the query example here and in tests are basically identical?
Co-authored-by: Isaac
- Fix missing verb in "The fix for (b)" comment block. - Reframe the shared-ref test around invariant (b) and add a plan-shape assertion requiring both cast(s as int) conjuncts to survive, so optimizer drift cannot silently bypass the intended shape. Co-authored-by: Isaac
…se CSE invariant comment - WholeStageCodegenSuite: tighten Test 1 optimizer-drift guard to require both IsNotNull(Cast(...)) and a non-IsNotNull guard conjunct in the *same* FilterExec condition; otherwise a future optimizer split into two FilterExecs would pass the old assertion without the bug shape. Adds local conjuncts() helper rather than pulling in PredicateHelper. - WholeStageCodegenSuite: add one-line rationale for ADAPTIVE_EXECUTION_ENABLED=false in both new tests (plan-shape visibility), so the intent isn't lost. - basicPhysicalOperators.scala: condense the three-invariant preamble -- (a) and (c) now lean on prior PRs (apache#55471) and only state the new fact; (b) keeps the expanded prose since it is the point of this PR.
…ver block and defensive null gate
…commentary Align generatePredicateCode's notNullPreds lookup with the CSE branch by using a total pattern match instead of an unchecked asInstanceOf[IsNotNull], so the two paths stay structurally in lockstep if notNullPreds is ever widened beyond IsNotNull. Behavior is unchanged today. Also condense redundant commentary across the two SPARK-56032 tests: pull the AQE-off rationale up once and cross-reference it from the sibling test, and tighten the optimizer-drift guard note -- addressing the reviewer's feedback on redundant commentary. Co-authored-by: Isaac
Contributor
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

What changes were proposed in this pull request?
Follow-up to #54862 and #55471.
In
FilterExec.doConsume's CSE branch, emit eachIsNotNull(ref)fromnotNullPredsjust before the firstotherPredthat referencesref, and defer any remainingnotNullPredsto after allotherPreds. This mirrors the non-CSE path's ordering. Previously (since #54862, refined in #55471) the CSE branch emitted the full block ofnotNullPredsup front, before anyotherPred.Why are the changes needed?
InferFiltersFromConstraintscan add anIsNotNull(expr)whose inner expression can throw on valid non-null inputs -- e.g.IsNotNull(cast(s as int))derived from a downstream join that expects the cast to be non-null. ThatIsNotNulllands innotNullPreds. The CSE branch was emitting it up front, so on any row where an earlier-ordered guardotherPred(e.g.kind = 'numeric') would have short-circuited, the upfrontIsNotNullstill evaluatedcast(s as int)and threwCAST_INVALID_INPUT. The non-CSE branch interleaves by-reference and defers leftovers, so the same query succeeds there.Repro (added as a new test): a CTE chain where an inner guard filter bounds the input to a type that the outer projection then casts;
CombineFilters+PushPredicateThroughProject+InferFiltersFromConstraintsfold everything into a singleWith CSE enabled the cast throws on non-numeric rows; with CSE disabled it succeeds on the same physical plan. Same class of bug as (a) in #55471 -- short-circuit preservation -- but across the
notNullPreds/otherPredsboundary rather than between adjacentotherPreds.Does this PR introduce any user-facing change?
No user-facing API change. Fixes a regression introduced in #54862 where
FilterExecCSE codegen could throw on rows an earlier-ordered guard predicate would have rejected.How was this patch tested?
Added `SPARK-56032: FilterExec CSE emits all notNullPreds before guard otherPred` to `WholeStageCodegenSuite`. Without the fix it throws `SparkNumberFormatException`; with the fix it matches the non-CSE result. All existing `WholeStageCodegenSuite` and `DataFrameSuite` tests continue to pass.
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.7