[CALCITE-5132] Scalar IN subquery returns UNKNOWN instead of FALSE when key is partially NULL#4814
Conversation
|
This PR is not yet ready for review. I need to continue checking. |
|
|
I think it is ready for review. |
| // Now the left join | ||
| builder.join(JoinRelType.LEFT, builder.and(conditions), variablesSet); | ||
|
|
||
| // for multi-column IN with nullable RHS columns, the global ck < c check is |
There was a problem hiding this comment.
The comment here needs some context: the actual comparison ck < c is much lower.
Perhaps you can first explain how the expansion is handled for scalar values.
| // (empno, deptno) IN (select empno, deptno from ...) | ||
| // | ||
| // where one RHS row is (7521, null): | ||
| // empno=7521: AND(TRUE, UNKNOWN) = UNKNOWN → result should be null |
There was a problem hiding this comment.
I also don't understand what this AND is
core/src/main/java/org/apache/calcite/rel/rules/SubQueryRemoveRule.java
Outdated
Show resolved
Hide resolved
| // when ct.ck < ct.c then null -- some RHS value is NULL | ||
| // this is correct: if no direct match was found (dt.i IS NULL) and any | ||
| // RHS value is NULL, the answer for every unmatched LHS key is UNKNOWN, | ||
| // because we cannot rule out that the LHS key equals the NULL. |
There was a problem hiding this comment.
equals the NULL -> is NULL.
| // (empno, deptno) IN (select empno, deptno from ...) | ||
| // where one RHS row is (7521, null). For each LHS row, the dt join condition | ||
| // "empno = rhs.empno AND deptno = rhs.deptno" becomes "empno = 7521 AND deptno = null": | ||
| // for LHS empno=7521: (7521 = 7521) AND (7521 = null) = TRUE AND UNKNOWN = UNKNOWN |
There was a problem hiding this comment.
This is still confusing because = is used in two ways, maybe you can replace some = to "which reduces to"
| if (needsNullRowJoin) { | ||
| // nr.i IS NOT NULL: LHS key matched a null-row in RHS → UNKNOWN. | ||
| operands.add(builder.isNotNull(builder.field(nrAlias, "i")), b); | ||
| } else if (needsNullSafety) { |
There was a problem hiding this comment.
is the original comment valuable? If so, maybe you can move it here
Also, after this PR is merged, the word "traditional" will make no sense anymore - people who will read the code won't know that it was here before the other code was added. So I would just remove it.
| // Now the left join | ||
| builder.join(JoinRelType.LEFT, builder.and(conditions), variablesSet); | ||
|
|
||
| // for single-column IN (e.g. deptno IN (select deptno ...)), the CASE |
There was a problem hiding this comment.
I think this would be easier to read if you showed first an example of the LEFT JOIN that is being produced and then followed with these comments.
|
I discovered that the previous implementation had an extra join, and this commit removed that join. An additional SQL description has been provided. |
|
Yes, some plans had become larger, but now I see they all stay the same size or shrink. |
| // | ||
| // becomes | ||
| // | ||
| // select e.empno, |
…en key is partially NULL
|



See CALCITE-5132