Skip to content

[CALCITE-5132] Scalar IN subquery returns UNKNOWN instead of FALSE when key is partially NULL#4814

Merged
xiedeyantu merged 1 commit intoapache:mainfrom
xiedeyantu:CALCITE-5132
Mar 7, 2026
Merged

[CALCITE-5132] Scalar IN subquery returns UNKNOWN instead of FALSE when key is partially NULL#4814
xiedeyantu merged 1 commit intoapache:mainfrom
xiedeyantu:CALCITE-5132

Conversation

@xiedeyantu
Copy link
Member

@xiedeyantu xiedeyantu marked this pull request as draft March 1, 2026 15:06
@xiedeyantu
Copy link
Member Author

This PR is not yet ready for review. I need to continue checking.

@sonarqubecloud
Copy link

sonarqubecloud bot commented Mar 1, 2026

@xiedeyantu xiedeyantu marked this pull request as ready for review March 2, 2026 14:51
@xiedeyantu
Copy link
Member Author

I think it is ready for review.

// Now the left join
builder.join(JoinRelType.LEFT, builder.and(conditions), variablesSet);

// for multi-column IN with nullable RHS columns, the global ck < c check is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here needs some context: the actual comparison ck < c is much lower.
Perhaps you can first explain how the expansion is handled for scalar values.

// (empno, deptno) IN (select empno, deptno from ...)
//
// where one RHS row is (7521, null):
// empno=7521: AND(TRUE, UNKNOWN) = UNKNOWN → result should be null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't understand what this AND is

// when ct.ck < ct.c then null -- some RHS value is NULL
// this is correct: if no direct match was found (dt.i IS NULL) and any
// RHS value is NULL, the answer for every unmatched LHS key is UNKNOWN,
// because we cannot rule out that the LHS key equals the NULL.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

equals the NULL -> is NULL.

// (empno, deptno) IN (select empno, deptno from ...)
// where one RHS row is (7521, null). For each LHS row, the dt join condition
// "empno = rhs.empno AND deptno = rhs.deptno" becomes "empno = 7521 AND deptno = null":
// for LHS empno=7521: (7521 = 7521) AND (7521 = null) = TRUE AND UNKNOWN = UNKNOWN
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still confusing because = is used in two ways, maybe you can replace some = to "which reduces to"

if (needsNullRowJoin) {
// nr.i IS NOT NULL: LHS key matched a null-row in RHS → UNKNOWN.
operands.add(builder.isNotNull(builder.field(nrAlias, "i")), b);
} else if (needsNullSafety) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the original comment valuable? If so, maybe you can move it here
Also, after this PR is merged, the word "traditional" will make no sense anymore - people who will read the code won't know that it was here before the other code was added. So I would just remove it.

// Now the left join
builder.join(JoinRelType.LEFT, builder.and(conditions), variablesSet);

// for single-column IN (e.g. deptno IN (select deptno ...)), the CASE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be easier to read if you showed first an example of the LEFT JOIN that is being produced and then followed with these comments.

@xiedeyantu
Copy link
Member Author

I discovered that the previous implementation had an extra join, and this commit removed that join. An additional SQL description has been provided.

@mihaibudiu
Copy link
Contributor

Yes, some plans had become larger, but now I see they all stay the same size or shrink.
Very good.

//
// becomes
//
// select e.empno,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is very useful

@xiedeyantu xiedeyantu added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Mar 4, 2026
@sonarqubecloud
Copy link

sonarqubecloud bot commented Mar 4, 2026

@xiedeyantu xiedeyantu merged commit b04f744 into apache:main Mar 7, 2026
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

LGTM-will-merge-soon Overall PR looks OK. Only minor things left.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants