Pushdown distinct support #33063

@Vladilen

Description

When dictionary encoding is enabled and the query contains DISTINCT, for example in a CL query such as:

SELECT distinct JSON_VALUE(json_payload, "$.\"hostname\"")
FROM `some table`
WHERE (`timestamp`) >= DateTime::FromSeconds(1765881533) AND (`timestamp`) < DateTime::FromSeconds(1765966133)
LIMIT 100

The distinct values can be taken directly from the dictionary, so there is no need to read all of the column's data and the query can be sped up.
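A minimal sketch of the idea, in Python and not based on the actual YDB code: a dictionary-encoded column stores each unique value once plus a per-row index, so DISTINCT can be answered from the dictionary alone. The names `DictEncodedColumn`, `encode`, `distinct_full_scan`, and `distinct_from_dictionary` are hypothetical, and the sketch assumes no row filter is applied (with a WHERE clause, some dictionary entries may belong only to rows that were filtered out).

```python
from dataclasses import dataclass


@dataclass
class DictEncodedColumn:
    """Hypothetical dictionary-encoded column (illustration only)."""
    dictionary: list  # each unique value, stored once
    indices: list     # per-row position into `dictionary`


def encode(values):
    """Build a dictionary-encoded column from raw row values."""
    dictionary, positions, indices = [], {}, []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return DictEncodedColumn(dictionary, indices)


def distinct_full_scan(column, limit):
    """Baseline: decode rows one by one and deduplicate them."""
    seen, result = set(), []
    for index in column.indices:
        if len(result) >= limit:
            break
        value = column.dictionary[index]
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


def distinct_from_dictionary(column, limit):
    """Pushdown: the dictionary already holds the distinct values."""
    return column.dictionary[:limit]


hostnames = ["host-a", "host-b", "host-a", "host-c", "host-b"]
column = encode(hostnames)
print(distinct_from_dictionary(column, limit=100))
```

In this sketch the pushdown path touches only the dictionary, whose size is the number of unique values, while the baseline has to walk the rows.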

DoD

  • Queries that use DISTINCT on dictionary-encoded columns run at least 2 times faster
  • Feature is merged in stable-26-2.

Changelog entry

Added distinct pushdown support

Documentation

Dependencies

Compatibility

The feature should be enabled one version after it is released, so that a rollback remains possible (otherwise queries may fail after a rollback).

Test plan

☐ Create issue with test plan

Stress tests

☐ Create issue with test plan

Implementation Plan

Milestone 1. 06.02.2026
☐ SSA program
☐ Build graph
☐ KQP pushdown

Milestone 2. 13.02.2026
☐ Distinct Aggregation Operator
☐ Sync Point distinct limit

Promotion
