Description
When dictionary encoding is enabled and the query contains DISTINCT, e.g. this CL query:
SELECT distinct JSON_VALUE(json_payload, "$.\"hostname\"")
FROM `some table`
WHERE (`timestamp`) >= DateTime::FromSeconds(1765881533) AND (`timestamp`) < DateTime::FromSeconds(1765966133)
LIMIT 100
the distinct values can be taken from the dictionary, so there is no need to read all the data from the column and the query can be sped up.
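The idea can be illustrated with a minimal Python sketch. It is not the actual implementation: `dictionary_encode`, `distinct_from_dictionary`, and the `selected_rows` parameter are hypothetical names introduced only for this example. It assumes a column chunk stored as a dictionary of unique values plus one integer code per row. Without a filter, the dictionary itself is the DISTINCT result. With a filter (such as the timestamp range above), only the integer codes of the matching rows are scanned, and each value is decoded at most once.

```python
from typing import Hashable, Iterable, List, Optional, Sequence, Tuple


def dictionary_encode(values: Sequence[Hashable]) -> Tuple[List[Hashable], List[int]]:
    """Hypothetical encoder: store each distinct value once, plus one code per row."""
    dictionary: List[Hashable] = []
    positions = {}  # value -> index in dictionary
    codes: List[int] = []
    for value in values:
        code = positions.get(value)
        if code is None:
            code = len(dictionary)
            positions[value] = code
            dictionary.append(value)
        codes.append(code)
    return dictionary, codes


def distinct_from_dictionary(
    dictionary: Sequence[Hashable],
    codes: Sequence[int],
    selected_rows: Optional[Iterable[int]] = None,
    limit: Optional[int] = None,
) -> List[Hashable]:
    """Return distinct values without decoding the column row by row."""
    if limit is not None and limit <= 0:
        return []
    if selected_rows is None:
        # No filter: with the encoder above, every dictionary entry is used
        # by at least one row, so the dictionary is the DISTINCT result.
        result = list(dictionary)
        return result if limit is None else result[:limit]
    # With a filter, the dictionary may hold values that no selected row uses,
    # so scan the integer codes of the selected rows and decode each code once.
    seen = set()
    result = []
    for row in selected_rows:
        code = codes[row]
        if code not in seen:
            seen.add(code)
            result.append(dictionary[code])
            if limit is not None and len(result) >= limit:
                break  # LIMIT reached, stop early
    return result


hostnames = ["a", "b", "a", "c", "b", "a"]
dictionary, codes = dictionary_encode(hostnames)
all_distinct = distinct_from_dictionary(dictionary, codes)
filtered_distinct = distinct_from_dictionary(dictionary, codes, selected_rows=[1, 2, 4])
limited_distinct = distinct_from_dictionary(dictionary, codes, selected_rows=[1, 2, 4], limit=1)
```

The early exit on `limit` mirrors the `LIMIT 100` in the query: once enough distinct values are found, the remaining rows do not need to be read.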
DoD
- Queries with DISTINCT over dictionary-encoded columns run at least 2 times faster
- Feature is merged in stable-26-2.
Changelog entry
Added distinct pushdown support
Documentation
Dependencies
Compatibility
The feature should be enabled one version after it ships, so that rollback remains possible (otherwise queries may fail after a rollback)
Test plan
☐ Create issue with test plan
Stress tests
☐ Create issue with test plan
Implementation Plan
Milestone 1. 06.02.2026
☐ SSA program
☐ Build graph
☐ KQP pushdown
Milestone 2. 13.02.2026
☐ Distinct Aggregation Operator
☐ Sync Point distinct limit