optimizer: Prevent constant folding of DynamicQuantizeLinear#2865
justinchuby merged 2 commits into main
Conversation
Co-authored-by: justinchuby <11205048+justinchuby@users.noreply.github.com> Agent-Logs-Url: https://github.com/microsoft/onnxscript/sessions/8abd1582-6287-4f15-950b-c24ef6aab9fe
@copilot Strange: `DequantizeLinear` was already in the list, so why was it still folded in the original issue?
My change adds `DynamicQuantizeLinear` to the blacklist alongside the existing `QuantizeLinear` and `DequantizeLinear` entries.
Codecov Report ✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
##             main    #2865      +/-   ##
==========================================
+ Coverage   71.86%   71.87%   +0.01%
==========================================
  Files         239      239
  Lines       29138    29150      +12
  Branches     2875     2875
==========================================
+ Hits        20941    20953      +12
  Misses       7219     7219
  Partials      978      978

View full report in Codecov by Sentry.
The constant folding pass was eliminating `DequantizeLinear` nodes that operated on constant weight tensors during `optimize()`, collapsing the quantization structure into a plain `Conv` and losing quantization semantics in QAT-exported models.

Changes

- `optimizer/_constant_folding.py`: Add `DynamicQuantizeLinear` to `DEFAULT_CONSTANT_FOLD_BLACKLIST` alongside the existing `QuantizeLinear` and `DequantizeLinear` entries; reorder alphabetically for consistency.
- `optimizer/_constant_folding_test.py`: Add tests verifying `QuantizeLinear` and `DequantizeLinear` are not folded when all inputs are constant initializers.

Original prompt
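The intended behavior can be sketched in plain Python (the names `Node` and `should_fold` are illustrative, not the onnxscript API): an op whose type appears in the blacklist is never folded, even when every input is a constant initializer.

```python
# Minimal sketch of blacklist-gated constant folding.
# `Node` and `should_fold` are hypothetical stand-ins for the real pass
# in optimizer/_constant_folding.py.
from dataclasses import dataclass

# Mirrors DEFAULT_CONSTANT_FOLD_BLACKLIST after this PR: quantization ops
# are preserved even when all their inputs are constant.
DEFAULT_CONSTANT_FOLD_BLACKLIST = frozenset(
    {"DequantizeLinear", "DynamicQuantizeLinear", "QuantizeLinear"}
)

@dataclass
class Node:
    op_type: str
    inputs_are_constant: bool

def should_fold(node: Node, blacklist=DEFAULT_CONSTANT_FOLD_BLACKLIST) -> bool:
    """Fold only constant-input nodes whose op type is not blacklisted."""
    if node.op_type in blacklist:
        return False
    return node.inputs_are_constant

# A DequantizeLinear over a constant weight must survive optimization:
assert not should_fold(Node("DequantizeLinear", True))
# An ordinary constant computation can still be folded:
assert should_fold(Node("Add", True))
```

This matches the tests the PR adds: a quantization node with all-constant inputs must remain in the graph after `optimize()`.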
This section details the original issue to resolve.
<issue_title>[ONNX] Optimize should not fold DequantizeLinear</issue_title>
<issue_description>### 🐛 Describe the bug
After the QAT model undergoes the onnx_program.optimize() process, quantization nodes are lost. (A figure in the original issue shows the normal export on the left and the abnormal export graph on the right.)
This bug occurred in `torch/onnx/_internal/exporter/_onnx_program.py`, which internally calls the `optimize_ir` function in `onnxscript/optimizer/_optimizer.py`. The default value of `input_size_limit` is 512; nodes with an input size smaller than this value will be collapsed.

⭐ Please expose the `optimization` parameter in `torch/onnx/_internal/exporter/_onnx_program.py`; otherwise I can only work around this by modifying the onnxscript source.

The smallest reproducible example:
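The reporter's repro script itself is not preserved in this excerpt. Separately, the `input_size_limit` gate described above can be sketched in plain Python (the function name is illustrative, not the actual `optimize_ir` code): a constant input qualifies for folding only when its element count is under the limit, which is why small quantization constants such as scales and zero-points were swept up by default.

```python
# Simplified sketch of the input_size_limit gate (default 512 per the
# issue description); hypothetical helper, not the onnxscript internals.
def eligible_for_folding(input_sizes, input_size_limit=512):
    """A node's constant inputs qualify only if each is below the limit."""
    return all(size < input_size_limit for size in input_sizes)

# Per-channel scale/zero-point tensors easily fall under 512 elements,
# so without a blacklist entry such nodes would be folded:
assert eligible_for_folding([64, 64, 1])
# A large weight tensor exceeds the limit and is left alone:
assert not eligible_for_folding([3 * 3 * 64 * 128])
```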
Versions
Versions of relevant libraries:
[pip3] executorch==0.5.0
[pip3] numpy==1.23.5
[pip3] nvidia-cublas-cu11==11.11.3.6
[pip3] nvidia-cuda-cupti-cu11==11.8.87
[pip3] nvidia-cuda-nvrtc-cu11==11.8.89
[pip3] nvidia-cuda-runtime-cu11==11.8.89
[pip3] nvidia-cudnn-cu11==9.1.0.70
[pip3] nvidia-cufft-cu11==10.9.0.58
[pip3] nvidia-curand-cu11==10.3.0.86
[pip3] nvidia-cusolver-cu11==11.4.1.48
[pip3] nvidia-cusparse-cu11==11.7.5.86
[pip3] nvidia-nccl-cu11==2.21.5
[pip3] nvidia-nvtx-cu11==11.8.86
[pip3] onnx==1.17.0
[pip3] onnx_graphsurgeon==0.5.8
[pip3] onnx-ir==0.1.12
[pip3] onnx-simplifier==0.4.36
[pip3] onnxruntime==1.21.0
[pip3] onnxruntime-gpu==1.21.0
[pip3] onnxscript==0.4.0
[pip3] onnxslim==0.1.48
[pip3] torch==2.6.0+cu118
[pip3] torchao==0.14.1
[pip3] torchaudio==2.6.0+cu118
[pip3] torchvision==0.21.0+cu118
[pip3] ...