Feat/quant/per block #2849

Merged
laggui merged 28 commits into main from feat/quant/per-block
Mar 3, 2025

@laggui commented Feb 27, 2025

Pull Request Template

Checklist

  • Confirmed that the run-checks all script has been executed.
  • Made sure the book is up to date with changes in this PR.

Changes

More quantization granularity!

  • Refactored QuantizationScheme enum
  • Changed Calibration to an enum
  • Added per-block quantization
    • Flat: linear segments (implemented for ndarray and cubecl backends)
    • Grid: m x n blocks (ndarray only via QuantizationStrategy)
    • Quantization parameters are stored as [offset_1, offset_2, ..., offset_num_blocks, scale_1, scale_2, ..., scale_num_blocks] (with offsets being optional)
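To make the flat scheme and the parameter layout concrete, here is a minimal sketch in plain Rust. It is not Burn's API: the function names, the int8 target range, and the min/max calibration are all assumptions chosen for illustration. It splits a flat slice into linear blocks, computes one scale (and optionally one offset) per block, and packs the parameters as all offsets followed by all scales, as described above.

```rust
/// Hypothetical helper (not part of Burn): quantize a flat slice to i8 using
/// one scale, and optionally one offset, per linear block of `block_size`.
/// Parameters are packed as [offset_1..offset_n, scale_1..scale_n]; the
/// offsets are omitted entirely when `affine` is false (symmetric case).
fn quantize_per_block_flat(values: &[f32], block_size: usize, affine: bool) -> (Vec<i8>, Vec<f32>) {
    assert!(block_size > 0 && values.len() % block_size == 0);
    let num_blocks = values.len() / block_size;
    let mut quantized = Vec::with_capacity(values.len());
    let mut offsets = Vec::with_capacity(num_blocks);
    let mut scales = Vec::with_capacity(num_blocks);

    for block in values.chunks(block_size) {
        let (scale, offset) = if affine {
            // Assumed min/max calibration; the range is widened to include 0.
            let min = block.iter().copied().fold(0.0f32, f32::min);
            let max = block.iter().copied().fold(0.0f32, f32::max);
            let scale = ((max - min) / 255.0).max(f32::MIN_POSITIVE);
            (scale, (-128.0 - min / scale).round())
        } else {
            let abs_max = block.iter().fold(0.0f32, |m, v| m.max(v.abs()));
            ((abs_max / 127.0).max(f32::MIN_POSITIVE), 0.0)
        };
        for &v in block {
            quantized.push((v / scale + offset).round().clamp(-128.0, 127.0) as i8);
        }
        offsets.push(offset);
        scales.push(scale);
    }

    // Pack the parameters: offsets first (if any), then scales.
    let mut params = Vec::with_capacity(2 * num_blocks);
    if affine {
        params.extend(offsets);
    }
    params.extend(scales);
    (quantized, params)
}

/// Hypothetical inverse: unpack the parameters and dequantize block by block.
fn dequantize_per_block_flat(quantized: &[i8], params: &[f32], block_size: usize, affine: bool) -> Vec<f32> {
    let num_blocks = quantized.len() / block_size;
    let (offsets, scales) = if affine {
        params.split_at(num_blocks)
    } else {
        (&params[..0], params)
    };
    quantized
        .chunks(block_size)
        .enumerate()
        .flat_map(|(b, block)| {
            let offset = if affine { offsets[b] } else { 0.0 };
            let scale = scales[b];
            block.iter().map(move |&q| (q as f32 - offset) * scale)
        })
        .collect()
}
```

With 8 values and a block size of 4, the affine case yields 4 parameters (2 offsets, then 2 scales) and the symmetric case yields 2 (scales only). The grid variant would differ only in how elements are assigned to blocks.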

Test utils:

  • Added a #[might_panic] test attribute (for op configurations that are not strictly required, e.g. different quantization schemes)
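The description does not spell out how the attribute behaves. As a rough illustration of the idea only, the sketch below assumes that a test is allowed to panic when the panic message starts with an expected reason, and that any other panic is still a failure. The helper name and signature are hypothetical and are not the macro's implementation.

```rust
use std::panic::{self, UnwindSafe};

/// Hypothetical stand-in for the idea behind a "might panic" test:
/// returns `true` if `body` ran to completion, `false` if it panicked with a
/// message starting with `reason` (tolerated), and re-raises any other panic.
fn run_might_panic<F: FnOnce() + UnwindSafe>(reason: &str, body: F) -> bool {
    match panic::catch_unwind(body) {
        Ok(()) => true,
        Err(payload) => {
            // Panic payloads are usually either `&str` or `String`.
            let message = payload
                .downcast_ref::<&str>()
                .map(|s| s.to_string())
                .or_else(|| payload.downcast_ref::<String>().cloned())
                .unwrap_or_default();
            if message.starts_with(reason) {
                false
            } else {
                panic::resume_unwind(payload)
            }
        }
    }
}
```

Under this assumption, a backend that does not support a given quantization scheme can panic with the expected reason without failing the suite, while unexpected panics still surface.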

For the CI:

  • Disabled incremental compilation for the test profile (this significantly reduces the total artifact size, finally fixing the intermittent "No space left on device" issues).

Testing

Unit tests for new schemes

codecov bot commented Feb 27, 2025

Codecov Report

Attention: Patch coverage is 83.27781% with 251 lines in your changes missing coverage. Please review.

Project coverage is 82.29%. Comparing base (17d9753) to head (24d5857).
Report is 6 commits behind head on main.

Files with missing lines                                  Patch %   Lines
...es/burn-cubecl/src/kernel/quantization/quantize.rs     67.16%    88 Missing ⚠️
.../burn-cubecl/src/kernel/quantization/dequantize.rs     60.95%    41 Missing ⚠️
crates/burn-tch/src/ops/qtensor.rs                        0.00%     38 Missing ⚠️
...tes/burn-cubecl/src/kernel/quantization/qtensor.rs     34.54%    36 Missing ⚠️
crates/burn-tch/src/tensor.rs                             0.00%     11 Missing ⚠️
...es/burn-tensor/src/tensor/quantization/strategy.rs     96.20%    11 Missing ⚠️
crates/burn-tensor-testgen/src/lib.rs                     88.88%    9 Missing ⚠️
...ates/burn-tensor/src/tensor/quantization/scheme.rs     94.54%    6 Missing ⚠️
crates/burn-tensor/src/tensor/element/base.rs             0.00%     3 Missing ⚠️
crates/burn-tensor/src/tensor/data.rs                     88.23%    2 Missing ⚠️
... and 4 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2849      +/-   ##
==========================================
+ Coverage   82.18%   82.29%   +0.11%     
==========================================
  Files         854      861       +7     
  Lines      114059   116887    +2828     
==========================================
+ Hits        93734    96194    +2460     
- Misses      20325    20693     +368     


@laggui force-pushed the feat/quant/per-block branch 2 times, most recently from 4a083fa to 08a4b7f (February 28, 2025 17:10)
@laggui force-pushed the feat/quant/per-block branch from 08a4b7f to a7bf68b (February 28, 2025 17:18)
@laggui marked this pull request as ready for review (February 28, 2025 20:42)
@laggui merged commit a6b5210 into main (Mar 3, 2025)
11 checks passed
@laggui deleted the feat/quant/per-block branch (March 3, 2025 17:59)