Fix indexing overflow issue for blockwise quantization #1784

matthewdouglas · 2025-10-21T20:21:52Z

This PR resolves an issue with int32 overflows in indexing calculations used by the blockwise quantization and dequantization kernels. It also adds tests to verify that quantization and dequantization work on tensors with the maximum supported size of 2**31 - 1 elements. Prior to this fix, the quantization kernel had issues with tensors larger than 2**30 elements.
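To illustrate how this kind of overflow can arise, here is a minimal sketch in plain Python. It is not the kernel code from this PR: it assumes a hypothetical grid-stride loop in which each block starts at `block_id * block_size` and advances its index by `num_blocks * block_size`, and it emulates the two's-complement wraparound usually seen when a 32-bit signed index overflows (formally undefined behavior in C++). The helper names `wrap_int32` and `advance_last_block` are made up for this example.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(x: int) -> int:
    """Emulate two's-complement wraparound of a 32-bit signed integer."""
    return (x + 2**31) % 2**32 - 2**31


def advance_last_block(n: int, block_size: int = 4096) -> tuple[int, int]:
    """Advance the last block's index by one grid stride.

    Hypothetical indexing scheme (an assumption, not the actual kernel):
    one block per `block_size` elements, stride = num_blocks * block_size.
    Returns (index as a wrapped int32, index computed without overflow).
    """
    num_blocks = -(-n // block_size)           # ceil(n / block_size)
    stride = num_blocks * block_size           # roughly n, still fits in int32
    last_base = (num_blocks - 1) * block_size  # start offset of the last block
    exact = last_base + stride                 # what a 64-bit index would hold
    return wrap_int32(exact), exact


# At exactly 2**30 elements the next index still fits in int32.
narrow_ok, wide_ok = advance_last_block(2**30)

# One block above 2**30 elements, the int32 sum wraps to a negative value,
# so a loop condition such as `i < n` would still pass with an invalid index.
narrow_bad, wide_bad = advance_last_block(2**30 + 4096)

print(narrow_ok == wide_ok, wide_ok <= INT32_MAX)
print(narrow_bad, wide_bad)
```

Under these assumptions, the index itself never needs to exceed 2**31 - 1 for valid elements, but the intermediate sum does once the tensor has more than 2**30 elements. Widening the intermediate arithmetic to 64 bits is one common way to avoid that; whether this PR uses that exact approach is not stated here.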


github-actions · 2025-10-21T20:26:45Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Fix indexing overflow issue for blockwise quantization with large tensor sizes

cee93db

matthewdouglas added this to the v0.48.2 milestone Oct 21, 2025

matthewdouglas added the CUDA label (Issues and PRs related to the CUDA backend, excluding installation/support help) Oct 21, 2025

matthewdouglas mentioned this pull request Oct 21, 2025

Illegal memory access with quantize_4bit #1782

Closed

matthewdouglas merged commit 34400d2 into main Oct 22, 2025
252 of 264 checks passed
