Add get_unique_indices on CPU
#5096
Open: gchalump wants to merge 1 commit into pytorch:main from gchalump:export-D85736286
+509
−0
✅ Deploy Preview for pytorch-fbgemm-docs ready!
505c473 to
e6346cf
Compare
gchalump added a commit to gchalump/FBGEMM that referenced this pull request on Nov 7, 2025

Summary:
X-link: facebookresearch/FBGEMM#2103

Add `get_unique_indices` on CPU
Add test to compare `get_unique_indices` from CPU with GPU

Differential Revision: D85736286
gchalump added a commit to gchalump/FBGEMM that referenced this pull request on Nov 7, 2025
e6346cf to
dec139a
Compare
gchalump added a commit to gchalump/FBGEMM that referenced this pull request on Nov 7, 2025
6452f4a to
e457285
Compare
gchalump added a commit to gchalump/FBGEMM that referenced this pull request on Nov 10, 2025

Summary:
X-link: facebookresearch/FBGEMM#2103

Implements `get_unique_indices_cpu_impl()` to extract unique indices from linear index tensors on CPU, with comprehensive documentation and test coverage for both int32 and int64 dtypes.

Function Description
--------------------

**`get_unique_indices_cpu_impl`** processes a 1D tensor of linear indices and returns unique values with optional metadata (counts and inverse mapping for reordering).

### Example

```
Input:
  linear_indices = [20, 0, 10, 10, 0]

Output:
  unique_indices = [0, 10, 20, x, x]  (sorted, padded)
  unique_indices_length = [3]
  unique_indices_count = [2, 2, 1, x, x]  (occurrence counts)
  linear_index_positions_sorted = [1, 4, 2, 3, 0]
    (positions that sort input: linear_indices[[1,4,2,3,0]] = [0,0,10,10,20])
```

### Returns

1. **unique_indices**: Sorted unique values padded to input size (first `num_unique` elements valid)
2. **unique_indices_length**: Scalar tensor with count of unique values
3. **unique_indices_count** (optional): Occurrence count for each unique value
4. **linear_index_positions_sorted** (optional): Original positions that reorder input to sorted order (int32)

### Implementation Details

* Uses `at::unique_dim()` for core uniqueness computation with stable sorting
* Preserves input dtype for unique values
* Converts counts and positions to int32 for consistency with CUDA implementation
* Supports both `torch.int` (int32) and `torch.long` (int64) input dtypes

### Test Coverage

Added dtype parameterization to `test_get_unique_indices_cpu` to validate both int32 and int64, ensuring the CPU implementation supports all dtypes that the CUDA implementation supports.

Differential Revision: D85736286
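The four outputs described in the summary above can be illustrated with a small pure-Python reference. This is only a sketch of the documented semantics, not the FBGEMM implementation: the function name `get_unique_indices_reference` is hypothetical, plain lists stand in for tensors, and `None` is an assumed stand-in for the unspecified padding values shown as `x` in the example.

```python
from typing import List, Optional, Tuple


def get_unique_indices_reference(
    linear_indices: List[int],
) -> Tuple[List[Optional[int]], List[int], List[Optional[int]], List[int]]:
    """Hypothetical reference for the outputs described in the PR summary.

    Returns (unique_indices, unique_indices_length, unique_indices_count,
    linear_index_positions_sorted). Padding slots are filled with None
    because the summary leaves their contents unspecified.
    """
    n = len(linear_indices)

    # Python's sorted() is stable, so equal values keep their original order.
    positions = sorted(range(n), key=lambda i: linear_indices[i])

    unique: List[Optional[int]] = []
    counts: List[Optional[int]] = []
    for pos in positions:
        value = linear_indices[pos]
        if unique and unique[-1] == value:
            counts[-1] += 1  # another occurrence of the current unique value
        else:
            unique.append(value)  # first occurrence of a new unique value
            counts.append(1)

    num_unique = len(unique)
    padding = [None] * (n - num_unique)  # pad both outputs to the input size
    return unique + padding, [num_unique], counts + padding, positions


# Usage with the input from the summary's example.
unique, length, counts, positions = get_unique_indices_reference([20, 0, 10, 10, 0])
print(unique)     # [0, 10, 20, None, None]
print(length)     # [3]
print(counts)     # [2, 2, 1, None, None]
print(positions)  # [1, 4, 2, 3, 0]
```

Gathering the input at `positions` yields `[0, 0, 10, 10, 20]`, which matches the sorted order given in the example. A reference like this could serve as an oracle when comparing CPU and GPU results, provided only the first `length[0]` entries of the padded outputs are compared.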
gchalump added a commit to gchalump/FBGEMM that referenced this pull request on Nov 10, 2025
d92220d to
1ac09f4
Compare
gchalump added a commit to gchalump/FBGEMM that referenced this pull request on Nov 11, 2025
gchalump added a commit to gchalump/FBGEMM that referenced this pull request on Nov 11, 2025
750bbec to
1621b41
Compare
gchalump added a commit to gchalump/FBGEMM that referenced this pull request on Nov 12, 2025
gchalump added a commit to gchalump/FBGEMM that referenced this pull request on Nov 12, 2025
1621b41 to
2db46f1
Compare
2db46f1 to
a867c4b
Compare
Summary:
Add `get_unique_indices` on CPU
Add test to compare `get_unique_indices` from CPU with GPU

Differential Revision: D85736286