Summary:
X-link: facebookresearch/FBGEMM#2103
Implements `get_unique_indices_cpu_impl()` to extract unique indices from linear index tensors on CPU, with comprehensive documentation and test coverage for both int32 and int64 dtypes.
Function Description
--------------------
**`get_unique_indices_cpu_impl`** processes a 1D tensor of linear indices and returns its sorted unique values, with optional metadata: the occurrence count of each unique value and the positions that reorder the input into sorted order.
### Example
```
Input: linear_indices = [20, 0, 10, 10, 0]
Output:
unique_indices = [0, 10, 20, x, x] (sorted, padded)
unique_indices_length = [3]
unique_indices_count = [2, 2, 1, x, x] (occurrence counts)
linear_index_positions_sorted = [1, 4, 2, 3, 0] (positions that sort input: linear_indices[[1,4,2,3,0]] = [0,0,10,10,20])
```
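The semantics of the example can be sketched as a pure-Python reference model. This is an illustration only, not the FBGEMM implementation: the function name `get_unique_indices_reference` and the `pad_value` parameter are hypothetical, and the padding slots marked `x` above are filled with `0` here purely as a placeholder.

```python
from collections import Counter


def get_unique_indices_reference(linear_indices, pad_value=0):
    """Reference model of the documented outputs (hypothetical helper)."""
    n = len(linear_indices)

    # Stable argsort: Python's sorted() keeps equal keys in input order.
    positions_sorted = sorted(range(n), key=lambda i: linear_indices[i])

    # Count occurrences, then list the unique values in ascending order.
    counts = Counter(linear_indices)
    unique_values = sorted(counts)
    num_unique = len(unique_values)

    # Pad both per-unique outputs to the input length.
    # pad_value is an assumption; the real padding content is unspecified.
    padding = [pad_value] * (n - num_unique)
    unique_indices = unique_values + padding
    unique_indices_count = [counts[v] for v in unique_values] + padding

    return unique_indices, [num_unique], unique_indices_count, positions_sorted


unique, length, count, positions = get_unique_indices_reference([20, 0, 10, 10, 0])
print(unique[: length[0]])  # valid prefix: [0, 10, 20]
print(count[: length[0]])   # [2, 2, 1]
print(positions)            # [1, 4, 2, 3, 0]
```

Only the first `length[0]` entries of `unique` and `count` are meaningful; the rest is padding.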
### Returns
1. **unique_indices**: Sorted unique values padded to input size (first `num_unique` elements valid)
2. **unique_indices_length**: Scalar tensor with the count of unique values (`num_unique`)
3. **unique_indices_count** (optional): Occurrence count for each unique value (int32), padded like `unique_indices`
4. **linear_index_positions_sorted** (optional): Original positions that reorder input to sorted order (int32)
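One way a caller might combine these outputs is to slice the sorted positions by the per-value counts, which yields the original positions of every occurrence of each unique index. The sketch below uses plain Python lists with the values from the example; `group_positions_by_unique` is a hypothetical helper, not part of FBGEMM, and the padding entries are written as `0` as a placeholder.

```python
def group_positions_by_unique(
    unique_indices, unique_indices_length, unique_indices_count, positions_sorted
):
    """Map each unique value to the input positions where it occurs."""
    num_unique = unique_indices_length[0]
    groups = {}
    start = 0
    # Only the first num_unique entries are valid; the rest is padding.
    for value, count in zip(
        unique_indices[:num_unique], unique_indices_count[:num_unique]
    ):
        # Consecutive runs of positions_sorted belong to consecutive unique values.
        groups[value] = positions_sorted[start : start + count]
        start += count
    return groups


groups = group_positions_by_unique(
    unique_indices=[0, 10, 20, 0, 0],        # last two entries are padding
    unique_indices_length=[3],
    unique_indices_count=[2, 2, 1, 0, 0],    # last two entries are padding
    positions_sorted=[1, 4, 2, 3, 0],
)
print(groups)  # {0: [1, 4], 10: [2, 3], 20: [0]}
```

Because the sort is stable, the positions within each group appear in their original input order.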
### Implementation Details
* Uses `at::unique_dim()` for core uniqueness computation with stable sorting
* Preserves input dtype for unique values
* Converts counts and positions to int32 for consistency with CUDA implementation
* Supports both `torch.int` (int32) and `torch.long` (int64) input dtypes
### Test Coverage
Added dtype parameterization to `test_get_unique_indices_cpu` to validate both int32 and int64, ensuring the CPU implementation supports every dtype that the CUDA implementation supports.
Differential Revision: D85736286