-
Couldn't load subscription status.
- Fork 46
FIX: larger inputs for sorting tests - to test sorts stability #390
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Confirmed it errors, out but I don't understand the failure TBH. What goes wrong with sorting an array full or unsigned int8 zeros, could you please explain? |
|
When you're stable-argsorting a constant array, you expected the output to be Note: The test would be much more explicit without using hypothesis, it would look like: |
|
Thanks for the explanation!
That would be nice indeed. We tend to keep library-specific tests for behaviors which differ between "bare" and array-api-compat wrapped libraries in array-api-compat itself, so this test would be a great addition to |
Note that it's probably a common trait of sorting algorithms, in numpy: np.argsort(np.zeros(16, dtype='int8')) # stable
np.argsort(np.zeros(17, dtype='int8')) # unstable(you don't see this behavior for dtypes that benefit from SIMD optimization though) It seems that it's a common optimization to switch to insertion sort for small arrays, and insertion sort is inherently stable. See https://github.com/numpy/numpy/blob/main/numpy/_core/src/npysort/quicksort.cpp#L70 in numpy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me.
I had to go clicking through the many layers of hypothesis and the array API helpers but finally got to https://github.com/HypothesisWorks/hypothesis/blob/7b93827d14d21ec27cf3e2fd45ee5ace904082f1/hypothesis-python/src/hypothesis/extra/_array_helpers.py#L81 which helped me understand what min_side and max_side do :D
The fact that you have to click a lot to get there and somehow min_side doesn't scream "this is the size of the array" to me I think it would be good to add a comment about what the goal of the max_side=50 is (you have to understand what "side" means and that 16/17 is a magic number. Neither is really explicit from max_side=50 if. you ask me). Just to make the life of people from the future easier.
The best best solution would be if we could explicitly add a "small size" and a "large size" in addition to those that hypothesis generates. That way it would be clear what the magic numbers are (16 and 17). But maybe this is too big a change for the scope of this PR (which is nice and compact).
|
I agree this would be quite hard to understand for future readers. I added a short comment that reference the issue #389, that should already be much easier to understand now. |
|
Thanks @cakedev0 for getting to the bottom of this, thanks @betatim for the review! |
cross-ref data-apis#356 which wrapped torch.argsort to fix the default, and data-apis/array-api-tests#390 which made a matching change in the array-api-test suite.
cross-ref data-apis#356 which wrapped torch.argsort to fix the default, and data-apis/array-api-tests#390 which made a matching change in the array-api-test suite.
Fixes: #389
To test sort stability of array libraries, you usually need arrays of more than 16 elements (because for less than 16, libs often default to an inherently stable sort, even for unstable algorithms).
Because of this was not tested properly, a gap in array-api-compat was not detected by this test suite. See issue data-apis/array-api-compat#354. You can try running:
ARRAY_API_TESTS_VERSION="2024.12" ARRAY_API_TESTS_MODULE=array_api_compat.torch pytest array_api_tests/test_sorting_functions.pywith this branch of
array-api-teststo see it indeed fails.