-
Notifications
You must be signed in to change notification settings - Fork 1k
Alamb/test boolean kernels #8793
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…table. but I don't want to pass slice of bytes as then I don't know the source and users must make sure that they hold the same promises as Buffer/MutableBuffer
…olean-buffer-builder
…m:rluvaton/arrow-rs into add-bitwise-ops-to-boolean-buffer-builder
…olean-buffer-builder
f71a57e to
b0cf38b
Compare
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
|
That is interesting that OR is slower |
| let rem = &rem.to_le_bytes()[0..remainder_bytes]; | ||
| buffer.extend_from_slice(rem); | ||
| let len_bytes = ceil(starting_bit_in_byte + len_in_bits, 8); | ||
| let mut result = left[start_byte..len_bytes].to_vec(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does an extra copy which wasn't there before (using from_trusted_len_iter)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 you are right. We do need a new allocation to write into but don't need to copy the values
It is fascinating however, that this code is often still faster than the previous one (maybe due to fewer branches)
I'll see if I can perhaps optimize the case when the offsets are zero which I think is a common case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optimization works well: #8807
| let len_in_bytes = ceil(len_in_bits, 8); | ||
| let mut result; | ||
| if offset_in_bits == 0 { | ||
| result = left.as_slice()[0..len_in_bytes].to_vec(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be needed as well
|
Thank you @Dandandan for your comments. After thinking about this some more I think there is an important difference between modify in place and create new APIs. I filed a larger potential code reorg (we could do it and keep backwards compatibility though): However, given the fact that this PR shows we can get the "create new API" to go faster, I am trying some other tricks |
|
#8807 is looking much more promising |
Which issue does this PR close?
We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax.
Rationale for this change
What changes are included in this PR?
apply_unary_opandapply_binary_opbitwise operations #8619 to the existing boolean kernelsbenchmark results for boolean_kernels
Are these changes tested?
will benchmark
If tests are not included in your PR, please explain why (for example, are they covered by existing tests)?
Are there any user-facing changes?
If there are user-facing changes then we may require documentation to be updated before approving the PR.
If there are any breaking changes to public APIs, please call them out.