[Parquet] Adaptive Parquet Predicate Pushdown #8733

hhhizzz · 2025-10-28T18:30:05Z

Which issue does this PR close?

Closes [Parquet] Performance Degradation with RowFilter on Unsorted Columns due to Fragmented ReadPlan #8565
Closes Adaptive Parquet Predicate Pushdown Evaluation #5523

Rationale for this change

Predicate pushdown and row filters often yield very short alternating select/skip runs. Materialising those runs through a VecDeque<RowSelector> is allocation-heavy, and it can panic when entire pages are skipped and never decoded. This PR introduces a mask-backed execution path for short selectors, adds the heuristics and guards needed to decide between masks and selectors, and provides tooling to measure the trade-offs so we can tune the threshold confidently.
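
To make the contrast concrete, here is a minimal sketch of the two representations. The `RowSelector` struct below is a local stand-in modelled on the parquet crate's type, and `selectors_to_mask` and `avg_selector_len` are hypothetical helpers written for illustration; they are not the functions added in this PR.

```rust
/// Local stand-in for a run-length selector (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RowSelector {
    pub row_count: usize,
    pub skip: bool,
}

impl RowSelector {
    /// A run of `row_count` rows that should be read.
    pub fn select(row_count: usize) -> Self {
        Self { row_count, skip: false }
    }

    /// A run of `row_count` rows that should be skipped.
    pub fn skip(row_count: usize) -> Self {
        Self { row_count, skip: true }
    }
}

/// Expand run-length selectors into one boolean per row (`true` = keep the row).
pub fn selectors_to_mask(selectors: &[RowSelector]) -> Vec<bool> {
    let total: usize = selectors.iter().map(|s| s.row_count).sum();
    let mut mask = Vec::with_capacity(total);
    for s in selectors {
        mask.extend(std::iter::repeat(!s.skip).take(s.row_count));
    }
    mask
}

/// Average run length. A small value means many tiny select/skip runs,
/// which is the case the mask-backed path is meant to handle.
pub fn avg_selector_len(selectors: &[RowSelector]) -> f64 {
    if selectors.is_empty() {
        return 0.0;
    }
    let total: usize = selectors.iter().map(|s| s.row_count).sum();
    total as f64 / selectors.len() as f64
}
```

For a selection such as select 1, skip 2, select 1, the selector queue holds three entries with an average length of about 1.3 rows, while the equivalent mask is just four booleans.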

What changes are included in this PR?

Introduced RowSelectionStrategy plus a RowSelectionCursor that can iterate either a boolean mask or the legacy selector queue, along with a public guard/override so tests and benchmarks can tweak the average-selector-length heuristic.
ReadPlanBuilder now trims selections, estimates their density, and chooses the faster mask strategy when safe; ParquetRecordBatchReader streams mask chunks, skips contiguous gaps, filters the projected batch with Arrow’s boolean kernel, and still falls back to selectors when needed.
RowSelection can now inspect offset indexes to detect when a selection would skip entire pages; both the synchronous and asynchronous readers consult that signal so row filters no longer panic when predicate pruning drops whole pages.
Added the row_selection_state Criterion benchmark plus dev/row_selection_analysis.py to run the bench, export CSV summaries, and render comparative plots across selector lengths, column widths, and Utf8View payload sizes; wired the bench into parquet/Cargo.toml.
Expanded unit/async regression coverage around mask iteration (test_row_selection_interleaved_skip, test_row_selection_mask_sparse_rows, test_row_filter_full_page_skip_is_handled and its async twin) and updated the push-decoder size assertion to reflect the new state.
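
The strategy decision described above can be sketched roughly as follows. This is an assumption-laden illustration, not the code in this PR: the `Strategy` enum, `choose_strategy`, and the threshold value of 32 are all hypothetical, and the comparison direction (mask when the average run is shorter than the threshold) is inferred from the description.

```rust
/// Hypothetical stand-in for the strategy choice described in the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Strategy {
    /// Decode rows and filter them with a boolean mask.
    Mask,
    /// Walk the run-length selector queue (the legacy path).
    Selectors,
}

/// Made-up default for illustration; the PR exposes an override for tuning.
pub const AVG_SELECTOR_LEN_MASK_THRESHOLD: usize = 32;

/// Pick a strategy from the run lengths of a selection.
///
/// `skips_whole_page` models the guard: if the selection drops an entire
/// page, the sketch falls back to selectors so that page is never touched.
pub fn choose_strategy(
    run_lengths: &[usize],
    threshold: usize,
    skips_whole_page: bool,
) -> Strategy {
    if skips_whole_page || run_lengths.is_empty() {
        return Strategy::Selectors;
    }
    let total: usize = run_lengths.iter().sum();
    // avg < threshold  <=>  total < threshold * len, without float division.
    if total < threshold.saturating_mul(run_lengths.len()) {
        Strategy::Mask
    } else {
        Strategy::Selectors
    }
}
```

With these assumptions, runs of lengths 1, 2, 1, 3 pick the mask path, two long runs of 1000 and 5000 rows stay on selectors, and any selection flagged as skipping a whole page stays on selectors regardless of its average run length.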

Are these changes tested?

Added the synchronous and async row-filter regression tests above plus new ReadPlanBuilder threshold tests; the Criterion bench + Python tooling provide manual validation for performance
tuning. Full parquet/arrow test suites will still run in CI.

Are there any user-facing changes?

Row-selection heavy scans should see lower CPU usage because dense masks are streamed directly, while the new guards prevent predicate-pruned queries from panicking.
Advanced users now have access to RowSelectionStrategy, RowSelectionCursor, and set_avg_selector_len_mask_threshold for experimentation, and developers gain the new benchmarking/plotting workflow. No breaking API changes were introduced.
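
As a rough illustration of the full-page-skip guard mentioned above, the check below reports whether a selection leaves at least one page with no selected rows. It is a sketch under stated assumptions: `skips_whole_page` is a hypothetical function, page boundaries are passed as a plain slice of first-row indexes (the kind of information an offset index provides), and the selection is given as row ranges rather than the crate's own types.

```rust
use std::ops::Range;

/// Returns `true` if at least one non-empty page contains no selected row.
///
/// `page_starts` holds the first row index of each page in ascending order;
/// the last page ends at `total_rows`. `selected` holds the selected row ranges.
pub fn skips_whole_page(
    page_starts: &[usize],
    total_rows: usize,
    selected: &[Range<usize>],
) -> bool {
    page_starts.iter().enumerate().any(|(i, &start)| {
        let end = page_starts.get(i + 1).copied().unwrap_or(total_rows);
        if start >= end {
            // Empty page: there is nothing to skip.
            return false;
        }
        // The page is fully skipped if no non-empty selected range overlaps it.
        !selected
            .iter()
            .any(|r| r.start < r.end && r.start < end && r.end > start)
    })
}
```

For three pages of 100 rows each, selecting rows 0..10 and 250..260 leaves the middle page untouched, so the check returns `true`; adding a single selected row at 150 makes it return `false`.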

hhhizzz · 2025-10-28T18:32:13Z

Benchmark results for the version without the synthetic page:

group                                                                                main                                   rowselectionempty
-----                                                                                ----                                   -----------------
arrow_reader_row_filter/float64 <= 99.0/all_columns/async                            1.00   911.5±13.20µs        ? ?/sec    1.01   923.9±12.05µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/all_columns/sync                             1.02   954.2±15.29µs        ? ?/sec    1.00   938.6±39.67µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/async                  1.00   840.3±12.74µs        ? ?/sec    1.02   853.4±10.03µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/sync                   1.02   857.7±32.24µs        ? ?/sec    1.00   839.3±14.74µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/async              1.00   733.3±10.42µs        ? ?/sec    1.01   737.1±11.33µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/sync               1.00   829.7±31.73µs        ? ?/sec    1.00   828.6±14.14µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/async    1.00    707.6±9.97µs        ? ?/sec    1.01    714.4±9.14µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/sync     1.00    705.4±7.10µs        ? ?/sec    1.01   712.6±12.49µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/async                             1.02   950.4±10.18µs        ? ?/sec    1.00   932.2±15.97µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/sync                              1.09  1012.0±76.49µs        ? ?/sec    1.00   928.7±14.82µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/async                   1.01   877.7±11.81µs        ? ?/sec    1.00   866.7±34.30µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/sync                    1.04   902.6±97.27µs        ? ?/sec    1.00   868.2±36.74µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/async                              1.12   521.9±62.45µs        ? ?/sec    1.00    466.3±8.31µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/sync                               1.27  596.3±142.09µs        ? ?/sec    1.00   470.3±10.87µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/async                    1.04    483.9±9.56µs        ? ?/sec    1.00    464.3±6.78µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/sync                     1.02    478.0±6.52µs        ? ?/sec    1.00   469.7±19.68µs        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/async                                 1.38      2.2±0.03ms        ? ?/sec    1.00  1588.8±20.90µs        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/sync                                  1.72      2.4±0.03ms        ? ?/sec    1.00  1380.1±14.85µs        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/async                       1.19  1834.0±23.61µs        ? ?/sec    1.00  1539.5±25.19µs        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/sync                        1.48  1915.9±24.09µs        ? ?/sec    1.00  1291.0±15.63µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/async                                  1.00  1115.4±22.07µs        ? ?/sec    1.01  1122.2±25.25µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/sync                                   1.00  1213.8±63.20µs        ? ?/sec    1.02  1233.9±38.11µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/async                        1.00  1048.7±18.78µs        ? ?/sec    1.03  1081.6±54.03µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/sync                         1.01  1060.6±21.15µs        ? ?/sec    1.00  1055.0±18.67µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/async                                 1.01   684.7±26.23µs        ? ?/sec    1.00   675.6±25.76µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/sync                                  1.04   744.8±85.25µs        ? ?/sec    1.00    716.5±9.81µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/async                       1.00   638.4±21.72µs        ? ?/sec    1.00   637.1±11.83µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/sync                        1.00   634.8±15.76µs        ? ?/sec    1.01   639.2±14.18µs        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/async                             1.27      2.3±0.03ms        ? ?/sec    1.00  1805.7±35.68µs        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/sync                              1.41      2.9±0.07ms        ? ?/sec    1.00      2.1±0.03ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/async                   1.19  1902.3±26.55µs        ? ?/sec    1.00  1601.7±18.59µs        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/sync                    1.45      2.1±0.05ms        ? ?/sec    1.00  1417.9±49.18µs        ? ?/sec

alamb · 2025-10-29T21:14:04Z

😮 thank you @hhhizzz -- I plan to review this PR carefully, but it will likely take me a few days

alamb · 2025-10-29T21:14:12Z

fyi @zhuqi-lucas and @XiangpengHao

alamb · 2025-10-30T09:29:07Z

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1017-gcp #18~24.04.1-Ubuntu SMP Tue Sep 23 17:51:44 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing rowselectionempty (14647e1) to 5744743
BENCH_NAME=arrow_reader
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader
BENCH_FILTER=
BENCH_BRANCH_NAME=rowselectionempty
Results will be posted here when complete

alamb · 2025-10-30T10:49:21Z

🤖: Benchmark completed

group                                                                                                      main                                   rowselectionempty
-----                                                                                                      ----                                   -----------------
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs                           1.00   1099.8±2.52µs        ? ?/sec    1.16   1274.8±3.26µs        ? ?/sec
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs                          1.00   1269.3±2.75µs        ? ?/sec    1.03   1307.9±4.94µs        ? ?/sec
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                            1.00   1107.4±3.98µs        ? ?/sec    1.16   1281.4±2.80µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, mandatory, no NULLs                                     1.05    513.1±6.17µs        ? ?/sec    1.00    486.6±3.42µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, optional, half NULLs                                    1.00    660.4±3.25µs        ? ?/sec    1.02    673.5±6.14µs        ? ?/sec
arrow_array_reader/BinaryArray/dictionary encoded, optional, no NULLs                                      1.03    506.2±2.98µs        ? ?/sec    1.00    493.5±3.45µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, mandatory, no NULLs                                          1.04    572.5±2.07µs        ? ?/sec    1.00    552.8±2.47µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, optional, half NULLs                                         1.00    725.3±4.38µs        ? ?/sec    1.01    731.3±3.60µs        ? ?/sec
arrow_array_reader/BinaryArray/plain encoded, optional, no NULLs                                           1.03    585.1±3.07µs        ? ?/sec    1.00    565.3±3.97µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, mandatory, no NULLs                                 1.00    238.9±3.00µs        ? ?/sec    1.14    271.5±2.70µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, optional, half NULLs                                1.00    214.6±0.58µs        ? ?/sec    1.24    266.2±1.12µs        ? ?/sec
arrow_array_reader/BinaryViewArray/dictionary encoded, optional, no NULLs                                  1.00    235.8±2.71µs        ? ?/sec    1.18    278.6±3.64µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs                                      1.21    354.3±2.88µs        ? ?/sec    1.00    292.4±4.95µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, mandatory, no NULLs, short string                        1.16    326.1±0.58µs        ? ?/sec    1.00    282.3±1.53µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, optional, half NULLs                                     1.04    291.0±2.36µs        ? ?/sec    1.00    279.3±1.39µs        ? ?/sec
arrow_array_reader/BinaryViewArray/plain encoded, optional, no NULLs                                       1.20    359.2±5.50µs        ? ?/sec    1.00    299.9±2.58µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs     1.00    980.1±3.50µs        ? ?/sec    1.14   1122.1±9.70µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, half NULLs    1.00    847.1±2.21µs        ? ?/sec    1.14    966.6±2.17µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/byte_stream_split encoded, optional, no NULLs      1.00    985.8±2.30µs        ? ?/sec    1.15   1133.0±3.28µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs                 1.00    306.5±4.21µs        ? ?/sec    1.46    446.5±4.50µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs                1.00    478.8±1.42µs        ? ?/sec    1.32    633.6±7.02µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs                  1.00    311.6±5.15µs        ? ?/sec    1.46    456.1±3.48µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, mandatory, no NULLs        1.00    161.2±0.81µs        ? ?/sec    1.26    202.8±0.55µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, half NULLs       1.00    303.1±0.74µs        ? ?/sec    1.13    343.6±0.47µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/byte_stream_split encoded, optional, no NULLs         1.00    166.7±0.32µs        ? ?/sec    1.25    208.1±0.38µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, mandatory, no NULLs                    1.00     77.7±0.31µs        ? ?/sec    1.52    118.2±0.36µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, half NULLs                   1.00    260.3±0.57µs        ? ?/sec    1.16    300.7±0.87µs        ? ?/sec
arrow_array_reader/FIXED_LEN_BYTE_ARRAY/Float16Array/plain encoded, optional, no NULLs                     1.00     84.1±0.30µs        ? ?/sec    1.46    123.2±0.17µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, mandatory, no NULLs                    1.00    738.5±1.43µs        ? ?/sec    1.00    737.3±2.45µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, optional, half NULLs                   1.00    580.0±2.20µs        ? ?/sec    1.02    591.8±2.08µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/byte_stream_split encoded, optional, no NULLs                     1.00    743.9±2.69µs        ? ?/sec    1.00    743.2±2.36µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, mandatory, no NULLs                                1.00     64.7±4.61µs        ? ?/sec    1.01     65.5±5.32µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, optional, half NULLs                               1.00    245.5±1.68µs        ? ?/sec    1.03    252.8±1.09µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(16)/plain encoded, optional, no NULLs                                 1.00     73.1±6.90µs        ? ?/sec    1.03     75.4±1.73µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, mandatory, no NULLs                     1.00     94.5±0.22µs        ? ?/sec    1.00     94.6±0.23µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, optional, half NULLs                    1.00    235.9±0.94µs        ? ?/sec    1.00    235.0±1.30µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/byte_stream_split encoded, optional, no NULLs                      1.00     99.7±0.48µs        ? ?/sec    1.01    100.2±1.65µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, mandatory, no NULLs                                 1.05      9.8±0.13µs        ? ?/sec    1.00      9.3±0.09µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, optional, half NULLs                                1.00    193.5±0.31µs        ? ?/sec    1.00    192.7±0.72µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(2)/plain encoded, optional, no NULLs                                  1.04     15.1±0.28µs        ? ?/sec    1.00     14.4±0.14µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, mandatory, no NULLs                     1.00    184.5±0.56µs        ? ?/sec    1.00    185.1±0.80µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, optional, half NULLs                    1.00    346.8±0.82µs        ? ?/sec    1.00    348.1±2.56µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/byte_stream_split encoded, optional, no NULLs                      1.00    190.0±0.52µs        ? ?/sec    1.00    190.9±0.99µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, mandatory, no NULLs                                 1.02     14.7±0.32µs        ? ?/sec    1.00     14.4±0.31µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, optional, half NULLs                                1.00    262.2±0.84µs        ? ?/sec    1.00    262.6±1.26µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(4)/plain encoded, optional, no NULLs                                  1.00     20.1±0.58µs        ? ?/sec    1.02     20.5±0.36µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, mandatory, no NULLs                     1.00    364.7±1.81µs        ? ?/sec    1.01    367.8±1.49µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, optional, half NULLs                    1.00    383.5±1.16µs        ? ?/sec    1.01    388.4±1.63µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/byte_stream_split encoded, optional, no NULLs                      1.00    372.1±0.72µs        ? ?/sec    1.01    374.9±1.16µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, mandatory, no NULLs                                 1.00     26.4±0.30µs        ? ?/sec    1.05     27.7±0.53µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, optional, half NULLs                                1.00    215.4±0.94µs        ? ?/sec    1.01    218.2±1.01µs        ? ?/sec
arrow_array_reader/FixedLenByteArray(8)/plain encoded, optional, no NULLs                                  1.00     30.6±0.46µs        ? ?/sec    1.16     35.5±0.52µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, mandatory, no NULLs                           1.00    124.8±0.51µs        ? ?/sec    1.00    124.3±0.20µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, optional, half NULLs                          1.12    140.0±0.78µs        ? ?/sec    1.00    125.1±1.22µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed skip, optional, no NULLs                            1.00    127.4±0.77µs        ? ?/sec    1.00    127.4±0.33µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, mandatory, no NULLs                                1.00    178.7±0.42µs        ? ?/sec    1.00    178.3±1.56µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, optional, half NULLs                               1.14    236.1±1.84µs        ? ?/sec    1.00    207.6±1.98µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/binary packed, optional, no NULLs                                 1.00    183.6±0.37µs        ? ?/sec    1.00    183.9±1.78µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs                    1.01     76.3±0.21µs        ? ?/sec    1.00     75.5±0.25µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, optional, half NULLs                   1.15    181.2±0.83µs        ? ?/sec    1.00    157.3±2.15µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/byte_stream_split encoded, optional, no NULLs                     1.01     83.0±0.37µs        ? ?/sec    1.00     82.1±0.18µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, mandatory, no NULLs                           1.00    135.3±0.37µs        ? ?/sec    1.06    143.3±2.28µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, optional, half NULLs                          1.11    214.0±0.96µs        ? ?/sec    1.00    192.2±2.48µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, optional, no NULLs                            1.00    141.2±0.32µs        ? ?/sec    1.05    148.7±0.34µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, mandatory, no NULLs                                1.00     74.7±0.33µs        ? ?/sec    1.00     74.6±0.31µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, optional, half NULLs                               1.16    177.7±0.61µs        ? ?/sec    1.00    153.6±0.51µs        ? ?/sec
arrow_array_reader/INT32/Decimal128Array/plain encoded, optional, no NULLs                                 1.01     78.6±0.27µs        ? ?/sec    1.00     77.9±0.28µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, mandatory, no NULLs                           1.00    111.9±0.23µs        ? ?/sec    1.02    114.5±0.26µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, optional, half NULLs                          1.00    124.4±0.42µs        ? ?/sec    1.07    133.7±0.60µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed skip, optional, no NULLs                            1.02    117.7±2.19µs        ? ?/sec    1.00    115.7±0.32µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, mandatory, no NULLs                                1.00    169.4±0.63µs        ? ?/sec    1.01    170.7±0.31µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, optional, half NULLs                               1.00    211.4±1.55µs        ? ?/sec    1.13    239.2±0.76µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/binary packed, optional, no NULLs                                 1.00    175.8±1.56µs        ? ?/sec    1.00    175.9±0.33µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, mandatory, no NULLs                    1.00    202.0±0.31µs        ? ?/sec    1.00    201.2±0.43µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, optional, half NULLs                   1.00    225.6±0.64µs        ? ?/sec    1.12    252.9±3.10µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/byte_stream_split encoded, optional, no NULLs                     1.00    208.6±1.41µs        ? ?/sec    1.00    208.3±0.65µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, mandatory, no NULLs                           1.00    144.6±0.53µs        ? ?/sec    1.07    154.4±4.00µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, optional, half NULLs                          1.00    193.6±1.10µs        ? ?/sec    1.15    221.9±0.54µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, optional, no NULLs                            1.00    148.7±0.40µs        ? ?/sec    1.05    155.5±1.67µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, mandatory, no NULLs                                1.00    107.7±0.78µs        ? ?/sec    1.01    108.3±2.06µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, optional, half NULLs                               1.00    172.1±1.13µs        ? ?/sec    1.16    199.4±1.08µs        ? ?/sec
arrow_array_reader/INT64/Decimal128Array/plain encoded, optional, no NULLs                                 1.00    115.7±1.55µs        ? ?/sec    1.00    115.4±1.66µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, mandatory, no NULLs                                      1.01    102.1±0.21µs        ? ?/sec    1.00    101.1±0.22µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, optional, half NULLs                                     1.14    117.8±0.30µs        ? ?/sec    1.00    103.5±0.31µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed skip, optional, no NULLs                                       1.01    105.6±0.24µs        ? ?/sec    1.00    104.2±0.58µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, mandatory, no NULLs                                           1.01    139.5±0.29µs        ? ?/sec    1.00    137.4±1.31µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, optional, half NULLs                                          1.15    194.3±0.71µs        ? ?/sec    1.00    168.5±2.16µs        ? ?/sec
arrow_array_reader/Int16Array/binary packed, optional, no NULLs                                            1.02    144.8±0.33µs        ? ?/sec    1.00    142.0±1.28µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, mandatory, no NULLs                               1.00     42.5±0.26µs        ? ?/sec    1.04     44.2±0.32µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, optional, half NULLs                              1.21    143.2±0.50µs        ? ?/sec    1.00    118.0±0.46µs        ? ?/sec
arrow_array_reader/Int16Array/byte_stream_split encoded, optional, no NULLs                                1.00     47.7±0.10µs        ? ?/sec    1.03     49.1±0.21µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, mandatory, no NULLs                                      1.00    102.7±0.31µs        ? ?/sec    1.07    110.3±0.30µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, optional, half NULLs                                     1.14    177.3±0.34µs        ? ?/sec    1.00    155.5±0.82µs        ? ?/sec
arrow_array_reader/Int16Array/dictionary encoded, optional, no NULLs                                       1.00    108.2±0.24µs        ? ?/sec    1.07    115.8±0.87µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, mandatory, no NULLs                                           1.01     38.5±0.12µs        ? ?/sec    1.00     38.2±0.14µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, optional, half NULLs                                          1.22    142.1±0.35µs        ? ?/sec    1.00    116.0±0.41µs        ? ?/sec
arrow_array_reader/Int16Array/plain encoded, optional, no NULLs                                            1.01     44.0±0.18µs        ? ?/sec    1.00     43.7±0.13µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, mandatory, no NULLs                                      1.01     98.5±0.30µs        ? ?/sec    1.00     97.3±0.17µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, optional, half NULLs                                     1.15    111.4±0.31µs        ? ?/sec    1.00     96.6±0.70µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed skip, optional, no NULLs                                       1.01    101.3±0.19µs        ? ?/sec    1.00    100.7±0.20µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, mandatory, no NULLs                                           1.02    128.6±0.19µs        ? ?/sec    1.00    126.3±0.42µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, optional, half NULLs                                          1.16    175.9±1.18µs        ? ?/sec    1.00    151.9±0.43µs        ? ?/sec
arrow_array_reader/Int32Array/binary packed, optional, no NULLs                                            1.00    130.9±0.42µs        ? ?/sec    1.00    131.3±0.34µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, mandatory, no NULLs                               1.05     25.8±0.31µs        ? ?/sec    1.00     24.5±0.20µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, optional, half NULLs                              1.25    127.1±0.73µs        ? ?/sec    1.00    101.7±0.59µs        ? ?/sec
arrow_array_reader/Int32Array/byte_stream_split encoded, optional, no NULLs                                1.04     31.1±0.27µs        ? ?/sec    1.00     30.0±0.44µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, mandatory, no NULLs                                      1.00     83.6±0.33µs        ? ?/sec    1.09     91.2±0.28µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, optional, half NULLs                                     1.13    156.0±0.29µs        ? ?/sec    1.00    137.5±0.39µs        ? ?/sec
arrow_array_reader/Int32Array/dictionary encoded, optional, no NULLs                                       1.00     89.1±0.20µs        ? ?/sec    1.09     97.0±0.65µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, mandatory, no NULLs                                           1.02     18.0±0.43µs        ? ?/sec    1.00     17.7±0.51µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, optional, half NULLs                                          1.27    122.7±0.52µs        ? ?/sec    1.00     96.2±0.21µs        ? ?/sec
arrow_array_reader/Int32Array/plain encoded, optional, no NULLs                                            1.01     25.9±0.45µs        ? ?/sec    1.00     25.6±0.87µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, mandatory, no NULLs                                      1.00     85.7±0.23µs        ? ?/sec    1.00     86.0±0.27µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, optional, half NULLs                                     1.00     93.8±0.90µs        ? ?/sec    1.13    106.0±0.42µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed skip, optional, no NULLs                                       1.00     88.8±0.86µs        ? ?/sec    1.00     89.1±0.37µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, mandatory, no NULLs                                           1.00    117.5±1.43µs        ? ?/sec    1.00    117.2±0.44µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, optional, half NULLs                                          1.00    149.7±1.76µs        ? ?/sec    1.17    174.8±0.37µs        ? ?/sec
arrow_array_reader/Int64Array/binary packed, optional, no NULLs                                            1.02    122.7±0.63µs        ? ?/sec    1.00    120.5±0.42µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, mandatory, no NULLs                               1.01    149.5±0.77µs        ? ?/sec    1.00    148.5±0.50µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, optional, half NULLs                              1.00    170.3±0.59µs        ? ?/sec    1.15    196.4±0.60µs        ? ?/sec
arrow_array_reader/Int64Array/byte_stream_split encoded, optional, no NULLs                                1.00    154.6±0.44µs        ? ?/sec    1.00    154.5±0.89µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, mandatory, no NULLs                                      1.00     90.7±0.39µs        ? ?/sec    1.09     98.7±0.52µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, optional, half NULLs                                     1.00    138.0±1.48µs        ? ?/sec    1.21    167.0±0.54µs        ? ?/sec
arrow_array_reader/Int64Array/dictionary encoded, optional, no NULLs                                       1.00     97.0±0.69µs        ? ?/sec    1.07    104.1±0.48µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, mandatory, no NULLs                                           1.00     43.0±0.70µs        ? ?/sec    1.05     45.2±2.20µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, optional, half NULLs                                          1.00    112.4±1.30µs        ? ?/sec    1.22    137.3±0.54µs        ? ?/sec
arrow_array_reader/Int64Array/plain encoded, optional, no NULLs                                            1.00     49.1±0.66µs        ? ?/sec    1.05     51.5±2.74µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, mandatory, no NULLs                                       1.02     98.9±0.24µs        ? ?/sec    1.00     96.9±0.29µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, optional, half NULLs                                      1.15    114.0±0.17µs        ? ?/sec    1.00     99.3±0.18µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed skip, optional, no NULLs                                        1.02    102.1±0.19µs        ? ?/sec    1.00     99.9±0.40µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, mandatory, no NULLs                                            1.01    130.5±0.72µs        ? ?/sec    1.00    128.9±0.36µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, optional, half NULLs                                           1.16    184.9±0.64µs        ? ?/sec    1.00    159.8±0.61µs        ? ?/sec
arrow_array_reader/Int8Array/binary packed, optional, no NULLs                                             1.00    135.1±1.03µs        ? ?/sec    1.00    134.8±0.57µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, mandatory, no NULLs                                1.06     36.6±0.13µs        ? ?/sec    1.00     34.3±0.08µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, optional, half NULLs                               1.21    137.3±0.65µs        ? ?/sec    1.00    113.1±0.30µs        ? ?/sec
arrow_array_reader/Int8Array/byte_stream_split encoded, optional, no NULLs                                 1.00     41.6±0.11µs        ? ?/sec    1.00     41.6±0.10µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, mandatory, no NULLs                                       1.00     95.3±0.29µs        ? ?/sec    1.07    102.4±0.89µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, optional, half NULLs                                      1.14    170.2±0.84µs        ? ?/sec    1.00    148.7±0.42µs        ? ?/sec
arrow_array_reader/Int8Array/dictionary encoded, optional, no NULLs                                        1.00    101.1±0.28µs        ? ?/sec    1.07    107.8±0.28µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, mandatory, no NULLs                                            1.00     30.5±0.13µs        ? ?/sec    1.01     30.7±0.17µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, optional, half NULLs                                           1.23    134.3±0.75µs        ? ?/sec    1.00    109.6±0.50µs        ? ?/sec
arrow_array_reader/Int8Array/plain encoded, optional, no NULLs                                             1.00     36.1±0.13µs        ? ?/sec    1.00     36.1±0.18µs        ? ?/sec
arrow_array_reader/ListArray/plain encoded optional strings half NULLs                                     1.00      7.0±0.02ms        ? ?/sec    1.02      7.1±0.04ms        ? ?/sec
arrow_array_reader/ListArray/plain encoded optional strings no NULLs                                       1.00     12.8±0.09ms        ? ?/sec    1.04     13.3±0.18ms        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, mandatory, no NULLs                                     1.05    506.2±3.03µs        ? ?/sec    1.00    483.7±4.49µs        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, optional, half NULLs                                    1.00    657.7±2.06µs        ? ?/sec    1.03    676.4±6.01µs        ? ?/sec
arrow_array_reader/StringArray/dictionary encoded, optional, no NULLs                                      1.01    505.5±4.10µs        ? ?/sec    1.00    498.2±2.68µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, mandatory, no NULLs                                          1.07    735.7±5.14µs        ? ?/sec    1.00    686.6±3.75µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, optional, half NULLs                                         1.02    800.6±3.27µs        ? ?/sec    1.00    784.2±3.19µs        ? ?/sec
arrow_array_reader/StringArray/plain encoded, optional, no NULLs                                           1.06    741.7±4.61µs        ? ?/sec    1.00    699.1±3.62µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, mandatory, no NULLs                                1.02    301.0±1.47µs        ? ?/sec    1.00    296.4±1.47µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, optional, half NULLs                               1.07    385.6±6.15µs        ? ?/sec    1.00    358.7±5.47µs        ? ?/sec
arrow_array_reader/StringDictionary/dictionary encoded, optional, no NULLs                                 1.01    306.8±4.70µs        ? ?/sec    1.00    302.4±1.34µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, mandatory, no NULLs                                 1.00    229.3±4.89µs        ? ?/sec    1.20    274.1±2.95µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, optional, half NULLs                                1.00    214.2±0.62µs        ? ?/sec    1.24    266.2±1.71µs        ? ?/sec
arrow_array_reader/StringViewArray/dictionary encoded, optional, no NULLs                                  1.00    233.3±2.41µs        ? ?/sec    1.20    279.1±2.72µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, mandatory, no NULLs                                      1.00    462.7±6.19µs        ? ?/sec    1.04    479.3±1.83µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, optional, half NULLs                                     1.00    337.0±1.23µs        ? ?/sec    1.10    371.2±1.76µs        ? ?/sec
arrow_array_reader/StringViewArray/plain encoded, optional, no NULLs                                       1.00    470.2±2.72µs        ? ?/sec    1.04    488.4±1.63µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, mandatory, no NULLs                                     1.00    109.6±2.85µs        ? ?/sec    1.07    116.8±0.24µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, optional, half NULLs                                    1.09    121.1±0.33µs        ? ?/sec    1.00    111.2±0.32µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed skip, optional, no NULLs                                      1.00    112.6±0.38µs        ? ?/sec    1.06    119.8±0.53µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, mandatory, no NULLs                                          1.00    143.9±0.74µs        ? ?/sec    1.09    156.3±0.51µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, optional, half NULLs                                         1.12    198.0±2.59µs        ? ?/sec    1.00    177.4±0.38µs        ? ?/sec
arrow_array_reader/UInt16Array/binary packed, optional, no NULLs                                           1.00    148.1±0.41µs        ? ?/sec    1.09    161.7±0.49µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, mandatory, no NULLs                              1.00     42.5±0.39µs        ? ?/sec    1.04     44.1±0.20µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, optional, half NULLs                             1.23    146.9±2.60µs        ? ?/sec    1.00    119.2±0.27µs        ? ?/sec
arrow_array_reader/UInt16Array/byte_stream_split encoded, optional, no NULLs                               1.00     47.6±0.14µs        ? ?/sec    1.03     49.0±0.13µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, mandatory, no NULLs                                     1.00    103.0±0.21µs        ? ?/sec    1.07    110.1±0.24µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, optional, half NULLs                                    1.14    176.9±0.47µs        ? ?/sec    1.00    154.6±0.30µs        ? ?/sec
arrow_array_reader/UInt16Array/dictionary encoded, optional, no NULLs                                      1.00    107.9±0.35µs        ? ?/sec    1.07    115.7±0.38µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, mandatory, no NULLs                                          1.01     38.6±0.17µs        ? ?/sec    1.00     38.2±0.10µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, optional, half NULLs                                         1.22    141.8±0.45µs        ? ?/sec    1.00    116.3±0.36µs        ? ?/sec
arrow_array_reader/UInt16Array/plain encoded, optional, no NULLs                                           1.00     43.8±0.12µs        ? ?/sec    1.00     43.8±0.14µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, mandatory, no NULLs                                     1.02    100.9±0.22µs        ? ?/sec    1.00     98.5±0.32µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, optional, half NULLs                                    1.15    112.6±0.34µs        ? ?/sec    1.00     97.5±0.31µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed skip, optional, no NULLs                                      1.03    104.4±0.26µs        ? ?/sec    1.00    101.4±0.34µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, mandatory, no NULLs                                          1.01    128.7±0.36µs        ? ?/sec    1.00    127.4±0.30µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, optional, half NULLs                                         1.19    181.8±0.85µs        ? ?/sec    1.00    152.7±1.58µs        ? ?/sec
arrow_array_reader/UInt32Array/binary packed, optional, no NULLs                                           1.01    133.8±0.46µs        ? ?/sec    1.00    132.7±1.95µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, mandatory, no NULLs                              1.08     27.0±0.40µs        ? ?/sec    1.00     25.0±0.49µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, optional, half NULLs                             1.27    125.3±0.33µs        ? ?/sec    1.00     98.6±0.30µs        ? ?/sec
arrow_array_reader/UInt32Array/byte_stream_split encoded, optional, no NULLs                               1.06     31.8±0.35µs        ? ?/sec    1.00     30.1±0.40µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, mandatory, no NULLs                                     1.00     86.3±0.39µs        ? ?/sec    1.07     92.7±0.38µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, optional, half NULLs                                    1.16    159.3±0.86µs        ? ?/sec    1.00    136.9±0.42µs        ? ?/sec
arrow_array_reader/UInt32Array/dictionary encoded, optional, no NULLs                                      1.00     91.0±0.38µs        ? ?/sec    1.08     98.2±0.35µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, mandatory, no NULLs                                          1.00     21.0±0.58µs        ? ?/sec    1.01     21.2±0.69µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, optional, half NULLs                                         1.27    123.4±0.41µs        ? ?/sec    1.00     96.9±0.44µs        ? ?/sec
arrow_array_reader/UInt32Array/plain encoded, optional, no NULLs                                           1.00     26.8±0.83µs        ? ?/sec    1.02     27.2±0.92µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, mandatory, no NULLs                                     1.00     85.8±0.28µs        ? ?/sec    1.00     86.1±0.25µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, optional, half NULLs                                    1.00     93.8±0.22µs        ? ?/sec    1.13    105.9±0.35µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed skip, optional, no NULLs                                      1.00     89.0±0.32µs        ? ?/sec    1.00     88.7±0.27µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, mandatory, no NULLs                                          1.00    116.6±0.51µs        ? ?/sec    1.02    118.9±0.48µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, optional, half NULLs                                         1.00    158.1±0.82µs        ? ?/sec    1.15    181.9±0.65µs        ? ?/sec
arrow_array_reader/UInt64Array/binary packed, optional, no NULLs                                           1.00    122.8±0.53µs        ? ?/sec    1.00    123.2±0.55µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, mandatory, no NULLs                              1.01    149.6±0.69µs        ? ?/sec    1.00    148.1±1.10µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, optional, half NULLs                             1.00    169.8±1.23µs        ? ?/sec    1.15    195.9±1.84µs        ? ?/sec
arrow_array_reader/UInt64Array/byte_stream_split encoded, optional, no NULLs                               1.01    155.7±0.46µs        ? ?/sec    1.00    153.8±0.51µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, mandatory, no NULLs                                     1.00     91.4±0.37µs        ? ?/sec    1.07     98.0±0.51µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, optional, half NULLs                                    1.00    138.7±1.24µs        ? ?/sec    1.20    166.6±0.54µs        ? ?/sec
arrow_array_reader/UInt64Array/dictionary encoded, optional, no NULLs                                      1.00     96.7±0.42µs        ? ?/sec    1.07    103.8±0.87µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, mandatory, no NULLs                                          1.00     41.6±0.89µs        ? ?/sec    1.11     46.4±1.77µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, optional, half NULLs                                         1.00    113.8±0.33µs        ? ?/sec    1.22    138.6±0.71µs        ? ?/sec
arrow_array_reader/UInt64Array/plain encoded, optional, no NULLs                                           1.00     48.4±0.62µs        ? ?/sec    1.12     54.4±2.27µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, mandatory, no NULLs                                      1.04    106.1±0.91µs        ? ?/sec    1.00    102.3±0.23µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, optional, half NULLs                                     1.15    118.0±1.14µs        ? ?/sec    1.00    102.2±0.32µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed skip, optional, no NULLs                                       1.04    109.4±0.92µs        ? ?/sec    1.00    105.3±0.27µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, mandatory, no NULLs                                           1.01    137.6±1.41µs        ? ?/sec    1.00    136.7±0.36µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, optional, half NULLs                                          1.16    189.3±1.94µs        ? ?/sec    1.00    163.6±0.32µs        ? ?/sec
arrow_array_reader/UInt8Array/binary packed, optional, no NULLs                                            1.00    142.7±1.37µs        ? ?/sec    1.00    142.2±1.46µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, mandatory, no NULLs                               1.00     36.5±0.31µs        ? ?/sec    1.00     36.3±0.11µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, optional, half NULLs                              1.22    135.2±0.37µs        ? ?/sec    1.00    111.1±0.41µs        ? ?/sec
arrow_array_reader/UInt8Array/byte_stream_split encoded, optional, no NULLs                                1.00     41.3±0.32µs        ? ?/sec    1.00     41.3±0.12µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, mandatory, no NULLs                                      1.00     95.2±0.83µs        ? ?/sec    1.07    101.8±0.25µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, optional, half NULLs                                     1.15    169.5±0.50µs        ? ?/sec    1.00    147.0±0.40µs        ? ?/sec
arrow_array_reader/UInt8Array/dictionary encoded, optional, no NULLs                                       1.00    100.7±0.21µs        ? ?/sec    1.07    107.8±0.47µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, mandatory, no NULLs                                           1.02     30.8±0.10µs        ? ?/sec    1.00     30.2±0.12µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, optional, half NULLs                                          1.23    134.4±0.41µs        ? ?/sec    1.00    109.0±0.37µs        ? ?/sec
arrow_array_reader/UInt8Array/plain encoded, optional, no NULLs                                            1.00     36.2±0.12µs        ? ?/sec    1.00     36.3±0.16µs        ? ?/sec

alamb · 2025-10-30T10:49:26Z

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1017-gcp #18~24.04.1-Ubuntu SMP Tue Sep 23 17:51:44 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing rowselectionempty (14647e1) to 5744743 diff
BENCH_NAME=arrow_reader_clickbench
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_clickbench
BENCH_FILTER=
BENCH_BRANCH_NAME=rowselectionempty
Results will be posted here when complete

alamb · 2025-10-30T11:15:07Z

🤖: Benchmark completed

Details

group                                main                                   rowselectionempty
-----                                ----                                   -----------------
arrow_reader_clickbench/async/Q1     1.00      2.3±0.01ms        ? ?/sec    1.00      2.4±0.03ms        ? ?/sec
arrow_reader_clickbench/async/Q10    1.00     13.2±0.46ms        ? ?/sec    1.02     13.5±0.37ms        ? ?/sec
arrow_reader_clickbench/async/Q11    1.00     14.8±0.28ms        ? ?/sec    1.04     15.4±0.42ms        ? ?/sec
arrow_reader_clickbench/async/Q12    1.02     28.0±0.25ms        ? ?/sec    1.00     27.4±0.41ms        ? ?/sec
arrow_reader_clickbench/async/Q13    1.19     39.1±0.24ms        ? ?/sec    1.00     32.8±0.35ms        ? ?/sec
arrow_reader_clickbench/async/Q14    1.22     37.0±0.30ms        ? ?/sec    1.00     30.3±0.62ms        ? ?/sec
arrow_reader_clickbench/async/Q19    1.00      5.6±0.10ms        ? ?/sec    1.00      5.6±0.11ms        ? ?/sec
arrow_reader_clickbench/async/Q20    1.00   133.9±14.45ms        ? ?/sec    1.24   165.7±11.71ms        ? ?/sec
arrow_reader_clickbench/async/Q21    1.00   153.7±15.94ms        ? ?/sec    1.22   187.2±19.78ms        ? ?/sec
arrow_reader_clickbench/async/Q22    1.00   276.6±19.55ms        ? ?/sec    1.18   327.1±32.72ms        ? ?/sec
arrow_reader_clickbench/async/Q23    1.00    433.1±8.95ms        ? ?/sec    1.00    433.2±2.25ms        ? ?/sec
arrow_reader_clickbench/async/Q24    1.19     44.0±0.51ms        ? ?/sec    1.00     36.9±0.47ms        ? ?/sec
arrow_reader_clickbench/async/Q27    1.00    105.0±0.78ms        ? ?/sec    1.02    107.0±0.55ms        ? ?/sec
arrow_reader_clickbench/async/Q28    1.00    105.2±0.54ms        ? ?/sec    1.03    108.1±0.52ms        ? ?/sec
arrow_reader_clickbench/async/Q30    1.63     53.8±0.43ms        ? ?/sec    1.00     33.1±0.32ms        ? ?/sec
arrow_reader_clickbench/async/Q36    1.02    125.1±0.45ms        ? ?/sec    1.00    122.3±0.64ms        ? ?/sec
arrow_reader_clickbench/async/Q37    1.04     98.8±0.57ms        ? ?/sec    1.00     95.3±0.61ms        ? ?/sec
arrow_reader_clickbench/async/Q38    1.00     37.2±0.28ms        ? ?/sec    1.05     39.0±0.26ms        ? ?/sec
arrow_reader_clickbench/async/Q39    1.00     48.2±0.33ms        ? ?/sec    1.05     50.4±0.54ms        ? ?/sec
arrow_reader_clickbench/async/Q40    1.45     45.7±1.20ms        ? ?/sec    1.00     31.5±0.43ms        ? ?/sec
arrow_reader_clickbench/async/Q41    1.39     36.4±0.68ms        ? ?/sec    1.00     26.1±0.36ms        ? ?/sec
arrow_reader_clickbench/async/Q42    1.12     13.7±0.17ms        ? ?/sec    1.00     12.3±0.18ms        ? ?/sec
arrow_reader_clickbench/sync/Q1      1.00      2.1±0.01ms        ? ?/sec    1.01      2.1±0.01ms        ? ?/sec
arrow_reader_clickbench/sync/Q10     1.01      9.5±0.10ms        ? ?/sec    1.00      9.4±0.06ms        ? ?/sec
arrow_reader_clickbench/sync/Q11     1.00     11.0±0.12ms        ? ?/sec    1.01     11.1±0.06ms        ? ?/sec
arrow_reader_clickbench/sync/Q12     1.06     38.8±0.27ms        ? ?/sec    1.00     36.7±2.59ms        ? ?/sec
arrow_reader_clickbench/sync/Q13     1.02     49.9±0.29ms        ? ?/sec    1.00     48.7±0.47ms        ? ?/sec
arrow_reader_clickbench/sync/Q14     1.08     47.8±0.27ms        ? ?/sec    1.00     44.4±2.40ms        ? ?/sec
arrow_reader_clickbench/sync/Q19     1.01      4.3±0.03ms        ? ?/sec    1.00      4.2±0.02ms        ? ?/sec
arrow_reader_clickbench/sync/Q20     1.00    178.4±0.91ms        ? ?/sec    1.01    181.0±0.82ms        ? ?/sec
arrow_reader_clickbench/sync/Q21     1.01    242.7±1.06ms        ? ?/sec    1.00    239.6±1.94ms        ? ?/sec
arrow_reader_clickbench/sync/Q22     1.00    484.3±3.98ms        ? ?/sec    1.01    489.9±3.94ms        ? ?/sec
arrow_reader_clickbench/sync/Q23     1.00   440.8±15.88ms        ? ?/sec    1.00   441.8±14.12ms        ? ?/sec
arrow_reader_clickbench/sync/Q24     1.10     51.6±0.80ms        ? ?/sec    1.00     46.8±0.42ms        ? ?/sec
arrow_reader_clickbench/sync/Q27     1.00    155.5±0.99ms        ? ?/sec    1.03    159.6±0.98ms        ? ?/sec
arrow_reader_clickbench/sync/Q28     1.00    151.5±0.72ms        ? ?/sec    1.03    156.2±1.49ms        ? ?/sec
arrow_reader_clickbench/sync/Q30     1.66     52.1±0.36ms        ? ?/sec    1.00     31.4±0.39ms        ? ?/sec
arrow_reader_clickbench/sync/Q36     1.00    155.1±1.31ms        ? ?/sec    1.00    154.8±1.54ms        ? ?/sec
arrow_reader_clickbench/sync/Q37     1.06     90.0±0.37ms        ? ?/sec    1.00     85.1±0.76ms        ? ?/sec
arrow_reader_clickbench/sync/Q38     1.00     30.0±0.17ms        ? ?/sec    1.01     30.3±0.26ms        ? ?/sec
arrow_reader_clickbench/sync/Q39     1.00     34.7±0.35ms        ? ?/sec    1.02     35.2±0.51ms        ? ?/sec
arrow_reader_clickbench/sync/Q40     1.64     44.1±0.40ms        ? ?/sec    1.00     26.9±0.42ms        ? ?/sec
arrow_reader_clickbench/sync/Q41     1.47     33.3±0.34ms        ? ?/sec    1.00     22.6±0.36ms        ? ?/sec
arrow_reader_clickbench/sync/Q42     1.15     12.8±0.14ms        ? ?/sec    1.00     11.1±0.11ms        ? ?/sec

alamb · 2025-10-30T11:15:11Z

🤖 ./gh_compare_arrow.sh Benchmark Script Running
Linux aal-dev 6.14.0-1017-gcp #18~24.04.1-Ubuntu SMP Tue Sep 23 17:51:44 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing rowselectionempty (14647e1) to 5744743 diff
BENCH_NAME=arrow_reader_row_filter
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental --bench arrow_reader_row_filter
BENCH_FILTER=
BENCH_BRANCH_NAME=rowselectionempty
Results will be posted here when complete

alamb · 2025-10-30T11:28:43Z

🤖: Benchmark completed

Details

group                                                                                main                                   rowselectionempty
-----                                                                                ----                                   -----------------
arrow_reader_row_filter/float64 <= 99.0/all_columns/async                            1.00  1720.5±11.69µs        ? ?/sec    1.01  1739.0±13.11µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/all_columns/sync                             1.00      2.0±0.02ms        ? ?/sec    1.00      2.0±0.01ms        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/async                  1.00   1557.1±7.68µs        ? ?/sec    1.02   1586.2±7.77µs        ? ?/sec
arrow_reader_row_filter/float64 <= 99.0/exclude_filter_column/sync                   1.00   1656.7±8.38µs        ? ?/sec    1.01  1672.2±14.29µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/async              1.00   1519.3±8.50µs        ? ?/sec    1.00  1524.9±11.77µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/all_columns/sync               1.00  1870.1±14.00µs        ? ?/sec    1.00  1860.9±12.07µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/async    1.00   1350.6±4.84µs        ? ?/sec    1.00  1356.5±10.51µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0 AND ts >= 9000/exclude_filter_column/sync     1.00   1450.9±6.97µs        ? ?/sec    1.01  1467.6±12.90µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/async                             1.00   1693.8±7.84µs        ? ?/sec    1.03  1745.4±12.91µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/all_columns/sync                              1.00  1976.9±14.72µs        ? ?/sec    1.01      2.0±0.02ms        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/async                   1.00   1550.6±5.40µs        ? ?/sec    1.02  1583.6±10.16µs        ? ?/sec
arrow_reader_row_filter/float64 > 99.0/exclude_filter_column/sync                    1.00  1634.7±18.11µs        ? ?/sec    1.02  1663.9±11.89µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/async                              1.00    935.4±4.28µs        ? ?/sec    1.01    945.1±5.97µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/all_columns/sync                               1.00    988.9±5.01µs        ? ?/sec    1.00    993.7±9.68µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/async                    1.00    865.3±3.69µs        ? ?/sec    1.01    870.3±9.16µs        ? ?/sec
arrow_reader_row_filter/int64 == 9999/exclude_filter_column/sync                     1.01    984.7±4.82µs        ? ?/sec    1.00    978.6±6.66µs        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/async                                 1.47      4.1±0.02ms        ? ?/sec    1.00      2.8±0.02ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/all_columns/sync                                  1.57      4.1±0.01ms        ? ?/sec    1.00      2.6±0.02ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/async                       1.34      3.6±0.01ms        ? ?/sec    1.00      2.7±0.02ms        ? ?/sec
arrow_reader_row_filter/int64 > 90/exclude_filter_column/sync                        1.43      3.5±0.01ms        ? ?/sec    1.00      2.4±0.03ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/async                                  1.00   1917.4±8.96µs        ? ?/sec    1.01  1945.3±16.00µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/all_columns/sync                                   1.00      2.2±0.01ms        ? ?/sec    1.00      2.2±0.02ms        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/async                        1.00   1752.7±8.89µs        ? ?/sec    1.00  1756.5±10.41µs        ? ?/sec
arrow_reader_row_filter/ts < 9000/exclude_filter_column/sync                         1.00   1885.0±8.43µs        ? ?/sec    1.00  1888.4±14.49µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/async                                 1.01   1254.9±4.74µs        ? ?/sec    1.00  1243.6±11.57µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/all_columns/sync                                  1.00   1383.5±5.02µs        ? ?/sec    1.00  1387.9±10.40µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/async                       1.00  1134.1±10.68µs        ? ?/sec    1.00   1134.0±7.09µs        ? ?/sec
arrow_reader_row_filter/ts >= 9000/exclude_filter_column/sync                        1.00   1264.6±7.34µs        ? ?/sec    1.00   1261.6±9.84µs        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/async                             1.32      4.2±0.02ms        ? ?/sec    1.00      3.2±0.48ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/all_columns/sync                              1.35      4.9±0.02ms        ? ?/sec    1.00      3.6±0.06ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/async                   1.35      3.5±0.02ms        ? ?/sec    1.00      2.6±0.02ms        ? ?/sec
arrow_reader_row_filter/utf8View <> ''/exclude_filter_column/sync                    1.37      3.4±0.01ms        ? ?/sec    1.00      2.5±0.03ms        ? ?/sec

hhhizzz · 2025-10-30T14:55:06Z

To show how I arrived at the average selector length below which the mask is used, here are some statistics. You can check out https://github.com/hhhizzz/arrow-rs/tree/rowselectionempty-charts and run python3 dev/row_selection_analysis.py on your local machine. These are the results on my x86 PC:

One column `int32`, different distribution type:

Different column type:

Different column counts:

alamb

First of all, thank you so much @hhhizzz -- I think this is a really nice change, and the code is well structured and a pleasure to read. Also thank you to @zhuqi-lucas for setting the stage for much of this work.

Given the performance results so far (basically as good as or better than the existing code), I think this PR is almost ready to go.

The only thing I am not sure about is the null page / skipping thing -- I left more comments inline

I think there are several additional improvements that could be done as follow on work:

The heuristic for when to use the masking strategy can likely be improved based on the types of values being filtered (for example the number of columns or the inclusion of StringView)
Avoid creating RowSelection just to turn it back to a BooleanArray (I left comments inline)
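
For readers following along, the average-selector-length heuristic mentioned above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `Run`, `Strategy`, and `choose_strategy` are hypothetical names, and the threshold is left as a parameter because the right value is exactly what the benchmarks are meant to tune.

```rust
/// Hypothetical stand-in for a row selector: a run of rows to read or skip.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Run {
    pub row_count: usize,
    pub skip: bool,
}

/// Hypothetical strategy enum, loosely modelled on the PR description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Strategy {
    Selectors,
    Mask,
}

/// Choose the mask when the average run length is below `threshold`
/// (many short alternating runs), otherwise keep the selector queue.
pub fn choose_strategy(runs: &[Run], threshold: usize) -> Strategy {
    if runs.is_empty() {
        return Strategy::Selectors;
    }
    let total_rows: usize = runs.iter().map(|r| r.row_count).sum();
    // Equivalent to `total_rows / runs.len() < threshold`, but without
    // the rounding of integer division.
    if total_rows < threshold * runs.len() {
        Strategy::Mask
    } else {
        Strategy::Selectors
    }
}
```

With an illustrative threshold of 32, a selection made of 100 runs of 2 rows each would pick the mask, while two runs of 1000 rows would stay on selectors. A type-aware heuristic, as suggested above, could make `threshold` depend on column count or on whether StringView columns are projected.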

parquet/benches/row_selection_state.rs

parquet/benches/row_selection_cursor.rs

parquet/src/arrow/arrow_reader/mod.rs

parquet/src/arrow/arrow_reader/read_plan.rs

parquet/src/column/reader.rs

parquet/src/arrow/arrow_reader/mod.rs

parquet/src/column/reader.rs

parquet/src/arrow/arrow_reader/mod.rs

tustvold

I took a quick look. Whilst I think orchestrating this skipping at the RecordReader level does have a certain elegance, it runs into the issue that the masked selections aren't necessarily page-aligned.

By definition the mask selection strategy requests rows that weren't part of the original selection. The problem is that this could result in requesting rows from pages that we know are irrelevant. In some cases this just results in wasted IO; however, when using prefetching IO systems (such as AsyncParquetReader) this results in errors. I'm not a big fan of the hack of creating empty pages.

I think a better solution would be to ensure we only construct MaskChunks that don't cross page boundaries. Ideally this would be done on a per-leaf column basis, but tbh I suspect just doing it globally would probably work just fine.

Edit: If one was feeling fancy, one could ignore page boundaries where both pages were present in the original selection, although in practice I suspect this would not make a huge difference.
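
A rough sketch of the page-aligned chunking idea, applied globally rather than per leaf column. Everything here is an assumption made for illustration: `Chunk` and `page_aligned_chunks` are hypothetical names (not the PR's `MaskChunk`), the mask is a plain `&[bool]`, and `page_starts` stands in for the first-row indexes one would read from the offset index.

```rust
/// A half-open row range `[start, end)` that lies within a single page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Chunk {
    pub start: usize,
    pub end: usize,
    /// Number of rows in the range that the mask selects.
    pub selected: usize,
}

/// Split a mask into chunks of at most `batch_size` rows that never cross
/// a page boundary. Chunks without any selected row are dropped, so a page
/// that the original selection skipped entirely is never requested.
pub fn page_aligned_chunks(mask: &[bool], page_starts: &[usize], batch_size: usize) -> Vec<Chunk> {
    assert!(batch_size > 0, "batch_size must be non-zero");
    let mut chunks = Vec::new();
    for (i, &page_start) in page_starts.iter().enumerate() {
        // A page ends where the next one starts, or at the end of the mask.
        let page_end = page_starts
            .get(i + 1)
            .copied()
            .unwrap_or(mask.len())
            .min(mask.len());
        let mut start = page_start.min(page_end);
        while start < page_end {
            let end = (start + batch_size).min(page_end);
            let selected = mask[start..end].iter().filter(|&&keep| keep).count();
            if selected > 0 {
                chunks.push(Chunk { start, end, selected });
            }
            start = end;
        }
    }
    chunks
}
```

The "fancy" variant from the edit above would merge adjacent chunks across a boundary when both neighbouring pages contain selected rows; this sketch always splits at every boundary.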

parquet/src/arrow/arrow_reader/mod.rs

alamb · 2025-11-09T12:53:11Z

The benchmarks are looking great. Thank you (again @hhhizzz)

I have been working on the DataFusion 51 release but then I will get back to this one. I expect we'll be able to get this one in this week ❤️

hhhizzz · 2025-11-09T13:50:52Z

Thank you! I think the only thing that still needs to be discussed is #8733 (comment): whether the threshold should be included in the RowSelectionStrategy. If so, the API for the builders should be updated, as in the last commit.

alamb

TLDR is thank you so much for this PR @hhhizzz -- I think it is a significant step forward

I have several ideas for how to make it better (some additional doc strings, potentially rename some things), but I think we can do them as follow on PRs (I'll make some PRs to target this branch as well so you can see what I have in mind)

alamb · 2025-11-10T15:13:07Z

parquet/src/arrow/arrow_reader/mod.rs

 use crate::schema::types::SchemaDescriptor;

 use crate::arrow::arrow_reader::metrics::ArrowReaderMetrics;
+// Exposed so integration tests and benchmarks can temporarily override the threshold.


I think there are other use cases too (i.e. I am not sure this comment is accurate anymore)

alamb · 2025-11-10T15:15:28Z

parquet/src/arrow/arrow_reader/mod.rs

-            .build();
+            .build_limited();
+
+        let preferred_strategy = plan_builder.preferred_selection_strategy();


Elsewhere in the PR you used the term "resolved" to reflect the chosen strategy -- perhaps we can do the same here

let resolved_strategy = plan_builder.resolve_selection_strategy();
plan_builder = plan_builder.with_selection_strategy(resolved_strategy);

I also found it slightly strange that the PlanBuilder has both methods -- maybe it would be simpler if build() simply resolved the strategy directly.

However, I think we can do this as a follow on PR

hhhizzz

I chose preferred_selection_strategy because it is only the option that the ReadPlan prefers; the final strategy is decided after the page offsets are taken into consideration.
Here the sync reader doesn't skip pages, so the preferred strategy is the final result.
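The preferred-versus-final distinction can be sketched roughly as follows. This is only an illustration under assumptions: the enum, the function names and the threshold value are hypothetical, and the PR's actual heuristic may differ. It assumes the preference comes from the average selector length and that the final choice falls back to selectors when the selection would skip whole pages.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Strategy {
    Mask,
    Selectors,
}

/// Hypothetical threshold: runs shorter than this on average favour a mask.
const AVG_RUN_THRESHOLD: usize = 32;

/// Step 1: the preference, based only on the shape of the selection.
/// `run_lengths` is the length of each select/skip run, in order.
fn preferred_strategy(run_lengths: &[usize]) -> Strategy {
    if run_lengths.is_empty() {
        return Strategy::Selectors;
    }
    let total: usize = run_lengths.iter().sum();
    let average = total / run_lengths.len();
    if average < AVG_RUN_THRESHOLD {
        Strategy::Mask
    } else {
        Strategy::Selectors
    }
}

/// Step 2: the final choice, once page information is known.
/// A mask asks for rows outside the selection, so it is only kept
/// when no whole page would be skipped (an assumption of this sketch).
fn resolve_strategy(preferred: Strategy, skips_whole_pages: bool) -> Strategy {
    match (preferred, skips_whole_pages) {
        (Strategy::Mask, true) => Strategy::Selectors,
        (other, _) => other,
    }
}
```

In this sketch a reader that never skips pages would always pass `false` for `skips_whole_pages`, which is why its preferred strategy is also its final one.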

alamb · 2025-11-10T15:19:48Z

parquet/src/arrow/arrow_reader/mod.rs

        }
    }

+    /// Configure how row selections should be materialised during execution


It would be nice, eventually, to change RowSelection so the callers could provide a bitmap as well.

I will think about how this might look, but it is not something to change in this PR

alamb · 2025-11-10T16:02:51Z

> Thank you! I think the only thing that still needs to be discussed is #8733 (comment): whether the threshold should be included in the RowSelectionStrategy. If so, the API for the builders should be updated, as in the last commit.

Yes, I am trying some options to see if I can combine the RowSelectionStrategy and the RowSelection together into the public API...

alamb · 2025-11-10T18:47:13Z

parquet/src/arrow/arrow_reader/read_plan.rs

 impl ReadPlan {
    /// Returns a mutable reference to the selection, if any
-    pub fn selection_mut(&mut self) -> Option<&mut VecDeque<RowSelector>> {
+    pub fn selection_mut(&mut self) -> Option<&mut RowSelectionCursor> {


🤔 it seems like this is a public API (so we can't change its signature)

https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/struct.ReadPlan.html#method.selection_mut

We can get around this by making a function like

pub fn row_selection_cursor_mut(&mut self) -> Option<&mut RowSelectionCursor> {

And then

#[deprecated(since = "57.1.0", note = "Use `row_selection_cursor_mut` instead")]
pub fn selection_mut(&mut self) -> Option<&mut VecDeque<RowSelector>> { ...
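A self-contained sketch of that compatibility pattern, using simplified stand-in types rather than the real parquet ones. The cursor variants and the choice to return `None` from the old accessor for a mask-backed plan are assumptions made for illustration only.

```rust
use std::collections::VecDeque;

// Stand-in types for illustration; the real definitions live in the parquet crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct RowSelector {
    row_count: usize,
    skip: bool,
}

enum RowSelectionCursor {
    Selectors(VecDeque<RowSelector>),
    Mask(Vec<bool>),
}

struct ReadPlan {
    cursor: Option<RowSelectionCursor>,
}

impl ReadPlan {
    /// New accessor: exposes the cursor whichever way it is backed.
    pub fn row_selection_cursor_mut(&mut self) -> Option<&mut RowSelectionCursor> {
        self.cursor.as_mut()
    }

    /// Old accessor keeps its original signature so existing callers still compile.
    /// Assumption: it can only hand out the queue when the plan is selector-backed.
    #[deprecated(since = "57.1.0", note = "Use `row_selection_cursor_mut` instead")]
    pub fn selection_mut(&mut self) -> Option<&mut VecDeque<RowSelector>> {
        match self.cursor.as_mut() {
            Some(RowSelectionCursor::Selectors(selectors)) => Some(selectors),
            _ => None,
        }
    }
}
```

Existing callers of `selection_mut` keep compiling and get a deprecation warning that points them at the new method.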

I am still playing around to see if I can make the RowSelectionCursor

I propose this change in

Rework RowSelectionCursor to use enums hhhizzz/arrow-rs#8

alamb · 2025-11-10T20:44:56Z

Here is another PR

Split RowSelectionPolicy from RowSelectionStrategy hhhizzz/arrow-rs#9

alamb · 2025-11-10T20:50:34Z

Here are some notes to myself to file PRs for follow on work:

Add a way to avoid converting from a Mask --> Selection with the result of evaluating predicates (I think this will be a natural follow on)
Update the code for data page skipping to be mask aware (and not skip pages when doing so would prevent mask evaluation)
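For context on the first note, the conversion it wants to avoid amounts to run-length encoding the boolean result of a predicate into select/skip runs. Below is a minimal sketch of such a conversion, with a hypothetical `Run` type standing in for the crate's selector type.

```rust
/// Hypothetical stand-in for a select/skip run of rows.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Run {
    row_count: usize,
    skip: bool,
}

/// Run-length encodes a boolean mask (`true` = row selected) into runs.
fn mask_to_runs(mask: &[bool]) -> Vec<Run> {
    let mut runs: Vec<Run> = Vec::new();
    for &selected in mask {
        let skip = !selected;
        // Extend the current run when the row has the same state as the last one.
        if let Some(last) = runs.last_mut() {
            if last.skip == skip {
                last.row_count += 1;
                continue;
            }
        }
        // Otherwise start a new run of length one.
        runs.push(Run { row_count: 1, skip });
    }
    runs
}
```

For a mask that alternates on every row this produces one run per row, which is the fragmented case that a mask-backed path would handle more cheaply by keeping the mask as is.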

alamb · 2025-11-11T19:08:03Z

I plan to leave this PR open for a few more days to gather any additional comments and then merge it in

I may also do some prototyping of additional optimizations (avoiding the masks and fetches entirely) for fun

Thank you again @hhhizzz -- I think this PR will finally get us past the hurdle for making predicate pushdown work well for all cases

github-actions bot added the parquet label (Changes to the parquet crate) Oct 28, 2025

This was referenced Oct 29, 2025

[Parquet] Adaptive Parquet Predicate Pushdown hhhizzz/arrow-rs#1

Closed

[Parquet] Adaptive Parquet Predicate Pushdown hhhizzz/arrow-rs#2

Closed

alamb mentioned this pull request Oct 29, 2025

WIP: Pin to Adaptive Parquet Predicate Pushdown apache/datafusion#18368

Draft

zhuqi-lucas reviewed Oct 30, 2025

parquet/src/arrow/arrow_reader/read_plan.rs (outdated, resolved)

alamb reviewed Oct 30, 2025

parquet/src/column/reader.rs (outdated, resolved)

alamb reviewed Oct 30, 2025

parquet/src/arrow/arrow_reader/mod.rs (outdated, resolved)

alamb mentioned this pull request Oct 31, 2025

[EPIC] Faster performance for parquet predicate evaluation for non selective filters #7456

Open

8 tasks

tustvold reviewed Nov 1, 2025

parquet/src/column/reader.rs (outdated, resolved)

tustvold reviewed Nov 1, 2025

parquet/src/column/reader.rs (outdated, resolved)

tustvold reviewed Nov 1, 2025

parquet/src/arrow/arrow_reader/mod.rs (outdated, resolved)

alamb mentioned this pull request Nov 1, 2025

Remove synthetic page from adaptive row selection hhhizzz/arrow-rs#5

Open

tustvold reviewed Nov 1, 2025

parquet/src/arrow/arrow_reader/mod.rs (outdated, resolved)

hhhizzz marked this pull request as draft November 3, 2025 09:53

hhhizzz force-pushed the rowselectionempty branch from ad51d87 to ed51620 Compare November 4, 2025 12:22

alamb mentioned this pull request Nov 4, 2025

Andrew Lamb Weekly-ish Open Source plan - 2025-11-03 apache/datafusion#18486

Open

52 tasks

hhhizzz added 2 commits November 5, 2025 17:00

base code

37ab45a

Try to fix the page skip issue

5e81ee4

Update the strategy API

25bcdca

hhhizzz force-pushed the rowselectionempty branch from 32c06d2 to 25bcdca Compare November 9, 2025 13:42

hhhizzz added 2 commits November 10, 2025 21:30

Merge branch 'main' into rowselectionempty

64553be

# Conflicts: # parquet/src/arrow/async_reader/mod.rs

Support the latest push_decoder

9cb9e82

alamb approved these changes Nov 10, 2025

alamb mentioned this pull request Nov 10, 2025

Release arrow-rs / parquet Minor version 57.1.0 (November 2025) #8464

Open

10 tasks

Minor: clean up of selection strategy code

8edc8b1

alamb mentioned this pull request Nov 10, 2025

Minor: row selection strategy code cleanup #8816

Closed

alamb mentioned this pull request Nov 10, 2025

Minor: clean up of selection strategy code hhhizzz/arrow-rs#6

Merged

hhhizzz and others added 2 commits November 11, 2025 01:50

Merge pull request #6 from alamb/alamb/minor_rowselection_cleanup

6b59066

Minor: clean up of selection strategy code

Another minor cleanup for the read plan builder

d27d7d7

alamb mentioned this pull request Nov 10, 2025

Another minor cleanup for the read plan builder hhhizzz/arrow-rs#7

Merged

hhhizzz and others added 2 commits November 11, 2025 02:07

Merge pull request #7 from alamb/alamb/preferred_2

0e2895d

Another minor cleanup for the read plan builder

Rework RowSelectionCursor to use enums

93fc72b

alamb mentioned this pull request Nov 10, 2025

Rework RowSelectionCursor to use enums hhhizzz/arrow-rs#8

Merged

alamb reviewed Nov 10, 2025

alamb mentioned this pull request Nov 10, 2025

General virtual columns support + row numbers as a first use-case #8715

Open

Merge pull request #8 from alamb/alamb/rework_selections

59ee569

Rework RowSelectionCursor to use enums

alamb mentioned this pull request Nov 10, 2025

Split RowSelectionPolicy from RowSelectionStrategy hhhizzz/arrow-rs#9

Merged

Split RowSelectionPolicy from RowSelectionStrategy

631cb17

Remove safe flag

c30dca7

hhhizzz added 3 commits November 11, 2025 08:57

Merge pull request #9 from alamb/alamb/add_policy

1abae95

Split RowSelectionPolicy from RowSelectionStrategy

Rename the benchmark name to cursor

e2da9fe

Some miner code clean up. Refine error message when user hit error with Mask

2566c26
