Replace ScalarBuffer in Parquet with Vec (#1849) (#5177) #5178

Merged (1 commit) on Dec 8, 2023

Conversation

@tustvold tustvold commented Dec 6, 2023

Which issue does this PR close?

Closes #1849
Part of #5177

Rationale for this change

Following #3756, it is possible to construct arrow arrays directly from a Vec without copying. This PR updates parquet to do so, which not only reduces the amount of code but also opens the door to pushing Vec into ColumnReader proper (#5177).
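
To illustrate the zero-copy idea in isolation, here is a minimal std-only sketch. It does not use the arrow crates: `SharedBuffer` and `from_vec` are hypothetical names standing in for an immutable, shareable buffer type. The point it tries to show is that moving a `Vec` into shared storage keeps the original heap allocation instead of copying the elements.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an immutable, cheaply clonable buffer.
/// This is not arrow's `Buffer`; it only illustrates the ownership transfer.
pub struct SharedBuffer<T> {
    data: Arc<Vec<T>>,
}

impl<T> SharedBuffer<T> {
    /// Takes ownership of `values`. Only the `Vec` header (pointer, length,
    /// capacity) is moved; the elements stay in their original allocation.
    pub fn from_vec(values: Vec<T>) -> Self {
        Self { data: Arc::new(values) }
    }

    /// Read-only view of the stored elements.
    pub fn as_slice(&self) -> &[T] {
        self.data.as_slice()
    }
}

impl<T> Clone for SharedBuffer<T> {
    /// Cloning bumps a reference count; the elements are not duplicated.
    fn clone(&self) -> Self {
        Self { data: Arc::clone(&self.data) }
    }
}
```

Comparing `values.as_ptr()` before the conversion with `buffer.as_slice().as_ptr()` afterwards is one way to check that no copy took place.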

What changes are included in this PR?

Are there any user-facing changes?

@github-actions github-actions bot added the parquet Changes to the parquet crate label Dec 6, 2023
@@ -339,8 +332,8 @@ where
Some(keys) => {
// Happy path - can just copy keys
// Keys will be validated on conversion to arrow
let keys_slice = keys.spare_capacity_mut(range.start + len);

@tustvold tustvold Dec 6, 2023

This was actually incorrect, but it didn't matter: spare_capacity_mut didn't update the length of the buffer directly, so this would at worst allocate more space than necessary.
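
The effect described in this comment can be reproduced with a plain `Vec`. In the sketch below, `reserve_for_batch` is a hypothetical helper that, like the old `spare_capacity_mut` as described in the comment, only reserves room and leaves the length alone. Requesting `range.start + len` slots instead of `len` then wastes capacity but does not change the contents.

```rust
/// Hypothetical helper: reserve room for `batch_size` more elements
/// without changing the vector's length.
pub fn reserve_for_batch(keys: &mut Vec<i32>, batch_size: usize) {
    keys.reserve(batch_size);
}

/// Returns (length, capacity) after reserving `requested` extra slots
/// on a fresh copy of `initial`.
pub fn len_and_capacity_after(initial: &[i32], requested: usize) -> (usize, usize) {
    let mut keys = initial.to_vec();
    reserve_for_batch(&mut keys, requested);
    (keys.len(), keys.capacity())
}
```

With three existing keys and four new ones to write, both `len_and_capacity_after(&keys, 4)` and `len_and_capacity_after(&keys, 3 + 4)` report a length of 3; only the reserved capacity differs.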

use arrow_data::ArrayDataBuilder;
use arrow_schema::{DataType as ArrowType, TimeUnit};
use std::any::Any;
use std::sync::Arc;

/// Provides conversion from `Vec<T>` to `Buffer`
pub trait IntoBuffer {

This whole module is crate-private, so this isn't a breaking change.

///
/// [scalar]: https://doc.rust-lang.org/book/ch03-02-data-types.html#scalar-types
///
pub trait ScalarValue: Copy {}

This module is crate-private, so this is not a breaking change.

- /// instead a subsequent call should be made to [`BufferQueue::set_len`]
- fn spare_capacity_mut(&mut self, batch_size: usize) -> &mut Self::Slice;
+ /// instead a subsequent call should be made to [`BufferQueue::truncate_buffer`]
+ fn get_output_slice(&mut self, batch_size: usize) -> &mut Self::Slice;

@tustvold tustvold Dec 6, 2023

I opted to rename these methods so that they don't collide with methods on Vec. Eventually the plan with #5178 is to remove the need for this trait entirely.
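
As a rough illustration of the two-step pattern the doc comments describe (obtain a writable slice, then trim to what was actually written), here is a std-only sketch for `Vec<T>`. The trait name `OutputSliceQueue` and the helper `read_batch` are hypothetical, and the method bodies are an assumption rather than the code from this PR. Only the method names `get_output_slice` and `truncate_buffer` come from the diff.

```rust
/// Hypothetical, reduced version of the trait; not the crate's definition.
pub trait OutputSliceQueue {
    type Slice: ?Sized;

    /// Returns a writable slice with room for `batch_size` more elements.
    /// The caller is expected to follow up with `truncate_buffer`.
    fn get_output_slice(&mut self, batch_size: usize) -> &mut Self::Slice;

    /// Sets the total number of valid elements to `len`.
    fn truncate_buffer(&mut self, len: usize);
}

impl<T: Copy + Default> OutputSliceQueue for Vec<T> {
    type Slice = [T];

    fn get_output_slice(&mut self, batch_size: usize) -> &mut Self::Slice {
        let start = self.len();
        // Assumption: pad with default values so the tail is initialized.
        self.resize(start + batch_size, T::default());
        &mut self[start..]
    }

    fn truncate_buffer(&mut self, len: usize) {
        assert!(len <= self.len(), "cannot truncate to a longer length");
        self.truncate(len);
    }
}

/// Writes up to `batch_size` values into `out` and returns how many were
/// written. The "decoder" here simply copies from `source`.
pub fn read_batch(out: &mut Vec<i32>, source: &[i32], batch_size: usize) -> usize {
    let start = out.len();
    let slice = out.get_output_slice(batch_size);
    let written = source.len().min(batch_size);
    slice[..written].copy_from_slice(&source[..written]);
    // Drop the unused padding so only the decoded values remain.
    out.truncate_buffer(start + written);
    written
}
```

Because these names do not exist on `Vec` itself, calling `out.get_output_slice(..)` resolves to the trait method without ambiguity, which is the collision the rename avoids.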

tustvold commented Dec 6, 2023

As an added bonus, this appears to yield some non-trivial performance improvements:

arrow_array_reader/Int32Array/plain encoded, mandatory, no NULLs
                        time:   [5.3469 µs 5.3635 µs 5.3821 µs]
                        change: [-7.8969% -5.8593% -4.0691%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) high mild
  10 (10.00%) high severe
arrow_array_reader/Int32Array/plain encoded, optional, no NULLs
                        time:   [6.3293 µs 6.3470 µs 6.3699 µs]
                        change: [-3.6542% -3.2454% -2.7967%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe
arrow_array_reader/Int32Array/plain encoded, optional, half NULLs
                        time:   [25.788 µs 25.803 µs 25.821 µs]
                        change: [-23.063% -22.806% -22.551%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe
arrow_array_reader/Int32Array/binary packed, mandatory, no NULLs
                        time:   [25.695 µs 25.710 µs 25.729 µs]
                        change: [-2.2967% -1.7858% -1.3710%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe
arrow_array_reader/Int32Array/binary packed, optional, no NULLs
                        time:   [26.824 µs 26.836 µs 26.850 µs]
                        change: [-2.0692% -1.8333% -1.6824%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe
arrow_array_reader/Int32Array/binary packed skip, mandatory, no NULLs
                        time:   [23.351 µs 23.361 µs 23.374 µs]
                        change: [+0.7233% +0.8950% +1.0302%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  2 (2.00%) low severe
  4 (4.00%) high mild
  3 (3.00%) high severe
arrow_array_reader/Int32Array/binary packed skip, optional, no NULLs
                        time:   [24.124 µs 24.134 µs 24.145 µs]
                        change: [+0.9784% +1.2249% +1.3891%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe
arrow_array_reader/Int32Array/binary packed, optional, half NULLs
                        time:   [37.076 µs 37.095 µs 37.117 µs]
                        change: [-16.959% -16.638% -16.312%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  6 (6.00%) high mild
  7 (7.00%) high severe
arrow_array_reader/Int32Array/dictionary encoded, mandatory, no NULLs
                        time:   [30.219 µs 30.237 µs 30.260 µs]
                        change: [-0.1884% +0.1227% +0.3555%] (p = 0.45 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  8 (8.00%) high mild
  1 (1.00%) high severe
arrow_array_reader/Int32Array/dictionary encoded, optional, no NULLs
                        time:   [31.470 µs 31.482 µs 31.495 µs]
                        change: [+0.5100% +0.7540% +0.9096%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe
arrow_array_reader/Int32Array/dictionary encoded, optional, half NULLs
                        time:   [39.496 µs 39.512 µs 39.531 µs]
                        change: [-17.179% -16.930% -16.677%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

arrow_array_reader/Int64Array/plain encoded, mandatory, no NULLs
                        time:   [8.7773 µs 8.8269 µs 8.8856 µs]
                        change: [-1.9864% -1.0590% -0.1504%] (p = 0.03 < 0.05)
                        Change within noise threshold.
Found 13 outliers among 100 measurements (13.00%)
  2 (2.00%) high mild
  11 (11.00%) high severe
arrow_array_reader/Int64Array/plain encoded, optional, no NULLs
                        time:   [10.101 µs 10.163 µs 10.235 µs]
                        change: [+2.2000% +3.1099% +4.1248%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 11 outliers among 100 measurements (11.00%)
  7 (7.00%) high mild
  4 (4.00%) high severe
arrow_array_reader/Int64Array/plain encoded, optional, half NULLs
                        time:   [28.492 µs 28.509 µs 28.531 µs]
                        change: [-19.647% -19.338% -19.096%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) high mild
  5 (5.00%) high severe
arrow_array_reader/Int64Array/binary packed, mandatory, no NULLs
                        time:   [24.179 µs 24.192 µs 24.208 µs]
                        change: [-1.1160% -0.8207% -0.5233%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  4 (4.00%) high severe
arrow_array_reader/Int64Array/binary packed, optional, no NULLs
                        time:   [25.760 µs 25.775 µs 25.797 µs]
                        change: [+0.7274% +1.1453% +1.4935%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe
arrow_array_reader/Int64Array/binary packed skip, mandatory, no NULLs
                        time:   [21.151 µs 21.160 µs 21.170 µs]
                        change: [-0.2043% -0.1230% -0.0440%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe
arrow_array_reader/Int64Array/binary packed skip, optional, no NULLs
                        time:   [22.037 µs 22.049 µs 22.062 µs]
                        change: [+1.0202% +1.1622% +1.4265%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe
arrow_array_reader/Int64Array/binary packed, optional, half NULLs
                        time:   [37.743 µs 37.770 µs 37.804 µs]
                        change: [-14.994% -14.719% -14.500%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe
arrow_array_reader/Int64Array/dictionary encoded, mandatory, no NULLs
                        time:   [31.698 µs 31.729 µs 31.771 µs]
                        change: [-3.9952% -3.7114% -3.4234%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  7 (7.00%) high mild
  7 (7.00%) high severe
arrow_array_reader/Int64Array/dictionary encoded, optional, no NULLs
                        time:   [33.139 µs 33.155 µs 33.176 µs]
                        change: [-3.3441% -3.1211% -2.9877%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  7 (7.00%) high mild
  2 (2.00%) high severe
arrow_array_reader/Int64Array/dictionary encoded, optional, half NULLs
                        time:   [41.159 µs 41.187 µs 41.221 µs]
                        change: [-15.261% -14.988% -14.719%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe

arrow_array_reader/StringArray/plain encoded, mandatory, no NULLs
                        time:   [165.30 µs 165.36 µs 165.43 µs]
                        change: [-13.927% -13.600% -13.290%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe
arrow_array_reader/StringArray/plain encoded, optional, no NULLs
                        time:   [168.31 µs 168.40 µs 168.51 µs]
                        change: [-12.305% -12.060% -11.765%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe
arrow_array_reader/StringArray/plain encoded, optional, half NULLs
                        time:   [206.46 µs 206.64 µs 206.89 µs]
                        change: [-7.2365% -6.9669% -6.6819%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  5 (5.00%) high mild
  6 (6.00%) high severe
arrow_array_reader/StringArray/dictionary encoded, mandatory, no NULLs
                        time:   [120.41 µs 120.54 µs 120.67 µs]
                        change: [-5.5460% -5.1399% -4.8258%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
arrow_array_reader/StringArray/dictionary encoded, optional, no NULLs
                        time:   [121.35 µs 121.51 µs 121.68 µs]
                        change: [-5.2993% -4.9421% -4.6451%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
arrow_array_reader/StringArray/dictionary encoded, optional, half NULLs
                        time:   [184.41 µs 184.50 µs 184.61 µs]
                        change: [-4.0937% -3.8438% -3.6802%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

arrow_array_reader/StringDictionary/dictionary encoded, mandatory, no NULLs
                        time:   [13.167 µs 13.173 µs 13.179 µs]
                        change: [-3.6494% -3.3819% -3.2152%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 13 outliers among 100 measurements (13.00%)
  1 (1.00%) low mild
  6 (6.00%) high mild
  6 (6.00%) high severe
arrow_array_reader/StringDictionary/dictionary encoded, optional, no NULLs
                        time:   [14.062 µs 14.093 µs 14.124 µs]
                        change: [-3.5597% -3.2495% -2.9608%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 20 outliers among 100 measurements (20.00%)
  8 (8.00%) low severe
  1 (1.00%) low mild
  5 (5.00%) high mild
  6 (6.00%) high severe
arrow_array_reader/StringDictionary/dictionary encoded, optional, half NULLs
                        time:   [32.436 µs 32.462 µs 32.491 µs]
                        change: [-22.130% -21.881% -21.657%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 14 outliers among 100 measurements (14.00%)
  6 (6.00%) high mild
  8 (8.00%) high severe

arrow_array_reader/ListArray/plain encoded optional strings no NULLs
                        time:   [3.6977 ms 3.7186 ms 3.7400 ms]
                        change: [-2.4012% -1.7459% -1.0145%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild
arrow_array_reader/ListArray/plain encoded optional strings half NULLs
                        time:   [2.1217 ms 2.1238 ms 2.1259 ms]
                        change: [+0.6408% +0.7861% +0.9330%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

arrow_array_reader/INT32/Decimal128Array/plain encoded, mandatory, no NULLs
                        time:   [116.80 µs 116.88 µs 116.96 µs]
                        change: [-0.4785% -0.2628% -0.1129%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 17 outliers among 100 measurements (17.00%)
  8 (8.00%) high mild
  9 (9.00%) high severe
arrow_array_reader/INT32/Decimal128Array/plain encoded, optional, no NULLs
                        time:   [117.79 µs 117.82 µs 117.87 µs]
                        change: [-0.0555% -0.0058% +0.0464%] (p = 0.82 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild
arrow_array_reader/INT32/Decimal128Array/plain encoded, optional, half NULLs
                        time:   [255.25 µs 255.38 µs 255.54 µs]
                        change: [-0.9133% -0.6088% -0.2878%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe
arrow_array_reader/INT32/Decimal128Array/binary packed, mandatory, no NULLs
                        time:   [137.26 µs 137.34 µs 137.45 µs]
                        change: [-0.7107% -0.4972% -0.3386%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe
arrow_array_reader/INT32/Decimal128Array/binary packed, optional, no NULLs
                        time:   [138.10 µs 138.18 µs 138.28 µs]
                        change: [-0.5837% -0.4282% -0.1553%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe
arrow_array_reader/INT32/Decimal128Array/binary packed skip, mandatory, no NULLs
                        time:   [89.339 µs 89.368 µs 89.399 µs]
                        change: [-5.0484% -3.8945% -2.7749%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe
arrow_array_reader/INT32/Decimal128Array/binary packed skip, optional, no NULLs
                        time:   [90.181 µs 90.228 µs 90.279 µs]
                        change: [-0.8941% -0.6124% -0.3072%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
  4 (4.00%) high mild
  6 (6.00%) high severe
arrow_array_reader/INT32/Decimal128Array/binary packed, optional, half NULLs
                        time:   [265.20 µs 265.32 µs 265.46 µs]
                        change: [-1.4025% -1.1627% -1.0013%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, mandatory, no NULLs
                        time:   [141.89 µs 142.00 µs 142.15 µs]
                        change: [+0.0146% +0.0844% +0.1664%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 15 outliers among 100 measurements (15.00%)
  11 (11.00%) high mild
  4 (4.00%) high severe
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, optional, no NULLs
                        time:   [142.91 µs 142.98 µs 143.06 µs]
                        change: [-0.2015% +0.0325% +0.1890%] (p = 0.81 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) high mild
  6 (6.00%) high severe
arrow_array_reader/INT32/Decimal128Array/dictionary encoded, optional, half NULLs
                        time:   [269.08 µs 269.19 µs 269.33 µs]
                        change: [-1.0417% -0.7753% -0.5496%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  3 (3.00%) high severe

arrow_array_reader/INT64/Decimal128Array/plain encoded, mandatory, no NULLs
                        time:   [119.39 µs 119.49 µs 119.62 µs]
                        change: [-0.5333% -0.4507% -0.3532%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
  4 (4.00%) high mild
  10 (10.00%) high severe
arrow_array_reader/INT64/Decimal128Array/plain encoded, optional, no NULLs
                        time:   [120.74 µs 121.07 µs 121.52 µs]
                        change: [+3.3684% +4.7770% +6.2647%] (p = 0.00 < 0.05)
                        Performance has regressed.
arrow_array_reader/INT64/Decimal128Array/plain encoded, optional, half NULLs
                        time:   [251.06 µs 251.19 µs 251.37 µs]
                        change: [-1.3163% -1.0429% -0.7463%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) high mild
  6 (6.00%) high severe
arrow_array_reader/INT64/Decimal128Array/binary packed, mandatory, no NULLs
                        time:   [134.91 µs 135.05 µs 135.19 µs]
                        change: [-0.4981% -0.2409% +0.0022%] (p = 0.04 < 0.05)
                        Change within noise threshold.
Found 18 outliers among 100 measurements (18.00%)
  9 (9.00%) high mild
  9 (9.00%) high severe
arrow_array_reader/INT64/Decimal128Array/binary packed, optional, no NULLs
                        time:   [135.89 µs 135.96 µs 136.04 µs]
                        change: [-0.1771% +0.1011% +0.3644%] (p = 0.55 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  1 (1.00%) high mild
  11 (11.00%) high severe
arrow_array_reader/INT64/Decimal128Array/binary packed skip, mandatory, no NULLs
                        time:   [86.454 µs 86.484 µs 86.518 µs]
                        change: [-9.0288% -7.7596% -6.4482%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe
arrow_array_reader/INT64/Decimal128Array/binary packed skip, optional, no NULLs
                        time:   [87.193 µs 87.239 µs 87.292 µs]
                        change: [-0.1836% -0.0735% +0.0154%] (p = 0.16 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
arrow_array_reader/INT64/Decimal128Array/binary packed, optional, half NULLs
                        time:   [260.18 µs 260.25 µs 260.34 µs]
                        change: [-1.1338% -0.8364% -0.5560%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) high mild
  6 (6.00%) high severe
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, mandatory, no NULLs
                        time:   [142.37 µs 142.47 µs 142.62 µs]
                        change: [-1.2780% -1.0573% -0.9141%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
  1 (1.00%) low mild
  6 (6.00%) high mild
  7 (7.00%) high severe
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, optional, no NULLs
                        time:   [143.35 µs 143.41 µs 143.49 µs]
                        change: [-0.9710% -0.7079% -0.5437%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe
arrow_array_reader/INT64/Decimal128Array/dictionary encoded, optional, half NULLs
                        time:   [263.88 µs 264.11 µs 264.41 µs]
                        change: [-1.3894% -1.0996% -0.7587%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
  7 (7.00%) high mild
  7 (7.00%) high severe

arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs
                        time:   [426.11 µs 426.39 µs 426.74 µs]
                        change: [-4.4502% -4.2590% -4.1212%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs
                        time:   [427.23 µs 427.51 µs 427.82 µs]
                        change: [-4.4002% -4.1971% -3.9826%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
arrow_array_reader/BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs
                        time:   [467.29 µs 467.51 µs 467.78 µs]
                        change: [-3.2500% -3.0242% -2.8062%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  6 (6.00%) high mild
  6 (6.00%) high severe

arrow_array_reader/FIXED_LENGTH_BYTE_ARRAY/Decimal128Array/plain encoded, mandatory, no NULLs
                        time:   [282.28 µs 282.46 µs 282.66 µs]
                        change: [-0.5004% -0.2419% +0.0325%] (p = 0.05 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe
arrow_array_reader/FIXED_LENGTH_BYTE_ARRAY/Decimal128Array/plain encoded, optional, no NULLs
                        time:   [283.14 µs 283.36 µs 283.63 µs]
                        change: [-0.6192% -0.4698% -0.2550%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
  2 (2.00%) high mild
  8 (8.00%) high severe
arrow_array_reader/FIXED_LENGTH_BYTE_ARRAY/Decimal128Array/plain encoded, optional, half NULLs
                        time:   [510.74 µs 511.10 µs 511.47 µs]
                        change: [+20.819% +21.067% +21.260%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high severe

@tustvold tustvold merged commit 2a213bc into apache:master Dec 8, 2023
16 checks passed
Development

Successfully merging this pull request may close these issues.

Use BufferBuilder in parquet instead of custom ScalarBuffer