Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Elide the total order wrapper for non-(float/option) types #14648

Merged
merged 7 commits into from
Feb 23, 2024

Conversation

nameexhaustion
Copy link
Collaborator

@nameexhaustion nameexhaustion commented Feb 23, 2024

This should be a slight perf improvement for hashing primitive types, but more importantly will revert the change in the hash output values for integer types introduced by #14617 (see #14617 (comment)).

Benchmark
running 1 test
n_iterations: 70000000
(10 as i32) hash_one(TotalOrdWrap(x)) vs hash_one(x.to_total_ord()): 2449ms vs 2157ms
(10 as i32) hash_one(TotalOrdWrap(x)) vs hash_one(x.to_total_ord()): 2411ms vs 2151ms
(10 as i32) hash_one(TotalOrdWrap(x)) vs hash_one(x.to_total_ord()): 2400ms vs 2122ms
(10 as i32) hash_one(TotalOrdWrap(x)) vs hash_one(x.to_total_ord()): 2437ms vs 2145ms
(10 as i32) hash_one(TotalOrdWrap(x)) vs hash_one(x.to_total_ord()): 2440ms vs 2166ms
(Some(10) as Option<i32>) hash_one(x) vs hash_one(x.to_total_ord()): 2916ms vs 3357ms
(Some(10) as Option<i32>) hash_one(x) vs hash_one(x.to_total_ord()): 3051ms vs 3350ms
(Some(10) as Option<i32>) hash_one(x) vs hash_one(x.to_total_ord()): 2921ms vs 3279ms
(Some(10) as Option<i32>) hash_one(x) vs hash_one(x.to_total_ord()): 2961ms vs 3262ms
(Some(10) as Option<i32>) hash_one(x) vs hash_one(x.to_total_ord()): 2989ms vs 3253ms
(None as Option<i32>) hash_one(x) vs hash_one(x.to_total_ord()): 2399ms vs 2584ms
test total_ord::test::test has been running for over 60 seconds
(None as Option<i32>) hash_one(x) vs hash_one(x.to_total_ord()): 2432ms vs 2665ms
(None as Option<i32>) hash_one(x) vs hash_one(x.to_total_ord()): 2403ms vs 2665ms
(None as Option<i32>) hash_one(x) vs hash_one(x.to_total_ord()): 2425ms vs 2578ms
(None as Option<i32>) hash_one(x) vs hash_one(x.to_total_ord()): 2411ms vs 2681ms
mod test {
    #[test]
    fn test() {
        use std::time::SystemTime;

        use super::*;
        let rs = ahash::RandomState::new();

        let n_iterations = 70_000_000;

        println!("n_iterations: {}", n_iterations);

        fn time_it<F: Fn() -> T, T>(f: F, n: u32) -> u128 {
            let start = SystemTime::now();
            for _ in 0..n {
                f();
            }
            let end = SystemTime::now();
            let duration = end.duration_since(start).unwrap();
            return duration.as_millis();
        }

        let x: i32 = 10;

        for _ in 0..5 {
            let hash_one_wrap_t = time_it(|| rs.clone().hash_one(TotalOrdWrap(x)), n_iterations);
            let hash_one_to_tot_t = time_it(|| rs.clone().hash_one(x.to_total_ord()), n_iterations);
            println!(
                "(10 as i32) hash_one(TotalOrdWrap(x)) vs hash_one(x.to_total_ord()): {}ms vs {}ms",
                hash_one_wrap_t, hash_one_to_tot_t
            );
        }

        let x: Option<i32> = Some(10);

        for _ in 0..5 {
            let hash_one_t = time_it(|| rs.clone().hash_one(x), n_iterations);
            let hash_one_to_tot_t = time_it(|| rs.clone().hash_one(x.to_total_ord()), n_iterations);
            println!(
                "(Some(10) as Option<i32>) hash_one(x) vs hash_one(x.to_total_ord()): {}ms vs {}ms",
                hash_one_t, hash_one_to_tot_t
            );
        }

        let x: Option<i32> = None;

        for _ in 0..5 {
            let hash_one_t = time_it(|| rs.clone().hash_one(x), n_iterations);
            let hash_one_to_tot_t = time_it(|| rs.clone().hash_one(x.to_total_ord()), n_iterations);
            println!(
                "(None as Option<i32>) hash_one(x) vs hash_one(x.to_total_ord()): {}ms vs {}ms",
                hash_one_t, hash_one_to_tot_t
            );
        }
    }
}

c
@github-actions github-actions bot added performance Performance issues or improvements python Related to Python Polars rust Related to Rust Polars labels Feb 23, 2024
@nameexhaustion nameexhaustion changed the title perf: elide the total order wrapper for types that don't need it perf: elide the total order wrapper for non-(float/option) types Feb 23, 2024
@nameexhaustion nameexhaustion changed the title perf: elide the total order wrapper for non-(float/option) types perf: Elide the total order wrapper for non-(float/option) types Feb 23, 2024
Copy link

codecov bot commented Feb 23, 2024

Codecov Report

Attention: Patch coverage is 86.74033% with 24 lines in your changes are missing coverage. Please review.

Project coverage is 80.77%. Comparing base (9560c34) to head (7f12860).
Report is 2 commits behind head on main.

Files Patch % Lines
crates/polars-utils/src/total_ord.rs 58.33% 15 Missing ⚠️
crates/polars-core/src/frame/group_by/hashing.rs 66.66% 4 Missing ⚠️
crates/polars-core/src/datatypes/any_value.rs 25.00% 3 Missing ⚠️
crates/polars-core/src/hashing/vector_hasher.rs 80.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #14648      +/-   ##
==========================================
- Coverage   80.78%   80.77%   -0.02%     
==========================================
  Files        1326     1326              
  Lines      173157   173227      +70     
  Branches     2453     2453              
==========================================
+ Hits       139891   139923      +32     
- Misses      32792    32830      +38     
  Partials      474      474              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -36,7 +37,7 @@ pub trait PolarsObjectSafe: Any + Debug + Send + Sync + Display {

/// Values need to implement this so that they can be stored into a Series and DataFrame
pub trait PolarsObject:
Any + Debug + Clone + Send + Sync + Default + Display + TotalHash + TotalEq
Any + Debug + Clone + Send + Sync + Default + Display + Hash + TotalHash + PartialEq + Eq + TotalEq
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wasn't straightforward to add ToTotalOrd trait for vec_hash on PolarsObject so juts hash directly for now, since I don't think it's needed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. Objects are 🤷

@@ -185,8 +188,7 @@ where
let group_probe_table = unsafe {
hash_tbls.get_unchecked(hash_to_partition(by_left_k.dirty_hash(), n_tables))
};
let Some(right_grp_idxs) = group_probe_table.get(&TotalOrdWrap(*by_left_k))
else {
let Some(right_grp_idxs) = group_probe_table.get(by_left_k) else {
Copy link
Collaborator Author

@nameexhaustion nameexhaustion Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the build_tables function now uses .to_total_ord() this is now just a plain BytesHash instead of a TotalOrdWrap<BytesHash>, so we don't need to call .to_total_ord() here (this function is asof_join_by_binary).

@ritchie46
Copy link
Member

Thank you @nameexhaustion.

@ritchie46 ritchie46 merged commit 635f727 into pola-rs:main Feb 23, 2024
23 checks passed
@nameexhaustion nameexhaustion deleted the tot branch February 23, 2024 23:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance issues or improvements python Related to Python Polars rust Related to Rust Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants