Added randomness to lookup_htbl. #4839

Merged on Feb 26, 2024 (7 commits)

@waywardgeek waywardgeek commented Feb 25, 2024

This makes it harder to extract a user's location via cache-timing, but not impossible.

The new hash function distributes keys very evenly in tests. In contrast, the ahash crate produces poor hashes, resulting in more hash collisions. I've submitted Issue #210 upstream to ahash about this.

The average number of collisions in a 2B-entry Swiss table at a 50% load factor was 3.50 for ahash. For the custom hasher it was 2.54, which is identical to what we see when using true random numbers as hash values. The custom hash function is also about 20% faster than ahash on my laptop.
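For readers who want to try a comparison like this, here is a minimal sketch of a collision test. It is not the test from this PR: `mix64` is a stand-in mixer (the SplitMix64 finalizer), the `seed` XOR is just one hypothetical way to add randomness, and "collision" here means a key landing in an already occupied bucket, which may not be the metric behind the 3.50 and 2.54 figures above. The measured count is compared against what an ideal random hash would be expected to give.

```rust
// Stand-in mixer (SplitMix64 finalizer). NOT the hash function from this PR.
fn mix64(mut x: u64) -> u64 {
    x ^= x >> 30;
    x = x.wrapping_mul(0xbf58476d1ce4e5b9);
    x ^= x >> 27;
    x = x.wrapping_mul(0x94d049bb133111eb);
    x ^ (x >> 31)
}

/// Hashes the keys 0..num_keys into `num_buckets` buckets and counts how many
/// keys land in a bucket that is already occupied.
/// `seed` is a hypothetical per-table random value mixed into every key.
fn count_collisions(
    num_keys: u64,
    num_buckets: usize,
    seed: u64,
    hash: impl Fn(u64) -> u64,
) -> u64 {
    assert!(num_buckets.is_power_of_two());
    let mask = (num_buckets - 1) as u64;
    let mut occupied = vec![false; num_buckets];
    let mut collisions: u64 = 0;
    for key in 0..num_keys {
        let bucket = (hash(key ^ seed) & mask) as usize;
        if occupied[bucket] {
            collisions += 1;
        } else {
            occupied[bucket] = true;
        }
    }
    collisions
}

/// Expected value of the same count for an ideal, uniformly random hash:
/// n - m * (1 - (1 - 1/m)^n).
fn expected_collisions(num_keys: u64, num_buckets: usize) -> f64 {
    let n = num_keys as f64;
    let m = num_buckets as f64;
    n - m * (1.0 - (1.0 - 1.0 / m).powf(n))
}

fn main() {
    // 50% load factor: half as many keys as buckets.
    let num_buckets: usize = 1 << 20;
    let num_keys = (num_buckets / 2) as u64;
    let measured = count_collisions(num_keys, num_buckets, 0x1234_5678_9abc_def0, mix64);
    let expected = expected_collisions(num_keys, num_buckets);
    println!("measured: {measured}, expected for an ideal hash: {expected:.0}");
}
```

A hash that distributes keys well should land close to the expected value; a count well above it suggests clustering.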

This makes it harder to extract a user's location via cache-timing, but not impossible. The hash function is extended to 2 rounds and is turned into a hash rather than a pseudo-random permutation. This modified 2-round hash passes the dieharder tests when fed a counter, outputting the lower 32 bits of every hash. This is more than enough proof that the hash function output is suitable for a hash table.
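The counter-fed setup described in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the code used for the PR: `mix64` is a stand-in mixer (the SplitMix64 finalizer) rather than the 2-round hash, and `counter_stream` is a hypothetical helper name. It hashes the values 0, 1, 2, ... and emits the lower 32 bits of each hash as raw little-endian bytes, which can then be piped into a statistical test tool that reads raw bytes from stdin.

```rust
use std::io::{self, Write};

// Stand-in mixer (SplitMix64 finalizer). NOT the 2-round hash from this PR.
fn mix64(mut x: u64) -> u64 {
    x ^= x >> 30;
    x = x.wrapping_mul(0xbf58476d1ce4e5b9);
    x ^= x >> 27;
    x = x.wrapping_mul(0x94d049bb133111eb);
    x ^ (x >> 31)
}

/// Hashes the counter values 0..count and returns the lower 32 bits of each
/// hash, serialized as little-endian bytes (4 bytes per counter value).
fn counter_stream(hash: impl Fn(u64) -> u64, count: u64) -> Vec<u8> {
    let mut out = Vec::with_capacity(count as usize * 4);
    for counter in 0..count {
        let low32 = hash(counter) as u32; // the cast keeps only the lower 32 bits
        out.extend_from_slice(&low32.to_le_bytes());
    }
    out
}

fn main() -> io::Result<()> {
    // A real test-suite run needs far more data than this; the size is kept
    // small here so the sketch terminates quickly.
    let bytes = counter_stream(mix64, 1 << 16);
    io::stdout().lock().write_all(&bytes)
}
```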
waywardgeek and others added 6 commits February 25, 2024 07:38
This no longer passes dieharder tests, but all we really want is a nice random-ish distribution that does not have more collisions in the hash table than expected. We now measure this in a collision test, and one round is good enough. This is both faster and better distributed than the ahash::RandomState algorithm.
@waywardgeek waywardgeek merged commit 7c8d259 into project-oak:main Feb 26, 2024
16 checks passed
jblebrun pushed a commit to jblebrun/oak that referenced this pull request Feb 28, 2024
* Added randomness to lookup_htbl.

This makes it harder to extract a user's location via cache-timing, but not impossible. The hash function is extended to 2 rounds and is turned into a hash rather than a pseudo-random permutation. This modified 2-round hash passes the dieharder tests when fed a counter, outputting the lower 32 bits of every hash. This is more than enough proof that the hash function output is suitable for a hash table.

* Updated Cargo.lock file.

* Updated Cargo.lock file.

* Just reformatted 1 file.

* Reduced hashing back down to 1 round.

This no longer passes dieharder tests, but all we really want is a nice random-ish distribution that does not have more collisions in the hash table than expected. We now measure this in a collision test, and one round is good enough. This is both faster and better distributed than the ahash::RandomState algorithm.

* Blocked clippy from trying to make an iterator to traverse the hash table.
@jblebrun jblebrun mentioned this pull request Feb 28, 2024