Convert to `hashbrown::HashTable` #21

cuviper · 2024-01-07T01:46:56Z

HashTable is basically a safer version of hashbrown::RawTable. It leaves all of the hashing to be done externally, which means we don't need a NullHasher anymore. Code like LinkedHashMap::shrink_to_fit gets a lot simpler when it can just provide a hashing callback.

There's one public API change here, that LinkedHashMap::reserve and try_reserve are now constrained such that K can be (re)hashed with S, which is needed when hashbrown tries to move them to a new allocation. However, I think it was a bug that this was not required before, because it was previously possible to trigger NullHasher's assertion when resizing -- the new test_reserve demonstrates a case that failed. Also, LinkedHashSet already has these constraints on its reserve methods.
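To illustrate why a table that leaves hashing to the caller needs a hashing callback when it grows, here is a minimal std-only sketch. `ToyTable` and `demo` are hypothetical names invented for this example; this is a toy model of the idea, not hashbrown's `HashTable` and not this crate's code.

```rust
use std::collections::hash_map::RandomState;
use std::hash::BuildHasher;

/// Toy chained table that never hashes anything on its own.
/// Hypothetical type for illustration only; not hashbrown's implementation.
struct ToyTable<T> {
    buckets: Vec<Vec<T>>,
    len: usize,
}

impl<T> ToyTable<T> {
    fn with_buckets(n: usize) -> Self {
        let buckets = (0..n.max(1)).map(|_| Vec::new()).collect();
        ToyTable { buckets, len: 0 }
    }

    fn index(&self, hash: u64) -> usize {
        (hash % self.buckets.len() as u64) as usize
    }

    /// Insert with a hash the caller computed.
    fn insert(&mut self, hash: u64, value: T) {
        let i = self.index(hash);
        self.buckets[i].push(value);
        self.len += 1;
    }

    /// Look up by caller-supplied hash and equality check.
    fn find(&self, hash: u64, mut eq: impl FnMut(&T) -> bool) -> Option<&T> {
        self.buckets[self.index(hash)].iter().find(|v| eq(*v))
    }

    /// Growing moves every element to a new bucket, so the caller has to
    /// say how to recompute each element's hash.
    fn reserve(&mut self, additional: usize, hasher: impl Fn(&T) -> u64) {
        let wanted = (self.len + additional).next_power_of_two();
        if wanted <= self.buckets.len() {
            return;
        }
        let fresh = (0..wanted).map(|_| Vec::new()).collect();
        let old = std::mem::replace(&mut self.buckets, fresh);
        for value in old.into_iter().flatten() {
            let i = self.index(hasher(&value));
            self.buckets[i].push(value);
        }
    }
}

/// Fill a tiny table, grow it, and check that lookups still work.
fn demo() -> Option<u32> {
    let state = RandomState::new();
    let mut table = ToyTable::with_buckets(2);
    for (key, value) in [("a", 1u32), ("b", 2), ("c", 3)] {
        table.insert(state.hash_one(key), (key, value));
    }
    // Rehashing needs both the key and the hash state, which mirrors
    // the new bounds on `reserve` described above.
    table.reserve(64, |(key, _)| state.hash_one(key));
    table
        .find(state.hash_one("b"), |(key, _)| *key == "b")
        .map(|(_, value)| *value)
}

fn main() {
    assert_eq!(demo(), Some(2));
}
```

In this toy, a `reserve` without the callback could not place existing elements in the new buckets. That is the same reason the PR adds the hashing bounds to `LinkedHashMap::reserve` and `try_reserve`.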

cuviper · 2024-01-07T01:47:57Z

This would close the door on #13 though -- it's understandable if you don't want to be tied more closely to hashbrown.

kyren · 2024-01-08T19:35:15Z

> However, I think it was a bug that this was not required before, because it was previously possible to trigger NullHasher's assertion when resizing -- the new test_reserve demonstrates a case that failed.

Yeah, this is embarrassing and definitely wrong, thank you for catching it! I promise I'll merge either #21 or #22 ASAP and make a new release. This has been broken since very early in this crate's life and I think nobody has even tried to use it yet 😔.

It is true that I think it would be nice if this crate did not have to depend on hashbrown. That's not because hashbrown isn't extremely high quality (ofc it is), but just because it makes sense for this crate to have absolutely minimal dependencies. HOWEVER, this PR seems simpler (and thus safer) than #22, and since this crate, at least for the moment, has to depend on hashbrown anyway due to the lack of a raw entry API, I think this is probably the PR to go with. It was extremely courteous of you to give me two PRs to choose from 😅, I'm sorry you had to do both of them before I responded!

I don't think this completely closes the door on #13 ofc, I would still like that someday... it just doesn't seem like a very good idea to attempt to do that yet because the raw entry API is so useful when using this crate to build LRU caches, which is sort of the thing which I imagine this crate is mostly useful for. Moving off of hashbrown would depend on what the final API in the stdlib looks like... I guess if it was right around the corner or something I might feel differently and merge #22, but it might be that that would take major changes anyway to match what the final stdlib API will look like, so I don't actually feel that bad about this PR.

Does that reasoning make sense to you? If you feel differently please let me know, I'm willing to defer to your (probably better) judgement.

cuviper · 2024-01-08T20:12:53Z

> I'm sorry you had to do both of them before I responded!

No worries! That was not impatience, and I'm sorry if I made you feel pressured -- it just piqued my interest enough that it was fun to try both approaches. I also maintain indexmap, which is in a similar "weird HashMap" position, so I like to explore the possibilities.

> Does that reasoning make sense to you? If you feel differently please let me know, I'm willing to defer to your (probably better) judgement.

It does make sense, but ultimately I think my judgement is not as important as which one you feel more comfortable maintaining. FWIW, there's also a third option where reserve is fixed for rehashing without as much structural change, but I do think HashTable works pretty well as an expression of intent. It's up to you!

kyren · 2024-01-08T20:22:11Z

> It does make sense, but ultimately I think my judgement is not as important as which one you feel more comfortable maintaining. FWIW, there's also a third option where reserve is fixed for rehashing without as much structural change, but I do think HashTable works pretty well as an expression of intent. It's up to you!

I mean honestly I like the HashTable approach too, I'm glad the API exists in hashbrown, so it was a biased choice anyway 😛.

I think everything in the PR looks good, and some of the drive by changes are signs that I really should be running clippy more often. I'll merge this in just a bit and make a new release, thank you! ❤️

kyren · 2024-04-10T20:54:14Z

src/linked_hash_map.rs

-                .insert_with_hasher(hash, new_node, (), move |k| hasher((*k).as_ref().key_ref()))
-                .0;
+                .into_table()
+                .insert_unique(hash, new_node, move |k| hasher((*k).as_ref().key_ref()))


One thing I missed when reviewing this is the behavior of RawVacantEntryMut, which has subtly changed (but I think it is still acceptable behavior).

After this PR, if you produce a RawVacantEntryMut for a specific key, then insert a totally unrelated key using the RawVacantEntryMut you receive, you can end up with duplicate keys (which is not a safety issue, only a correctness one). This seems to match the same behavior in hashbrown, so this is fine, but I should have brought this up during review to ask to make sure I understood everything correctly.

cuviper · 2024-04-10

I don't see what subtle change you mean. Yes, you can do incorrect things here, but nothing new -- AFAICT it ends up doing the exact same insertion on the inner RawTable, before and after my change. Am I missing something?

Before: RawVacantEntryMut::insert_with_hasher -> RawTable::insert_entry -> RawTable::insert
After: HashTable::insert_unique -> RawTable::insert

kyren · 2024-04-10

Oh no, you're exactly right, it's exactly the same as it was before, I just didn't realize what the previous behavior was.

Of course it would be the same, because it used hashbrown's RawEntryMut API before too.

Sorry... it's been a long month lol.
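To make the correctness hazard from this thread concrete, here is a small std-only sketch of an insert that trusts the caller instead of checking for an existing equal key. `UncheckedTable`, `hash_of`, and `duplicate_count` are hypothetical names for illustration; the sketch only models the behavior described above and does not use hashbrown's `RawVacantEntryMut` or `HashTable::insert_unique`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash a key with the standard library's default hasher.
fn hash_of(key: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical stand-in for a table with an unchecked insert.
struct UncheckedTable {
    entries: Vec<(u64, String)>,
}

impl UncheckedTable {
    /// Checked lookup: compares the hash and then the key itself.
    fn contains(&self, key: &str) -> bool {
        let hash = hash_of(key);
        self.entries
            .iter()
            .any(|(h, k)| *h == hash && k.as_str() == key)
    }

    /// Trusts the caller that no equal key is present; performs no lookup.
    fn insert_unique(&mut self, key: &str) {
        self.entries.push((hash_of(key), key.to_string()));
    }

    /// Number of stored entries whose key equals `key`.
    fn count(&self, key: &str) -> usize {
        self.entries.iter().filter(|(_, k)| k.as_str() == key).count()
    }
}

/// Misuse the unchecked insert and report how many copies of "b" exist.
fn duplicate_count() -> usize {
    let mut table = UncheckedTable { entries: Vec::new() };
    table.insert_unique("b");
    // The lookup establishes only that "a" is absent ...
    assert!(!table.contains("a"));
    // ... but the unchecked insert is then used with the unrelated key "b".
    table.insert_unique("b");
    table.count("b")
}

fn main() {
    assert_eq!(duplicate_count(), 2);
}
```

The table ends up logically inconsistent but memory-safe, which matches the point above that this is a correctness issue rather than a safety issue.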

Convert to hashbrown::HashTable

df848f8

cuviper mentioned this pull request Jan 7, 2024

Move the hasher to the inner HashMap #22

Closed

kyren merged commit db2d230 into kyren:master Jan 9, 2024
1 check passed

kyren reviewed Apr 10, 2024


kyren mentioned this pull request Apr 10, 2024

feat: add CursorMut #25

Merged
