Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hfe serialization listpack #13243

Merged

Conversation

ronen-kalish
Copy link
Collaborator

@ronen-kalish ronen-kalish commented May 2, 2024

Add RDB de/serialization for HFE

This PR adds two new RDB types: RDB_TYPE_HASH_METADATA and RDB_TYPE_HASH_LISTPACK_TTL to save HFE data.
When the hash RAM encoding is dict, it will be saved in the former, and when it is listpack it will be saved in the latter.
Both formats just add the TTL value for each field after the data that was previously saved, i.e HASH_METADATA will save the number of entries and, for each entry, key, value and TTL, whereas listpack is saved as a blob.
On read, the usual dict <--> listpack conversion takes place if required.
In addition, when reading a hash that was saved as a dict fields are actively expired if expiry is due. Currently this slao holds for listpack encoding, but it is supposed to be removed.

Until Ozan's listpack implementation will be merged to the integ branch, review should be done by commits.

TODO:
Remove active expiry on load when loading from listpack format (unless we'll decide to keep it)

@ronen-kalish ronen-kalish marked this pull request as draft May 2, 2024 16:14
src/listpack.h Outdated Show resolved Hide resolved
src/rdb.c Outdated Show resolved Hide resolved
src/rdb.c Outdated Show resolved Hide resolved
src/rdb.c Outdated Show resolved Hide resolved
src/rdb.c Outdated Show resolved Hide resolved
src/ebuckets.c Outdated Show resolved Hide resolved
src/listpack.c Outdated Show resolved Hide resolved
src/rdb.h Outdated Show resolved Hide resolved
src/rdb.c Outdated Show resolved Hide resolved
src/t_hash.c Outdated Show resolved Hide resolved
src/t_hash.c Outdated Show resolved Hide resolved
tests/integration/rdb.tcl Outdated Show resolved Hide resolved
src/cluster.c Outdated Show resolved Hide resolved
src/rdb.h Outdated Show resolved Hide resolved
src/rdb.c Outdated Show resolved Hide resolved
src/rdb.c Outdated Show resolved Hide resolved
src/rdb.c Outdated Show resolved Hide resolved
src/rdb.c Outdated Show resolved Hide resolved
src/rdb.c Outdated Show resolved Hide resolved
src/rdb.c Outdated Show resolved Hide resolved
src/rdb.h Outdated Show resolved Hide resolved
tests/integration/rdb.tcl Outdated Show resolved Hide resolved
@ronen-kalish ronen-kalish force-pushed the hfe-serialization-listpack-v2 branch 2 times, most recently from e568656 to 99d8f40 Compare May 15, 2024 11:18
src/cluster.c Outdated Show resolved Hide resolved
src/rdb.c Outdated Show resolved Hide resolved
src/t_hash.c Outdated Show resolved Hide resolved
src/server.h Outdated Show resolved Hide resolved
src/t_hash.c Outdated Show resolved Hide resolved
tests/integration/rdb.tcl Outdated Show resolved Hide resolved
src/rdb.c Show resolved Hide resolved
src/rdb.c Outdated Show resolved Hide resolved
src/redis-check-rdb.c Outdated Show resolved Hide resolved
@moticless
Copy link
Collaborator

@ronen-kalish , if it is not too much effort at this point in time, please consider to rdbSave() / rdbLoad() of RDB_TYPE_HASH_METADATA tuples of type [ttl][field][value], instead of [field][value][ttl].

We will be able to optimize later on the allocation of fields such that from first place will be allocated with TTL support.

decrRefCount(o);
goto emptykey;
}
if ((db != NULL) && (minExpire != EB_EXPIRE_TIME_INVALID))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following this change, it won't convert any more to HT in case of db is NULL.

Instead I think we should modify hashTypeConvert() to accept pointer of redisDb instead of ebuckets hexpires and let it handle with NULL. It will simplifes conditions at rdb.c as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow - what does this change have to do with conversion to dict? Any why should we care about conversion to dict when db is NULL? In this case, the requirement (AFAIK) is to check RDB validity, the resulting object should be discarded (at least in this repo). What am I missing?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't miss anything. Just saw a change in the logic which i tend to avoid if not necessary at large changes. And in addition to the fact that passing redisDb to hashTypeConvert() it will relax the condtions at rdb.c, made me to write my thought.

@ronen-kalish ronen-kalish marked this pull request as ready for review May 15, 2024 14:37
serialize metadata hashes to RDB using a new type, saving per each field
the key, one byte indicating what else is saved and than (optionally)
the value and the TTL (if available).
src/t_hash.c Outdated Show resolved Hide resolved
src/rdb.c Show resolved Hide resolved
tests/integration/rdb.tcl Outdated Show resolved Hide resolved
@ronen-kalish ronen-kalish force-pushed the hfe-serialization-listpack-v2 branch from 600b48c to 2935658 Compare May 15, 2024 15:52
Comment on lines +2684 to +2685
if (!lpValidateIntegrityAndDups(encoded, encoded_len,
deep_integrity_validation, tuple_len)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deep_integrity_validation is 1 means that we will not trigger any assertion after loading this datatype.
so we should validate the ttl field is a valid integer, otherwise, we should trigger the assertion in listpackExExpireDryRun().

        s = lpGetValue(fptr, NULL, &val);
        serverAssert(!s);

am i missing something?

Copy link
Collaborator

@sundb sundb May 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

restore with a corrupt data, and there is a string TTL inside it.
when enabling sanitize-dump-payload, the restore command shouldn't success.

config set sanitize-dump-payload yes
restore h 0 "\x17--\x00\x00\x00\t\x00\x81a\x02\x01\x01\xf4\xa6\x96\x18\xb8\x8f\x01\x00\x00\t\x82f1\x03\x82v1\x03\x83foo\x04\x82f2\x03\x82v2\x03\x00\x01\xff\x0c\x00\xde@\xe57Q\x1c\x12V" replace
hgetf h fields 2 f1 f2

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after discuss with @ronen-kalish , let's deal with this together in restore/dump.

@ronen-kalish ronen-kalish force-pushed the hfe-serialization-listpack-v2 branch 6 times, most recently from 15a9acf to d9d4440 Compare May 16, 2024 13:09
@ronen-kalish ronen-kalish force-pushed the hfe-serialization-listpack-v2 branch from d9d4440 to 5eb92e0 Compare May 16, 2024 13:17
@ronen-kalish
Copy link
Collaborator Author

src/rdb.c Outdated Show resolved Hide resolved
src/listpack.c Show resolved Hide resolved
src/rdb.c Show resolved Hide resolved
src/rdb.c Show resolved Hide resolved
@tezc
Copy link
Collaborator

tezc commented May 16, 2024

LGTM!

If there are flaky tests in the daily, we may fix it later on the way (or you may comment out for now, or merge later with another PR after resolving the issue etc). It is up to you.

@ronen-kalish
Copy link
Collaborator Author

@tezc, I'm not sure the flaky tests in the daily are related to this PR, unless we're sure they weren't flaky before. Is this indeed the case?

@tezc
Copy link
Collaborator

tezc commented May 16, 2024

hmm, I'm not sure.

@sundb I see there is one hfe defrag and a corrupt/dump fuzzer failure. Maybe you can comment on that? Could you take a quick look? #13243 (comment)

@ronen-kalish
Copy link
Collaborator Author

@tezc , @sundb , if anyone can approve this PR once done reviewing, It'll make @YaacovHazan very happy...

Copy link
Collaborator

@sundb sundb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can ignore the failure in the fully CI. it's a known optinization.

@sundb sundb changed the title Hfe serialization listpack v2 Hfe serialization listpack May 17, 2024
@sundb sundb merged commit 323be4d into redis:hash-field-expiry-integ May 17, 2024
13 checks passed
tezc added a commit that referenced this pull request May 18, 2024
FIELDS keyword was added as part of
[#13270](#13270). 

It was missing in
[#13243](#13243)
@ronen-kalish ronen-kalish deleted the hfe-serialization-listpack-v2 branch May 19, 2024 08:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants