Hfe serialization listpack #13243
Conversation
Force-pushed from e568656 to 99d8f40.
@ronen-kalish, if it is not too much effort at this point in time, please consider it. We will be able to optimize the allocation later on.
```c
        decrRefCount(o);
        goto emptykey;
    }
    if ((db != NULL) && (minExpire != EB_EXPIRE_TIME_INVALID))
```
Following this change, it won't convert to HT any more when db is NULL. Instead, I think we should modify hashTypeConvert() to accept a pointer to redisDb instead of the ebuckets hexpires, and let it handle NULL. That will simplify the conditions in rdb.c as well.
Not sure I follow - what does this change have to do with conversion to dict? And why should we care about conversion to dict when db is NULL? In this case, the requirement (AFAIK) is to check RDB validity; the resulting object should be discarded (at least in this repo). What am I missing?
You aren't missing anything. I just saw a change in the logic, which I tend to avoid when not necessary in large changes. That, plus the fact that passing redisDb to hashTypeConvert() would relax the conditions in rdb.c, made me write down my thought.
serialize metadata hashes to RDB using a new type, saving for each field the key, one byte indicating what else is saved, and then (optionally) the value and the TTL (if available).
Force-pushed from 600b48c to 2935658.
```c
    if (!lpValidateIntegrityAndDups(encoded, encoded_len,
                                    deep_integrity_validation, tuple_len)) {
```
deep_integrity_validation being 1 means that we should not trigger any assertion after loading this datatype, so we should validate that the TTL field is a valid integer; otherwise, we will trigger the assertion in listpackExExpireDryRun():

```c
s = lpGetValue(fptr, NULL, &val);
serverAssert(!s);
```

Am I missing something?
Restore with corrupt data that has a string TTL inside it. When sanitize-dump-payload is enabled, the restore command shouldn't succeed:

```
config set sanitize-dump-payload yes
restore h 0 "\x17--\x00\x00\x00\t\x00\x81a\x02\x01\x01\xf4\xa6\x96\x18\xb8\x8f\x01\x00\x00\t\x82f1\x03\x82v1\x03\x83foo\x04\x82f2\x03\x82v2\x03\x00\x01\xff\x0c\x00\xde@\xe57Q\x1c\x12V" replace
hgetf h fields 2 f1 f2
```
After discussing with @ronen-kalish, let's deal with this together in restore/dump.
Force-pushed from 15a9acf to d9d4440.
Force-pushed from d9d4440 to 5eb92e0.
Full daily run: https://github.com/ronen-kalish/redis/actions/runs/9113052419

LGTM! If there are flaky tests in the daily, we may fix them later along the way (or you may comment them out for now, or merge later with another PR after resolving the issue, etc.). It is up to you.

@tezc, I'm not sure the flaky tests in the daily are related to this PR, unless we're sure they weren't flaky before. Is this indeed the case?

Hmm, I'm not sure. @sundb, I see there is one HFE defrag failure and a corrupt/dump fuzzer failure. Maybe you can comment on that? Could you take a quick look? #13243 (comment)

@tezc, @sundb, if anyone can approve this PR once done reviewing, it'll make @YaacovHazan very happy...
We can ignore the failure in the full CI; it's a known optimization.
Add RDB de/serialization for HFE

This PR adds two new RDB types, RDB_TYPE_HASH_METADATA and RDB_TYPE_HASH_LISTPACK_TTL, to save HFE data. When the hash RAM encoding is dict, it will be saved in the former; when it is listpack, it will be saved in the latter.
Both formats just add the TTL value for each field after the data that was previously saved, i.e., HASH_METADATA will save the number of entries and, for each entry, the key, value, and TTL, whereas listpack is saved as a blob.
On read, the usual dict <--> listpack conversion takes place if required.
In addition, when reading a hash that was saved as a dict, fields are actively expired if their expiry is due. Currently this also holds for the listpack encoding, but it is supposed to be removed.
Until Ozan's listpack implementation is merged into the integ branch, review should be done by commits.
TODO:
Remove active expiry on load when loading from listpack format (unless we decide to keep it)