Refactor the management of rocksdb ColumnFamily #2279

Open
2 tasks done
mapleFU opened this issue Apr 28, 2024 · 7 comments
Labels
enhancement type enhancement

Comments

@mapleFU
Member

mapleFU commented Apr 28, 2024

Search before asking

  • I had searched in the issues and found no similar issues.

Motivation

Currently, kvrocks has multiple column families, and some of their configs are written ad hoc. That makes issues like the following tricky to address:

  1. Reduce the number of write buffers for infrequently used column families #2193
  2. Enable CompactionChecker for "search" CF #2263

I think we need a class that owns the kvrocks CF configs.

Solution

A unified CF manager
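
To make the idea concrete, here is a minimal sketch of what such a manager could look like. Everything in it is an assumption for illustration: `CFConfig`, `CFManager`, `MakeExampleManager`, the field names, and the placeholder values are hypothetical and are not taken from the kvrocks code base. The point is only that per-CF settings live in one owner, so changes like the two linked issues become a single `Update` call instead of scattered special cases.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical per-CF settings; field names are illustrative, not kvrocks' real ones.
struct CFConfig {
  std::string name;
  std::size_t write_buffer_size_mb;
  int max_write_buffer_number;
  bool enable_compaction_checker;
};

// Hypothetical single owner of every column family's settings.
class CFManager {
 public:
  // Registers a column family; rejects duplicates so each name has one entry.
  void Register(const CFConfig& config) {
    if (!configs_.emplace(config.name, config).second) {
      throw std::invalid_argument("duplicate column family: " + config.name);
    }
    names_.push_back(config.name);
  }

  // Returns the settings for `name`, or nullptr if the CF is unknown.
  const CFConfig* Find(const std::string& name) const {
    auto it = configs_.find(name);
    return it == configs_.end() ? nullptr : &it->second;
  }

  // Applies `fn` to one CF's settings; returns false if the CF is unknown.
  template <typename Fn>
  bool Update(const std::string& name, Fn fn) {
    auto it = configs_.find(name);
    if (it == configs_.end()) return false;
    fn(it->second);
    return true;
  }

  // Names in registration order.
  const std::vector<std::string>& Names() const { return names_; }

 private:
  std::map<std::string, CFConfig> configs_;
  std::vector<std::string> names_;
};

// Builds a manager with placeholder values (64 MB, 4 buffers are made up).
inline CFManager MakeExampleManager() {
  CFManager manager;
  manager.Register({"metadata", 64, 4, false});
  manager.Register({"default", 64, 4, false});
  manager.Register({"search", 64, 4, false});
  return manager;
}
```

With something like this, "fewer write buffers for a rarely used CF" or "enable the compaction checker for search" would each be one `Update` on the relevant entry.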

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@mapleFU mapleFU added the enhancement type enhancement label Apr 28, 2024
@jjz921024
Contributor

jjz921024 commented Apr 29, 2024

To set write_buffer_size per column family, one simple way I can think of is to extend the config set rocksdb.write_buffer_size num command.

For example, config set rocksdb.write_buffer_size [cf_name] num would set the buffer size of the specified CF.

More examples:

config set rocksdb.write_buffer_size ALL num
# equivalent to `config set rocksdb.write_buffer_size num`
# sets the buffer size for all column families

config set rocksdb.write_buffer_size MAJOR num
# for the major column families: metadata and default

config set rocksdb.write_buffer_size MINOR num
# for the minor column families: everything except metadata and default

config set rocksdb.write_buffer_size [metadata | default | zset | pubsub | ... ] num
# for a specific column family
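
A rough sketch of how the selector argument could be resolved, assuming the semantics described above (ALL, MAJOR, MINOR, or a single CF name). `IsMajorCF` and `ResolveSelector` are hypothetical helper names, not existing kvrocks functions, and the list of known CFs is passed in by the caller rather than hard-coded.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Per the proposal, "major" means the metadata and default column families.
inline bool IsMajorCF(const std::string& cf) {
  return cf == "metadata" || cf == "default";
}

// Hypothetical helper: expands a selector into the CF names it covers.
// Returns false (and leaves `out` empty) when the selector matches nothing known.
inline bool ResolveSelector(const std::string& selector,
                            const std::vector<std::string>& all_cfs,
                            std::vector<std::string>* out) {
  out->clear();
  if (selector == "ALL") {
    *out = all_cfs;
    return true;
  }
  if (selector == "MAJOR" || selector == "MINOR") {
    const bool want_major = (selector == "MAJOR");
    for (const auto& cf : all_cfs) {
      if (IsMajorCF(cf) == want_major) out->push_back(cf);
    }
    return true;
  }
  // Otherwise the selector must be the exact name of one column family.
  if (std::find(all_cfs.begin(), all_cfs.end(), selector) != all_cfs.end()) {
    out->push_back(selector);
    return true;
  }
  return false;
}
```

The existing two-argument form (config set rocksdb.write_buffer_size num) could then be treated as the ALL selector, which would keep its current behavior.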

@mapleFU
Member Author

mapleFU commented Apr 29, 2024

I'm trying to do a code-level refactor for this, but I've been feeling sick lately. I'll probably post an update in the next few days.

@jjz921024 this style of config ( #2279 (comment) ) could be split out as separate configs, which would be nice for users.

@PragmaTwice
Member

To set write_buffer_size per column family, one simple way I can think of is to extend the config set rocksdb.write_buffer_size num command.

For example, config set rocksdb.write_buffer_size [cf_name] num would set the buffer size of the specified CF.

More examples:

config set rocksdb.write_buffer_size ALL num
# equivalent to `config set rocksdb.write_buffer_size num`
# sets the buffer size for all column families

config set rocksdb.write_buffer_size MAJOR num
# for the major column families: metadata and default

config set rocksdb.write_buffer_size MINOR num
# for the minor column families: everything except metadata and default

config set rocksdb.write_buffer_size [metadata | default | zset | pubsub | ... ] num
# for a specific column family

That's a good idea. However, there are two things to consider:

  • How to stay compatible with old config files.
  • Previously I tried to make all config keys unique, since that makes the config part more intuitive and easier to process at the code level, especially for GET and REWRITE (I spent a lot of time rewriting the config code). This design would break that.

@PragmaTwice
Member

PragmaTwice commented May 6, 2024

About the naming of the column families:

AS IS:

0 default
1 metadata
2 zset score
3 pubsub
4 propagate
5 stream
6 search

TO BE:

0 primary subkeys
1 key metadata
2 secondary subkeys
3 pubsub
4 propagate
5 TBD (I don't know why stream should have a separate CF)
6 search
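
As a sketch of how the two naming schemes could sit side by side in code, assuming the numeric IDs stay fixed as in the lists above and only the labels change. `CFId`, `CFNaming`, `CurrentName`, and `ProposedName` are hypothetical names for illustration, not the identifiers kvrocks actually uses.

```cpp
#include <cstddef>
#include <string>

// IDs follow the numbering in the lists above.
enum class CFId : std::size_t {
  kDefault = 0,
  kMetadata = 1,
  kZSetScore = 2,
  kPubSub = 3,
  kPropagate = 4,
  kStream = 5,
  kSearch = 6,
};

constexpr std::size_t kCFCount = 7;

// Pairs the current ("AS IS") label with the proposed ("TO BE") one.
struct CFNaming {
  const char* current;
  const char* proposed;
};

static const CFNaming kCFNaming[] = {
    {"default", "primary subkeys"},
    {"metadata", "key metadata"},
    {"zset score", "secondary subkeys"},
    {"pubsub", "pubsub"},
    {"propagate", "propagate"},
    {"stream", "TBD"},  // left undecided in the proposal
    {"search", "search"},
};

static_assert(sizeof(kCFNaming) / sizeof(kCFNaming[0]) == kCFCount,
              "every CFId needs exactly one naming entry");

// Looks up the current label for an ID.
inline std::string CurrentName(CFId id) {
  return kCFNaming[static_cast<std::size_t>(id)].current;
}

// Looks up the proposed label for an ID.
inline std::string ProposedName(CFId id) {
  return kCFNaming[static_cast<std::size_t>(id)].proposed;
}
```

Keeping both labels in one table would make it easy to check that each ID maps to exactly one name in each scheme.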

@PragmaTwice
Member

Also, we should have a better abstraction for managing column families.

Currently it's too hard to maintain, and the ad hoc logic for column families is scattered throughout the code.

@PragmaTwice
Member

Since this seems very important, I'll add it to our roadmap. cc @mapleFU @git-hulk

@git-hulk
Member

git-hulk commented May 6, 2024

Also, we should have a better abstraction for managing column families.

Currently it's too hard to maintain, and the ad hoc logic for column families is scattered throughout the code.

Agreed. Adding a new column family will also break forward compatibility, so we need to address that when releasing.


4 participants