Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scan command support non-prefixed string matching #2199

Open
1 of 2 tasks
iushas opened this issue Mar 26, 2024 · 10 comments
Open
1 of 2 tasks

Scan command support non-prefixed string matching #2199

iushas opened this issue Mar 26, 2024 · 10 comments
Assignees
Labels
enhancement type enhancement good first issue Good for newcomers help wanted Good for newcomers

Comments

@iushas
Copy link

iushas commented Mar 26, 2024

Search before asking

  • I had searched in the issues and found no similar issues.

Motivation

kvrocks currently only supports prefix matching for scan. Can it achieve the same matching method of scan 0 match *xxx* as redis?

Solution

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@iushas iushas added the enhancement type enhancement label Mar 26, 2024
@jihuayu
Copy link
Member

jihuayu commented Mar 27, 2024

@iushas Thanks for your feedback!

@jihuayu jihuayu added help wanted Good for newcomers good first issue Good for newcomers labels Mar 27, 2024
@PragmaTwice
Copy link
Member

PragmaTwice commented Mar 27, 2024

Without additional indexing, such query is quite inefficient: it will probably trigger a full namespace scanning.

I'm not sure if we should add them without any option to control the behavior.

@jihuayu
Copy link
Member

jihuayu commented Mar 27, 2024

Without additional indexing, such query is quite inefficient: it will probably trigger a full namespace scanning.

If there are too many scans, we can return early before filling up the list. The indication of the end of a SCAN is cursor 0, not the list count matching the desired number.

@PragmaTwice
Copy link
Member

Perhaps you misunderstood my point.

For instance, if a user inputs the pattern string *hehehehe, it may search through all keys in the database without finding any matches.

Iterating through all keys in the database can be time-consuming, leading to inconsistent execution times for this command.

@jihuayu
Copy link
Member

jihuayu commented Mar 27, 2024

I means we can do this:

redis 127.0.0.1:6379>  SCAN key 0 match *hehe
1) "17"
2) (empty list or set)

redis 127.0.0.1:6379>  SCAN key 17 match *hehe
1) "21"
2) (empty list or set)

redis 127.0.0.1:6379>  SCAN key 21 match *hehe
1) "0"
2)  1) "xxhehe"

We can set a max scan keys num if more than this num directly returns the empty list.

@baobaomaomeng
Copy link

I'm a beginner programmer and I'd like to try tackling this "good first issue", but I'm uncertain about how to get started with this project. Could you provide some advice on reading the source code, such as parts I might need to modify, as well as information I need to know about the Redis SCAN syntax (or any relevant resources)?

@jihuayu
Copy link
Member

jihuayu commented Apr 1, 2024

Hi @baobaomaomeng thank you!
You can read Redis DOC and get Redis SCAN syntax.
You can read the code from CommandScan.

We have 4 types of scan commands: SCAN, HSCAN, SSCAN and ZSCAN, you can fix SCAN first.

If you have any other problem, please ask me.

@baobaomaomeng
Copy link

Hi @baobaomaomeng thank you! You can read Redis DOC and get Redis SCAN syntax. You can read the code from CommandScan.

We have 4 types of scan commands: SCAN, HSCAN, SSCAN and ZSCAN, you can fix SCAN first.

If you have any other problem, please ask me.

I have some ideas on how to add this feature, but before getting started, I would like to seek your advice. I plan to first modify the SCAN operator, and I will modify the ParseMatchAndCountParam code in the common base class CommandScanBase. Then, I noticed that the redis::Database::scan function is called in execute, and the iter->key().starts_with used in scan is to determine whether it is a prefix match. I need to change this part, but I am not sure what would be the best way to do it. Because I need to know whether I want prefix matching, suffix matching, or (key) pattern matching. I am not sure whether I should add a parameter to this function to determine which match I need, but I feel that this is not elegant enough. Perhaps a better way is to turn it into a template function, pass the scan judgment function through the template, and add a pointer to the function in CommandScanBase, similar to templaterocksdb::Status Database::Scan(), and then determine in ParseMatchAndCountParam which scan function needs to be used.

@baobaomaomeng
Copy link

I have added substring matching and suffix matching for the scan operation, and have done some simple self-tests. Now I want to write some test cases for it, but I don't know how to add tests for it in Go. I want to know if there are any documents or resources that can help me. In addition, I hope you can assign this good first issue to me.

@jihuayu
Copy link
Member

jihuayu commented Apr 4, 2024

@baobaomaomeng Thank you!
You can add some test case in [this file].(https://github.com/apache/kvrocks/blob/unstable/tests/gocase/unit/scan/scan_test.go)
You can create a draft PR. We can talk in the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement type enhancement good first issue Good for newcomers help wanted Good for newcomers
Projects
None yet
Development

No branches or pull requests

4 participants