Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLite: stress test shows reading after a commit does not always work #2384

Closed
inetic opened this issue Mar 7, 2023 · 1 comment · Fixed by #2750
Closed

SQLite: stress test shows reading after a commit does not always work #2384

inetic opened this issue Mar 7, 2023 · 1 comment · Fixed by #2750
Labels

Comments

@inetic
Copy link

inetic commented Mar 7, 2023

Bug Description

I have created a stress test which shows that after inserting into a table, subsequent read from the table not always succeeds. I first though this was a SQLite bug so I created a bugreport there but further digging shows that it's likely in sqxl-sqlite, or more particularly in the flume. I tried with the most version (0.10.14) of the flume library but the problem persisted.

What seems to be happening is that after the commit, when we do a SELECT the operation goes like this:

  1. sqlx creates a flume channel here and sends Command::Execute to the connection thread.
  2. The connection thread executes the action here and creates an iter which will correctly iterate over two values here. One of the values is Either::Right and the other is Either::Left, that seems correct.
  3. The problem is that the flume::Receiver occasionally fails to receive the value.

I've been digging into flume (latest master) a bit but haven't yet managed to find a fix. If I have more time I'll try to write them an issue as well.

Minimal Reproduction

I created a small project on github/database-bug. In pseudo code, the rust part does this:

1. Create two databases A and B, both with one connection for reading
    and one for writing (although the one for reading in B isn't used).
2. Create table `a_table` in A and `b_table` in B.
3. Create 200 entries in `a_table` in a single transaction, then commit.
4. In parallel:
   a. Do unrelated write IO on `b_table`
   b. Read the last entry from the `a_table`

The problem is that 4.b. sometimes returns nothing even though the entry has been committed into the table in step #3. Without 4.a. I can't reproduce the issue, but I think it's just there just mess with thread scheduling.

To reproduce the issue I have to run many (200) instances of the above algorithm in parallel. I wrote this shell script to help me with it. I usually need to run the test for up to 10 minutes.

Info

  • SQLx version: 0.5.5, but also the latest main 1ff6a2a
  • SQLx features enabled: default-features = false, features = ["runtime-tokio-rustls", "sqlite"]
  • Database server and version: SQLite, the one used by the main branch (3.35.4), but I also tried to update to libsqlite3-sys to the latest which uses 3.41.0 currently latest.
  • Operating system: Ubuntu 22.04.1 LTS; 64bit; Intel® Core™ i7-4790K CPU @ 4.00GHz × 8; 16GB RAM; The database is stored on disk with Ext4 (version 1.0)
  • rustc --version: rustc 1.65.0 (897e37553 2022-11-02)
@inetic inetic added the bug label Mar 7, 2023
@inetic
Copy link
Author

inetic commented Mar 8, 2023

Created an issue on github/flume

EDIT: Now with a fix

inetic added a commit to equalitie/ouisync that referenced this issue Mar 8, 2023
madadam added a commit to madadam/sqlx that referenced this issue Sep 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant