Match MySQL's LAST_INSERT_ID behaviour when no rows are inserted / updated #15699

Draft
wants to merge 5 commits into
base: main

Contributor

@arthurschreiber arthurschreiber commented Apr 11, 2024

Description

This pull request fixes the behaviour of LAST_INSERT_ID when inserting into a sharded table that uses a sequence to fill an auto increment column.

If an INSERT ... ON DUPLICATE KEY UPDATE ... statement neither inserts nor updates any rows in the target table, Vitess should set LAST_INSERT_ID to 0.

Note: I think this will only work on connections that don't specify the CLIENT_FOUND_ROWS flag. 🤔 I will add some additional test cases to verify this.
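
To make the CLIENT_FOUND_ROWS concern concrete, here is a minimal sketch of the per-row affected-rows values that MySQL documents for INSERT ... ON DUPLICATE KEY UPDATE. The type and function names (`upsertOutcome`, `affectedRows`) are illustrative assumptions and are not part of Vitess or this PR.

```go
package main

import "fmt"

// upsertOutcome is a hypothetical type describing what
// INSERT ... ON DUPLICATE KEY UPDATE did to a single row.
type upsertOutcome int

const (
	rowInserted  upsertOutcome = iota // no duplicate key: a new row was inserted
	rowUpdated                        // duplicate key: the existing row was changed
	rowUnchanged                      // duplicate key: the row was set to its current values
)

// affectedRows returns the per-row affected-rows value for the given outcome.
// clientFoundRows models the CLIENT_FOUND_ROWS connection flag.
func affectedRows(o upsertOutcome, clientFoundRows bool) uint64 {
	switch o {
	case rowInserted:
		return 1
	case rowUpdated:
		return 2
	default: // rowUnchanged
		if clientFoundRows {
			// With the flag set, a matched but unchanged row still counts as 1.
			return 1
		}
		return 0
	}
}

func main() {
	for _, flag := range []bool{false, true} {
		fmt.Println(flag,
			affectedRows(rowInserted, flag),
			affectedRows(rowUpdated, flag),
			affectedRows(rowUnchanged, flag))
	}
}
```

If this model holds, a check based only on "rows affected is 0" cannot detect the no-op case on a connection that sets CLIENT_FOUND_ROWS, because the unchanged row is reported as 1.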

Related Issue(s)

Fixes #15696

Checklist

  • "Backport to:" labels have been added if this change should be back-ported to release branches
  • If this change is to be back-ported to previous releases, a justification is included in the PR description
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on CI?
  • Documentation was added or is not required

Deployment Notes

… behaviour when performing `INSERT ... ON DUPLICATE KEY UPDATE ...` statements.

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
…ged.

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Contributor

vitess-bot bot commented Apr 11, 2024

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • Ensure there is a link to an issue (except for internal cleanup and flaky test fixes), new features should have an RFC that documents use cases and test cases.

Tests

  • Bug fixes should have at least one unit or end-to-end test, enhancement and new features should have a sufficient number of tests.

Documentation

  • Apply the release notes (needs details) label if users need to know about this change.
  • New features should be documented.
  • There should be some code comments as to why things are implemented the way they are.
  • There should be a comment at the top of each new or modified test to explain what the test does.

New flags

  • Is this flag really necessary?
  • Flag names must be clear and intuitive, use dashes (-), and have a clear help text.

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow needs to be marked as required, the maintainer team must be notified.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from vitess-operator and arewefastyet, if used there.
  • vtctl command output order should be stable and awk-able.

@vitess-bot vitess-bot bot added the labels NeedsBackportReason (If backport labels have been applied to a PR, a justification is required), NeedsDescriptionUpdate (The description is not clear or comprehensive enough, and needs work), NeedsIssue (A linked issue is missing for this Pull Request), and NeedsWebsiteDocsUpdate (What it says) on Apr 11, 2024
@arthurschreiber arthurschreiber self-assigned this Apr 11, 2024
@github-actions github-actions bot added this to the v20.0.0 milestone Apr 11, 2024
Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>

codecov bot commented Apr 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.42%. Comparing base (0912690) to head (7bee257).
Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #15699      +/-   ##
==========================================
+ Coverage   68.40%   68.42%   +0.02%     
==========================================
  Files        1556     1556              
  Lines      195121   195124       +3     
==========================================
+ Hits       133468   133511      +43     
+ Misses      61653    61613      -40     


Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
@arthurschreiber arthurschreiber added the labels Backport to: release-17.0 (Needs to be back ported to release-17.0), Backport to: release-18.0 (Needs to be back ported to release-18.0), and Backport to: release-19.0 (Needs to be back ported to release-19.0) on Apr 11, 2024
@arthurschreiber arthurschreiber changed the title from "Match MySQL's LAST_INSERT_ID behaviour" to "Match MySQL's LAST_INSERT_ID behaviour when no rows are inserted / updated" on Apr 11, 2024
@deepthi deepthi removed the labels NeedsDescriptionUpdate (The description is not clear or comprehensive enough, and needs work) and NeedsIssue (A linked issue is missing for this Pull Request) on Apr 25, 2024
@harshit-gangal
Member

I tried this on MySQL, and it looks like it retained the old value rather than resetting it to 0.

mysql> create table at(id bigint auto_increment, col bigint, unique key(col), primary key(id));
Query OK, 0 rows affected (0.03 sec)

mysql> insert into at(col) values (1), (2), (3), (4);
Query OK, 4 rows affected (0.01 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> select @@last_insert_id;
+------------------+
| @@last_insert_id |
+------------------+
|                1 |
+------------------+
1 row in set (0.00 sec)

mysql> insert into at(col) values(3) on duplicate key update col = 10;
Query OK, 2 rows affected (0.00 sec)

mysql> select @@last_insert_id;
+------------------+
| @@last_insert_id |
+------------------+
|                1 |
+------------------+
1 row in set (0.00 sec)

mysql> select * from at;
+----+------+
| id | col  |
+----+------+
|  1 |    1 |
|  2 |    2 |
|  4 |    4 |
|  3 |   10 |
+----+------+
4 rows in set (0.01 sec)

Comment on lines +401 to +435
func TestInsertShardedWithOnDuplicateKeyNoInserts(t *testing.T) {
	mcmp, closer := start(t)
	defer closer()

	mcmp.Exec("insert into last_insert_id_test (id, sharding_key, user_id, reason) values (1, '1:1:1', 1, 'foo'), (2, '2:2:2', 2, 'bar')")

	// Bump the sequence value so the sequence accounts for the 2 explicit inserts above.
	utils.Exec(t, mcmp.VtConn, "SELECT NEXT 2 VALUES FROM uks.last_insert_id_test_seq")

	// First test case, insert a row that already exists, and don't actually change any column at all.
	query := "insert into last_insert_id_test (sharding_key, user_id, reason) values ('1:1:1', 1, 'foo') on duplicate key update reason = reason"

	mysqlResult := utils.Exec(t, mcmp.MySQLConn, query)
	// no new row inserted, so insert id should be 0.
	assert.Equal(t, uint64(0), mysqlResult.InsertID)
	// no row was modified, so rows affected should be 0.
	assert.Equal(t, uint64(0), mysqlResult.RowsAffected)

	vitessResult := utils.Exec(t, mcmp.VtConn, query)
	assert.Equal(t, mysqlResult.RowsAffected, vitessResult.RowsAffected)
	assert.Equal(t, mysqlResult.InsertID, vitessResult.InsertID)

	// Second test case, insert a row that already exists, and change a column on the existing row.
	query = "insert into last_insert_id_test (sharding_key, user_id, reason) values ('1:1:1', 1, 'bar') on duplicate key update reason = VALUES(reason)"

	mysqlResult = utils.Exec(t, mcmp.MySQLConn, query)
	// a row was modified, so insert id should match the auto increment column value of the modified row
	assert.Equal(t, uint64(1), mysqlResult.InsertID)
	// one row was modified, so rows affected should be 2.
	assert.Equal(t, uint64(2), mysqlResult.RowsAffected)

	vitessResult = utils.Exec(t, mcmp.VtConn, query)
	assert.Equal(t, mysqlResult.RowsAffected, vitessResult.RowsAffected)
	// Vitess can't return the `auto_increment` value of the last updated row, but it returns a value larger than 0.
	assert.Greater(t, vitessResult.InsertID, uint64(0))
Member

We do not need to test it this way; mcmp.Exec can validate all of this.
With the latest changes to this test package, we can add a new compare option.
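
As a rough illustration of what such a compare option could look like, here is a self-contained sketch. The actual option in the Vitess test package is not shown in this thread, so `result`, `compareOptions`, `InsertIDOnlyZeroness`, and `compare` are all hypothetical names, not the real API.

```go
package main

import "fmt"

// result is a hypothetical stand-in for a query result; it is not the Vitess type.
type result struct {
	RowsAffected uint64
	InsertID     uint64
}

// compareOptions is a hypothetical set of switches for the comparison.
type compareOptions struct {
	// InsertIDOnlyZeroness relaxes the insert ID check: both values must be
	// zero, or both must be non-zero, but they need not be equal.
	InsertIDOnlyZeroness bool
}

// compare reports the first difference between the MySQL and Vitess results.
func compare(mysql, vitess result, opts compareOptions) error {
	if mysql.RowsAffected != vitess.RowsAffected {
		return fmt.Errorf("rows affected differ: mysql=%d vitess=%d",
			mysql.RowsAffected, vitess.RowsAffected)
	}
	if opts.InsertIDOnlyZeroness {
		if (mysql.InsertID == 0) != (vitess.InsertID == 0) {
			return fmt.Errorf("insert id zeroness differs: mysql=%d vitess=%d",
				mysql.InsertID, vitess.InsertID)
		}
		return nil
	}
	if mysql.InsertID != vitess.InsertID {
		return fmt.Errorf("insert id differs: mysql=%d vitess=%d",
			mysql.InsertID, vitess.InsertID)
	}
	return nil
}

func main() {
	mysql := result{RowsAffected: 2, InsertID: 1}
	vitess := result{RowsAffected: 2, InsertID: 7}
	fmt.Println(compare(mysql, vitess, compareOptions{}))
	fmt.Println(compare(mysql, vitess, compareOptions{InsertIDOnlyZeroness: true}))
}
```

The relaxed mode corresponds to the second test case above, where Vitess is only expected to return some insert ID larger than 0.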

Comment on lines +177 to +190
// If this insert used auto increment values from a sequence, we need to set the `last_insert_id` value.
if insertID != 0 {
	result.InsertID = insertID
	if result.RowsAffected > 0 {
		// If at least one row was affected, we set the `last_insert_id` value to the lowest reserved sequence id.
		//
		// This does not match the behaviour of MySQL in the case where no new rows were inserted (but one or more rows were updated
		// via `ON DUPLICATE KEY UPDATE`), where the `last_insert_id` value is set to the `auto_increment` column value.
		result.InsertID = insertID
	} else {
		// If no rows were inserted or updated, clear the `last_insert_id` value.
		result.InsertID = 0
	}
}
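
For readers who want to exercise this branch in isolation, the rule proposed in the hunk above can be sketched as a pure function. `resolveInsertID` and its parameters are hypothetical names introduced here; the sketch only mirrors the proposed rule and makes no claim about how MySQL behaves.

```go
package main

import "fmt"

// resolveInsertID mirrors the branch under review (hypothetical helper).
// current is the insert ID already on the result, sequenceID is the lowest
// sequence value reserved for this insert (0 if none was reserved), and
// rowsAffected is the affected-rows count of the statement.
func resolveInsertID(current, sequenceID, rowsAffected uint64) uint64 {
	if sequenceID == 0 {
		// No sequence values were used: leave the result untouched.
		return current
	}
	if rowsAffected > 0 {
		// At least one row was inserted or updated.
		return sequenceID
	}
	// Nothing was inserted or updated: the proposed change clears the value.
	return 0
}

func main() {
	fmt.Println(resolveInsertID(0, 5, 2)) // rows affected: sequence id is used
	fmt.Println(resolveInsertID(0, 5, 0)) // no rows affected: cleared
	fmt.Println(resolveInsertID(9, 0, 0)) // no sequence involved: unchanged
}
```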

Member

This is not true. I tried it on MySQL:

mysql> insert into at(col) values (20);
Query OK, 1 row affected (0.00 sec)

mysql> select * from at where col = 20;
+----+------+
| id | col  |
+----+------+
| 10 |   20 |
+----+------+
1 row in set (0.00 sec)

mysql> insert into at(col) values (20);
ERROR 1062 (23000): Duplicate entry '20' for key 'at.col'

mysql> insert into at(col) values (20) on duplicate key update col = 40;
Query OK, 2 rows affected (0.00 sec)

mysql> select * from at where col = 40;
+----+------+
| id | col  |
+----+------+
| 10 |   40 |
+----+------+
1 row in set (0.00 sec)

mysql> select @@last_insert_id;
+------------------+
| @@last_insert_id |
+------------------+
|               10 |
+------------------+
1 row in set (0.00 sec)

If you follow this sample test, last_insert_id remained at the old value of 10, which was the insert ID generated when the row with col = 20 was inserted into the table.

Member

We should never reset the Insert ID value. I am not sure in which case that would be done.

Member

Another case, where rows affected is 0:

mysql> insert into at(col) select 10 from dual where 1!=1;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> select @@last_insert_id;
+------------------+
| @@last_insert_id |
+------------------+
|               10 |
+------------------+
1 row in set (0.00 sec)
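
The transcript above can be read as a distinction between the insert ID carried by a single statement's result and the value the session remembers. A minimal sketch of that reading follows, assuming a statement that generates no new ID reports 0 and the session keeps its previous value in that case. `session` and `observe` are hypothetical names, not Vitess code.

```go
package main

import "fmt"

// session is a hypothetical model of per-connection state.
type session struct {
	lastInsertID uint64
}

// observe records the insert ID reported by one statement. A value of 0
// means the statement generated no new ID, so the old value is retained.
func (s *session) observe(statementInsertID uint64) {
	if statementInsertID != 0 {
		s.lastInsertID = statementInsertID
	}
}

func main() {
	s := &session{}
	s.observe(10) // an insert that generated auto increment id 10
	s.observe(0)  // an insert ... select that matched no rows
	fmt.Println(s.lastInsertID) // prints 10
}
```

Under this model, the session value is never reset to 0 by a statement that inserts nothing, which is the behaviour the review comments describe.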

Labels
Backport to: release-17.0 (Needs to be back ported to release-17.0), Backport to: release-18.0 (Needs to be back ported to release-18.0), Backport to: release-19.0 (Needs to be back ported to release-19.0), NeedsBackportReason (If backport labels have been applied to a PR, a justification is required), NeedsWebsiteDocsUpdate (What it says)
Development

Successfully merging this pull request may close these issues.

Bug Report: LAST_INSERT_ID behaviour differs between MySQL and Vitess
3 participants