You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This concept came about from a discussion between myself and @iamjpotts.
Motivation
Handling of constraint errors is a pretty typical task in any database application.
As a simple example, a platform with user registration will likely want to ensure that usernames are unique so they can unambiguously refer to one user. This is pretty easy to enforce in SQL with a UNIQUE constraint, which usually has the benefit of automatically creating an index over the column as well:
CREATETABLEusers (
user_id UUID PRIMARY KEY,
-- Of course, you'd likely want this to be case-insensitive, or just always fold to lowercase.
username TEXT UNIQUE NOT NULL,
password_hash TEXTNOT NULL,
);
A naive implementation may perform a SELECT query before attempting to insert a user record, to check if the username is already taken:
let username_taken:bool = sqlx::query_scalar("SELECT EXISTS(SELECT 1 FROM users WHERE username ILIKE ?)").bind(&username).fetch_one(&pool).await?;if username_taken {returnErr("username taken");}let user_id:Uuid = sqlx::query_scalar("INSERT INTO users (username, password_hash) VALUES (?, ?) RETURNING user_id").bind(&username).bind(&password_hash).fetch_one(&pool).await?;
However, this takes two round-trips to the database, increasing response latency. The astute reader may have also noticed that the above code has a time-of-check to time-of-use (TOCTOU) bug because it does not perform the two queries in a transaction; another call to this function may race this one and create a user record with the same username, and so the insert may fail with an error which is likely not going to be surfaced to the user in a useful manner (e.g. a 500 Internal Server Error instead of a 400 Bad Request) since it wasn't anticipated by the developer.
You could fix the bug by performing the check and the insert in a transaction, but you can also eliminate a round-trip by just attempting the insert and using the unique constraint (which would have fired a violation error anyway in the above situation) to detect the collision. SQLx supports and encourages this use-case; however, this is not currently very ergonomic:
let user_id:Uuid = sqlx::query_scalar("INSERT INTO users (username, password_hash) VALUES (?, ?) RETURNING user_id").bind(&username).bind(&password_hash).fetch_one(&pool).await.map_err(|e| match e {
sqlx::Error::DatabaseError(dbe)if dbe.constraint() == Some("users_username_key") => {MyError::from("username taken")},
other => other.into(),})?;
However, this still leads to matching on magic strings most of the time, and can lead to forgetting to cover newly added constraints.
Design
I propose to add what I'm calling a "checked derive", which takes an enum as input and checks at compile-time that it exhaustively represents the constraints for a given table:
#[derive(sqlx::ConstraintEnum)]#[sqlx(table_name = "users")]pubenumUsersConstraints{/// The `UNIQUE` constraint of the `username` column was violated.#[sqlx(constraint = "users_username_key")]UsernameUnique,}
Like any derive the primary function would be to generate trait impls. This would generate a FromStr impl that converts the constraint name string to a given variant, as well as an impl of the ConstraintEnum trait for type safety:
pubtraitConstraintEnum:FromStr{/// Get the name of the constraint for the given enum variant.fnname(&self) -> &'static str;}
This impl, would then interact with an extension trait and intermediate error type:
pubtraitResultExt<T>{/// If this error represents one of the constraints covered by the given `ConstraintEnum`, map it to another error.fnon_constraint<C:ConstraintEnum,E>(self,on_constraint:FnOnce(C) -> E) -> Result<T,MaybeConstraintError<E>>;}// Implements at least: `std::error::Error`, `Debug`pubenumMaybeConstraintError<E>{ConstraintError(E),Other(sqlx::Error),}// I'm *hoping* coherence allows at least some form of this, but I'm not currently certain.impl<E1,E2>From<MaybeConstraintError<E1>>forE2whereE1:Into<E2>, sqlx::Error:Into<E2>{// ...}impl<T>ResultExtforResult<T, sqlx::Error>{fnon_constraint<C:ConstraintEnum,E>(self,on_constraint:FnOnce(C) -> E) -> Result<T,MaybeConstraintError<E>>{// ..}}
This would then allow the above INSERT to be modeled like so:
let user_id:Uuid = sqlx::query_scalar("INSERT INTO users (username, password_hash) VALUES (?, ?) RETURNING user_id").bind(&username).bind(&password_hash).fetch_one(&pool).await// The above `From` impl should allow the `?` here to work, assuming `MyError` implements `From<sqlx::Error>`.on_constraint(|c:UserConstraints| match c {UserConstraints::UsernameUnique => MyError::new(StatusCode::BAD_REQUEST,"username taken"),})?;
Compile-Time Checking
Similar to the query macros, if DATABASE_URL is set and offline-mode is not enabled, the derive can check the constraints on the given table at compile-time (e.g. using INFORMATION_SCHEMA.TABLE_CONSTRAINTS or the equivalent in pg_catalog) and ensure that the enum exhaustively lists the constraints.
This can help ensure, for example, that a new constraint being added is covered by any code it might affect:
-- This is effectively equivalent to changing `username`'s type to `varchar(63)`,-- however that raises a different error code than a constraint violation.ALTERTABLE users ADD CONSTRAINT user_username_length CHECK (LENGTH(username) <64));
-- Enforcing a minimum length is probably better done in your application code where you can change it more easily-- without having to modify existing user records. However, checking that it's not empty is a good sanity check.ALTERTABLE users ADD CONSTRAINT user_username_nonempty CHECK (LENGTH(username) >0);
If the above were executed as a new migration, for example, the ConstraintEnum derive would then emit a compiler error calling out the uncovered constraints and force the developer to add coverage for them.
For convenience, I would maybe add a subcommand to sqlx-cli to automatically generate a ConstraintEnum definition for a given table, which can use the same logic to look up the constraints so the user doesn't have to manually. This would especially be useful if adapting SQLx to an existing database.
I would make this functionality optional so that users who don't wish to use compile-time checks can still benefit from the API improvement. Perhaps as a checked-derive feature that extends the derive feature added in #3113 (cc @saiintbrisson).
Without DATABASE_URL set and with the checked-derive feature disabled, the ConstraintEnum would work as a simple dumb derive. This would also make it possible to still use the query macros, but only check constraint enums, e.g., in CI or a pre-commit hook, by not enabling the feature in normal development (though that's likely to be flaky given Cargo's feature unification; a dependency could enable the feature non-optionally which would force it on everywhere). It should also obey SQLX_OFFLINE.
Additionally, I would make it possible to explicitly mark constraints as ignored, either as a temporary workaround or to ignore constraints that are not expected to fire in normal operation:
#[derive(sqlx::ConstraintEnum)]#[sqlx(table_name = "users")]// For example, these should realistically be checked before even touching the database.#[sqlx(ignore("user_username_length", "user_username_nonempty")]pubenumUsersConstraints{/// The `UNIQUE` constraint of the `username` column was violated.#[sqlx(constraint = "users_username_key")]UsernameUnique,}
Or to mark a constraint enum as a non-exhaustive subset (only check that the declared constraints actually exist):
#[derive(sqlx::ConstraintEnum)]#[sqlx(table_name = "users")]#[sqlx(non_exhaustive)]pubenumUsersConstraints{/// The `UNIQUE` constraint of the `username` column was violated.#[sqlx(constraint = "users_username_key")]UsernameUnique,}
Bikeshedding: this should be semantically distinct from #[non_exhaustive] and should probably not use the same identifier to avoid confusion. But what should it be instead? check_declared_only?
Alternatives
Automatically generated enum
Instead, SQLx could automatically generate the full enum:
It would require either a compile-time connection to the database, or use the same offline data mechanism as the query macros.
It would be impossible to use as a simple derive without bringing in at least some query macro scaffolding (re. feat: new derive feature flag #3113).
The user cannot reason about the output of the macro without knowing what constraints exist on the given table.
The query macros generate predictable output from their input (assuming output columns are explicitly declared)
and properly formatted query macro invocations should be perfectly readable and easily reasoned about on their own
without an IDE expanding the macro.
Controlling the output of the macro would require a dedicated DSL for:
Ignoring constraints you don't care about;
Renaming variants for clarity;
notice that the UNIQUE constraint on users.username was automatically named users_username_key, which doesn't tell you a lot at a glance
Adding documentation comments to explain what each constraint violation means;
Adding other derives, such as thiserror::Error to turn the ConstraintEnuminto an error type in its own right.
Supporting this use-case in this format would be more complex than just writing out the enum definition!
Go-to-definition would likely not be very useful, and IDE support in general would be very spotty and results would be slow.
The text was updated successfully, but these errors were encountered:
This concept came about from a discussion between myself and @iamjpotts.
Motivation
Handling of constraint errors is a pretty typical task in any database application.
As a simple example, a platform with user registration will likely want to ensure that usernames are unique so they can unambiguously refer to one user. This is pretty easy to enforce in SQL with a
UNIQUE
constraint, which usually has the benefit of automatically creating an index over the column as well:A naive implementation may perform a
SELECT
query before attempting to insert a user record, to check if the username is already taken:However, this takes two round-trips to the database, increasing response latency. The astute reader may have also noticed that the above code has a time-of-check to time-of-use (TOCTOU) bug because it does not perform the two queries in a transaction; another call to this function may race this one and create a user record with the same username, and so the insert may fail with an error which is likely not going to be surfaced to the user in a useful manner (e.g. a
500 Internal Server Error
instead of a400 Bad Request
) since it wasn't anticipated by the developer.You could fix the bug by performing the check and the insert in a transaction, but you can also eliminate a round-trip by just attempting the insert and using the unique constraint (which would have fired a violation error anyway in the above situation) to detect the collision. SQLx supports and encourages this use-case; however, this is not currently very ergonomic:
In real Launchbadge projects, we've used an extension trait to make this more ergonomic, similar to what's demonstrated in
realworld-axum-sqlx
: https://github.com/launchbadge/realworld-axum-sqlx/blob/main/src/http/error.rs#L176-L184However, this still leads to matching on magic strings most of the time, and can lead to forgetting to cover newly added constraints.
Design
I propose to add what I'm calling a "checked derive", which takes an enum as input and checks at compile-time that it exhaustively represents the constraints for a given table:
Like any derive the primary function would be to generate trait impls. This would generate a
FromStr
impl that converts the constraint name string to a given variant, as well as an impl of theConstraintEnum
trait for type safety:This impl, would then interact with an extension trait and intermediate error type:
This would then allow the above
INSERT
to be modeled like so:Compile-Time Checking
Similar to the query macros, if
DATABASE_URL
is set and offline-mode is not enabled, the derive can check the constraints on the given table at compile-time (e.g. usingINFORMATION_SCHEMA.TABLE_CONSTRAINTS
or the equivalent inpg_catalog
) and ensure that the enum exhaustively lists the constraints.This can help ensure, for example, that a new constraint being added is covered by any code it might affect:
If the above were executed as a new migration, for example, the
ConstraintEnum
derive would then emit a compiler error calling out the uncovered constraints and force the developer to add coverage for them.For convenience, I would maybe add a subcommand to
sqlx-cli
to automatically generate aConstraintEnum
definition for a given table, which can use the same logic to look up the constraints so the user doesn't have to manually. This would especially be useful if adapting SQLx to an existing database.I would make this functionality optional so that users who don't wish to use compile-time checks can still benefit from the API improvement. Perhaps as a
checked-derive
feature that extends thederive
feature added in #3113 (cc @saiintbrisson).Without
DATABASE_URL
set and with thechecked-derive
feature disabled, theConstraintEnum
would work as a simple dumb derive. This would also make it possible to still use the query macros, but only check constraint enums, e.g., in CI or a pre-commit hook, by not enabling the feature in normal development (though that's likely to be flaky given Cargo's feature unification; a dependency could enable the feature non-optionally which would force it on everywhere). It should also obeySQLX_OFFLINE
.Additionally, I would make it possible to explicitly mark constraints as ignored, either as a temporary workaround or to ignore constraints that are not expected to fire in normal operation:
Or to mark a constraint enum as a non-exhaustive subset (only check that the declared constraints actually exist):
Bikeshedding: this should be semantically distinct from
#[non_exhaustive]
and should probably not use the same identifier to avoid confusion. But what should it be instead?check_declared_only
?Alternatives
Automatically generated enum
Instead, SQLx could automatically generate the full enum:
However, this has a number of disadvantages:
and properly formatted query macro invocations should be perfectly readable and easily reasoned about on their own
without an IDE expanding the macro.
UNIQUE
constraint onusers.username
was automatically namedusers_username_key
, which doesn't tell you a lot at a glancethiserror::Error
to turn theConstraintEnum
into an error type in its own right.The text was updated successfully, but these errors were encountered: