Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[S308] mark_safe for HTML constants #16702

Closed
mfontana-elem opened this issue Mar 13, 2025 · 5 comments · Fixed by #16770
Closed

[S308] mark_safe for HTML constants #16702

mfontana-elem opened this issue Mar 13, 2025 · 5 comments · Fixed by #16770
Labels
rule Implementing or modifying a lint rule

Comments

@mfontana-elem
Copy link

mfontana-elem commented Mar 13, 2025

Question

I have a question regarding this linting rule (imported from flake8-bandit).

I think I understand the problems with using django.utils.safestring.mark_safe. But know that format_html is being deprecated for being used without arguments¹, I struggle to find what would be the correct way to handle something as simple as creating a filter that performs:

def myfilter(case, ...):
  if case == "hello":
    return "<i>Hello world!</i>"
  elif case == "bye":
    return "<b>Bye world!</b>"
  else:
    ...

(This is not an accurate filter API, but illustrates the purpose).

In this case I know the HTML is safe, but I don't understand how to create a safestring from it without getting the DeprecationWarningError from Django (format_html) or noqa'ing the S308 rule (suspicious-mark-safe-usage).

I fail to see how a string constant could introduce a XSS unless there is programmer negligence, in which case all bets are off. Having said this, I might be the negligent here! 😅

¹: And for a good reason. When coupled with f-strings interpolation happens before Django gets to escape inputs.

Version

ruff 0.8.0

@mfontana-elem mfontana-elem added the question Asking for support or clarification label Mar 13, 2025
@MichaReiser
Copy link
Member

MichaReiser commented Mar 13, 2025

Thanks for the nice write up!

I fail to see how a string constant could introduce a XSS unless there is programmer negligence, in which case all bets are off. Having said this, I might be the negligent here! 😅

Ruff's implementation (which I think matches bandit's) isn't very sophisticated. It only searches for calls to mark_safe without performing any analysis of where the content might be coming from.

Allowing calls to mark_safe where the argument is a string literal without format or f-string placeholders does sound reasonable to me.

For now, the spirit of the rule is that you use mark_safe with an explicit noqa comment, acknowledging that I've thought about this. But that's obviously not as good as not requiring a noqa comment because it then doesn't catch if someone adds placeholders in the future.

So what I think should be accepted is:

def myfilter(case, ...):
  if case == "hello":
    return mark_safe("<i>Hello world!</i>")
  elif case == "bye":
    return mark_safe("<b>Bye world!</b>")
  else:
    ...

@MichaReiser MichaReiser added rule Implementing or modifying a lint rule and removed question Asking for support or clarification labels Mar 13, 2025
@mfontana-elem
Copy link
Author

If it does indeed sound reasonable, I could try to have a look into submitting a PR in the upcoming weeks. I'm pretty much of a newbie when it comes to Rust, but don't care to spend the time off the clock.

@MichaReiser
Copy link
Member

MichaReiser commented Mar 13, 2025

It does sound reasonable to me, unless I'm overlooking something :)

You can find the code here

["django", "utils", "safestring" | "html", "mark_safe"] => SuspiciousMarkSafeUsage.into(),

and you can see an example that inspects the arguments here

if let Some(arguments) = arguments {
// If the `url` argument is a string literal or an f-string, allow `http` and `https` schemes.
if arguments.args.iter().all(|arg| !arg.is_starred_expr())
&& arguments
.keywords
.iter()
.all(|keyword| keyword.arg.is_some())
{
if arguments
.find_argument_value("url", 0)
.and_then(leading_chars)
.is_some_and(has_http_prefix)
{
return;
}
}
}

@Daverball
Copy link
Contributor

You can also take a look at S704 and potentially reuse some of its logic, since it's trying to accomplish the same thing for a different API.

fn is_unsafe_call(call: &ExprCall, semantic: &SemanticModel, settings: &LinterSettings) -> bool {
// technically this could be circumvented by using a keyword argument
// but without type-inference we can't really know which keyword argument
// corresponds to the first positional argument and either way it is
// unlikely that someone will actually use a keyword argument here
// TODO: Eventually we may want to allow dynamic values, as long as they
// have a __html__ attribute, since that is part of the API
matches!(&*call.arguments.args, [first] if !first.is_string_literal_expr() && !first.is_bytes_literal_expr() && !is_whitelisted_call(first, semantic, settings))
}
fn is_whitelisted_call(expr: &Expr, semantic: &SemanticModel, settings: &LinterSettings) -> bool {
let Expr::Call(ExprCall { func, .. }) = expr else {
return false;
};
let Some(qualified_name) = semantic.resolve_qualified_name(func) else {
return false;
};
settings
.flake8_bandit
.allowed_markup_calls
.iter()
.map(|target| QualifiedName::from_dotted_name(target))
.any(|target| qualified_name == target)
}

@mfontanaar
Copy link
Contributor

mfontanaar commented Mar 15, 2025

Thanks a lot for the pointers. With your help a new version of the rule worked on the first cargo run!

I opened #16770 with the proposed change.


Relatedly, it would be great if the @mark_safe decorator could be used for functions which can only return a string literal. For instance, following the example of the title, it would be nice if

@mark_safe
def my_filter(case, ...):
  if case == "hello":
    return "<i>Hello world!</i>"
  else:
    return "<b>Bye world!</b>"

However, this would increase complexity. Whether the symbol is being used as function or as a decorator should be detected and branched accordingly. For the second case, the decorated function should be inspected and all return statements analyzed for "string-literal'ness.

dcreager added a commit that referenced this issue Mar 17, 2025

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
* main: (25 commits)
  [syntax-errors] Parenthesized context managers before Python 3.9 (#16523)
  [ci]: Disable wheel testing on `ppc64le` (#16793)
  [red-knot] Stabilize `negation_reverses_subtype_order` property test (#16801)
  [red-knot] Emit error if int/float/complex/bytes/boolean literals appear in type expressions outside `typing.Literal[]` (#16765)
  [ci] Use `git diff` instead of `changed-files` GH action (#16796)
  [syntax-errors] Improve error message and range for pre-PEP-614 decorator syntax errors (#16581)
  [`flake8-bandit`] Allow raw strings in `suspicious-mark-safe-usage` (`S308`) #16702 (#16770)
  [`refurb`] Avoid panicking `unwrap` in `verbose-decimal-constructor` (`FURB157`) (#16777)
  [red-knot] Add `--color` CLI option (#16758)
  [internal]: Upgrade salsa (#16794)
  Pin dependencies (#16791)
  [internal]: Update indirect dependencies (#16792)
  [ci]: Fixup codspeed upgrade (#16790)
  Update Rust crate compact_str to 0.9.0 (#16785)
  Update Rust crate clap to v4.5.32 (#16778)
  Update Rust crate codspeed-criterion-compat to v2.9.1 (#16784)
  Update Rust crate quote to v1.0.40 (#16782)
  Update Rust crate ordermap to v0.5.6 (#16781)
  Update cloudflare/wrangler-action action to v3.14.1 (#16783)
  Update Rust crate env_logger to v0.11.7 (#16779)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rule Implementing or modifying a lint rule
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants