Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Options representation #7591

Merged
merged 1 commit into from Sep 22, 2023
Merged

Refactor Options representation #7591

merged 1 commit into from Sep 22, 2023

Conversation

MichaReiser
Copy link
Member

@MichaReiser MichaReiser commented Sep 22, 2023

Summary

This PR refactors the Options representation.

Current Reprsentation

Our ConfigureOptions macro currently generates the following code (truncated):

impl Flake8BanditOptions {
    pub const fn metadata() -> crate::options_base::OptionGroup {
        const OPTIONS: [(&'static str, crate::options_base::OptionEntry); 3usize] = [
			("hardcoded-tmp-directory", crate::options_base::OptionEntry::Field(crate::options_base::OptionField { doc: &"A list of directories to consider temporary.", default: &"[\"/tmp\", \"/var/tmp\", \"/dev/shm\"]", value_type: &"list[str]", example: &"hardcoded-tmp-directory = [\"/foo/bar\"]" })), 
			("hardcoded-tmp-directory-extend", crate::options_base::OptionEntry::Field(crate::options_base::OptionField { doc: &"A list of directories to consider temporary, in addition to those\nspecified by `hardcoded-tmp-directory`.", default: &"[]", value_type: &"list[str]", example: &"extend-hardcoded-tmp-directory = [\"/foo/bar\"]" })), 
			("check-typed-exception", crate::options_base::OptionEntry::Field(crate::options_base::OptionField { doc: &"Whether to disallow `try`-`except`-`pass` (`S110`) for specific\nexception types. By default, `try`-`except`-`pass` is only\ndisallowed for `Exception` and `BaseException`.", default: &"false", value_type: &"bool", example: &"check-typed-exception = true" }))
		];
        crate::options_base::OptionGroup::new(&OPTIONS)
    }
}

You can see how it generates a static array and metadata returns an OptionGroup(&slice) referring to the slice of the static array.

Options can be nested (option-groups). The macro calls into the nested type's metadata function to get the OptionGroup and adds a OptionEntry::Group() to its static array

const OPTIONS: [(&'static str, crate::options_base::OptionEntry); 3usize] = [
			("hardcoded-tmp-directory", crate::options_base::OptionEntry::Field(crate::options_base::OptionField { doc: &"A list of directories to consider temporary.", default: &"[\"/tmp\", \"/var/tmp\", \"/dev/shm\"]", value_type: &"list[str]", example: &"hardcoded-tmp-directory = [\"/foo/bar\"]" })), 
			...
			("example-group", crate::options_base::OptionEntry::Group(ExampleGroup::metadata()))
		];

This works well enough and the API is simple. However, it requires that the macro statically determines all options to be able to set the length of the array.

New Requirements

I'm about to introduce a new lint section to the configuration without removing the top-level options until the new configuration is stabilized. The simplest but very verbose approach is duplicating the lint options: Once at the top level and once under the lint section. But that's rather verbose. Ideally, I want to use serde's flatten feature and write:

struct Options {
	...

    // Lint options
    /// Options for the Ruff linter
    #[option_group]
    pub lint: Option<LintOptions>,

    /// Doc
    #[serde(flatten)]
    pub lint_legacy: LintOptions,
}

While serde and schemars support this nicely, our options macro does not, and adding it is impossible with the current representation. The problem is that macros operate on a tokens stream only. They can't resolve types (at least to my knowledge). But that's what's necessary for Options to statically determine the length of its OPTIONS array: It needs to expand the LintOptions options into Options.

New Representation

The new representation is inspired by serde and tracing_subscribs's Visit trait.

The basic idea is to implement the metadata abstraction as a visitor pass (to avoid allocations).

/// Returns metadata for its options.
pub trait OptionsMetadata {
    /// Visits the options metadata of this object by calling `visit` for each option.
    fn record(visit: &mut dyn Visit);

    /// Returns the extracted metadata.
    fn metadata() -> OptionSet
    where
        Self: Sized + 'static,
    {
        OptionSet::of::<Self>()
    }
}

struct Root;

impl OptionsMetadata for Root {
    fn record(visit: &mut dyn Visit) {
        visit.record_field("ignore-git-ignore", OptionField {
            doc: "Whether Ruff should respect the gitignore file",
            default: "false",
            value_type: "bool",
            example: "",
        });

        visit.record_set("format", Nested::metadata());
    }
}

struct Nested;

impl OptionsMetadata for Nested {
    fn record(visit: &mut dyn Visit) {
        visit.record_field("hard-tabs", OptionField {
            doc: "Use hard tabs for indentation and spaces for alignment.",
            default: "false",
            value_type: "bool",
            example: "",
        });
    }
}
  • Each struct that has options implements OptionsMetadata
  • OptionsMetadata::record calls record_field or record_set for each option

where Visit is defined as

/// Visits [`OptionsMetadata`].
///
/// An instance of [`Visit`] represents the logic for inspecting an object's options metadata.
pub trait Visit {
    /// Visits an [`OptionField`] value named `name`.
    fn record_field(&mut self, name: &str, field: OptionField);

    /// Visits an [`OptionSet`] value named `name`.
    fn record_set(&mut self, name: &str, group: OptionSet);
}

This new representation has a few advantages:

  • The OptionsMetadata trait can be implemented manually for cases where the derive macro isn't flexible enough
  • The OptionsMetadata provides all relevant abstractions so that the macro is optional. It means you can now understand the system without understanding the macro.
  • Implementing flatten becomes a simple Nested::record(visit) call (instead of calling visit.record_set(name, Nested::metadata()). Meaning we traverse into the sub-set without telling the visitor that it now enters a new set.

The main downside of the new API is that writing visitors is awkward. This is evident by the longer implementations for has, get, and Display. I think that's a fair tradeoff, considering that these methods are implemented once and that a similar API is provided to consumers.

Test Plan

  • Verified that cargo dev generate-options produces the exact same output
  • Verfieid that the config command continues working

@MichaReiser
Copy link
Member Author

MichaReiser commented Sep 22, 2023

@codspeed-hq
Copy link

codspeed-hq bot commented Sep 22, 2023

CodSpeed Performance Report

Merging #7591 will not alter performance

Comparing refactor-options (f63df7d) with main (9d16e46)

Summary

✅ 25 untouched benchmarks

fields.sort_unstable_by(|(name, _), (name2, _)| name.cmp(name2));
sets.sort_unstable_by(|(name, _), (name2, _)| name.cmp(name2));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how I feel about sorting options. It disallows grouping options that are semantically related (all resolver settings)

Comment on lines 2411 to 2415
impl OptionsMetadata for FormatOrOutputFormat {
fn record(visit: &mut dyn Visit) {
FormatOptions::record(visit)
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also feels less hacky now since we have a trait.

@MichaReiser MichaReiser marked this pull request as ready for review September 22, 2023 08:19
@MichaReiser MichaReiser added the internal An internal refactor or improvement label Sep 22, 2023
}
impl Visit for CollectOptionsVisitor {
fn record_set(&mut self, name: &str, group: OptionSet) {
self.groups.push((name.to_owned(), group));
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old implementation only supports one level of nesting. This would have become a problem with lint.pydocstyle.option where we have two level nesting.

@github-actions
Copy link
Contributor

github-actions bot commented Sep 22, 2023

PR Check Results

Ecosystem

✅ ecosystem check detected no changes.

}

fn generate_set(output: &mut String, set: &Set) {
writeln!(output, "### {title}\n", title = set.title()).unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

writeln! with trailing \n looks strange

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree but everything else feels verbose just to get two newlines

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I typically do two writeln for this but don't feel strongly.)

@MichaReiser MichaReiser merged commit 2ecf597 into main Sep 22, 2023
16 checks passed
@MichaReiser MichaReiser deleted the refactor-options branch September 22, 2023 16:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal An internal refactor or improvement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants