Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incompatible Sequence types generated after register_type_strategy() #3767

Closed
h4l opened this issue Oct 11, 2023 · 8 comments · Fixed by #3768
Closed

Incompatible Sequence types generated after register_type_strategy() #3767

h4l opened this issue Oct 11, 2023 · 8 comments · Fixed by #3768
Labels
enhancement it's not broken, but we want it to be better

Comments

@h4l
Copy link
Contributor

h4l commented Oct 11, 2023

A from_type(Sequence[XXX]) strategy is able to generate instances of classes unrelated to XXX that are Sequence subtypes registered with register_type_strategy().

Probably best explained with an example. Both of these tests fail:

from collections import namedtuple
from collections.abc import Sequence

from hypothesis import given
from hypothesis import strategies as st


class Thing(namedtuple("Thing", [])):
    pass


class CustomSequence(Sequence):
    def __getitem__(self, i) -> Never:
        raise IndexError()

    def __len__(self) -> int:
        return 0


class Other:
    pass


st.register_type_strategy(Thing, st.builds(Thing))
st.register_type_strategy(CustomSequence, st.builds(CustomSequence))


@given(st.from_type(Sequence[Other]))
def test_thing_is_not_seq_of_other(others: Sequence[Other]) -> None:
    assert not isinstance(others, Thing)


@given(st.from_type(Sequence[Other]))
def test_customseq_is_not_seq_of_other(others: Sequence[Other]) -> None:
    assert not isinstance(others, CustomSequence)

I ran into this after upgrading from hypothesis 6.68.2 to 6.87.3. In my case I had some strategies registered with namedtuple types which started getting generated for unrelated strategies. It also happens for Sequence subclasses, but only if they are not specifying a generic type.

It seems to be related to #2951.

try_issubclass(k, thing) thinks the Thing namedtuple is a subclass of Sequence[Other].

I can see this is a rather hairy problem looking at the code!

@tybug
Copy link
Member

tybug commented Oct 13, 2023

I'd argue it's correct for CustomSequence to be provided to Sequence[Other]. If you are asking for all sequences in concrete type X, and you have registered a generic sequence type MySequence, then an instantiation of MySequence[X] satisfies the request. As I understand it, this is effectively the mechanism by which lists in concrete type X are provided to Sequence[X]. Maybe I'm missing something here though!

namedtuples are a bit weird, because they have the same typing as the above scenario, but aren't treated the same by users. As in, nobody thinks of a namedtuple as a generic sequence. I PR'd a special handling of namedtuple here #3768. Though in retrospect, I should have waited for opinions from maintainers on this issue first to make sure it was wanted - apologies!

@h4l
Copy link
Contributor Author

h4l commented Oct 13, 2023

Thanks for making a PR for this! I agree, considering just the type signatures, these are type-compatible. Making namedtuple types not eligible for Sequence requests seems reasonable.

Another case is using a type-annotated namedtuple like this:

from typing import NamedTuple
class Thing(NamedTuple):
    foo: int

In the current released hypothesis, this is eligible for st.from_type(Sequence[str]).

I wonder if register_type_strategy() should provide a way to limit which base types a strategy is compatible with? So that a strategy for Thing could be registered either just for itself, or for a specific Protocol type it implements, but not Sequence.

@h4l
Copy link
Contributor Author

h4l commented Oct 13, 2023

Ah, I was a bit rushed earlier and forgot about this. When I was working around this issue to fix my tests, one thing I tried was registering a function to conditionally generate the strategy only when required, e.g:

class Thing(NamedTuple):
    foo: int


def get_thing_strategy(target: type[Any]) -> st.SearchStrategy[Thing]:
    if target == Thing:
        return st.builds(Thing)
    # Opt out of proving a strategy for this target
    return st.nothing()


st.register_type_strategy(Thing, get_thing_strategy)

But this is treated by Hypothesis as if no strategy exists for the type of the target argument here (which would be Sequence[Other] in my example), so I can't opt-out like this. This happens even though other strategies are available for Sequence[Other]. This happens here:

https://github.com/HypothesisWorks/hypothesis/blob/master/hypothesis-python/src/hypothesis/strategies/_internal/types.py#L451

    empty = ", ".join(repr(s) for s in strategies if s.is_empty)
    if empty or not strategies:
        raise ResolutionFailed(
            f"Could not resolve {empty or thing} to a strategy; "
            "consider using register_type_strategy"
        )

Is this a bug? Shouldn't this only fail if all strategies are empty, not if any is empty?

If this were possible, it would basically allow what I was wondering in my comment above, about limiting the registration of strategies to certain base types.

@h4l
Copy link
Contributor Author

h4l commented Oct 13, 2023

This works as a hacky workaround to conditionally register a strategy:

def get_thing_strategy(target: type[Any]) -> st.SearchStrategy[Thing]:
    if target == Thing:
        return st.builds(Thing)
    return st.none().filter(lambda x: False)

@Zac-HD
Copy link
Member

Zac-HD commented Oct 13, 2023

I think we could support fn(type_: type[T]) -> st.SearchStrategy[T] | types.NotImplementedType, using the NotImplemented constant to signal that this type is not resolvable with that callback. It'd require some small changes where we construct the strategies, but seems worth doing anyway.

Pragmatically, dropping subtypes of tuple does seem like a reasonable thing to do here; I'll comment in more detail over on the PR. Thanks for that @tybug! I'm basically always happy to accept PRs, so long as contributors don't mind a risk of them being closed if it turns out that I don't want to merge it (rare, and possible though rarer regardless).

I'd argue it's correct for CustomSequence to be provided to Sequence[Other]. [...like] list[X] are provided for Sequence[X]. Maybe I'm missing something here though!

The catch is that not all subclasses of Sequence are generic, or compatible with X! For example str is a Sequence[X] only for X = str, and doesn't take a type parameter, while many other collection types are generic only in hashable element types.

I'm not sure what we should do about this, but it seems likely that it's currently mishandled in at least some situations 😥. One important mitigation is that we resolve abstract classes to their nearest concrete subclasses, not any subclass - e.g. providing instances of Dog for abstract Animal, not instances of Poodle or Labrador or whatever. Probably we should continue to ignore this until someone reports a practical problem!

@h4l
Copy link
Contributor Author

h4l commented Oct 14, 2023

Cool, I like the sound of that. NotImplemented doesn't get much love, so it would be nice to put it to use! I could do a PR for this.

@Zac-HD Zac-HD added the enhancement it's not broken, but we want it to be better label Oct 14, 2023
@Zac-HD
Copy link
Member

Zac-HD commented Oct 14, 2023

That would be lovely - thanks Hal!

@h4l
Copy link
Contributor Author

h4l commented Oct 16, 2023

Great to see this fixed so quickly! Thanks for your fix @tybug (and on the test performance work), and thanks for reviewing @Zac-HD, I really appreciate it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement it's not broken, but we want it to be better
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants