Check for `annotated-types` constraints in `st.from_type(Annotated[T, ...])` #3356

Zac-HD · 2022-05-22T01:52:37Z

At PyCon 2022 I helped start https://github.com/annotated-types/annotated-types - so it would be nice to support these new constraints once we finalize the upstream release! The key idea is that we can iterate through the constraints, and derive callables which we then use as filters on the strategy that we resolved for the annotated type.

Filter-rewriting will then make numeric bounds as efficient as is practically possible at runtime (though see #3134 / b4c4161 / #3479 for strings). For example:

>>> from_type(Annotated[int, Gt(10)])
# integers().filter(partial(gt, 10))
integers(min_value=11)

>>> from_type(Annotated[str, Predicate(str.isdigit)])
text().filter(str.isdigit)
# from_regex(r"\d+", fullmatch=True)

@adriangb @samuelcolvin FYI 😁

The text was updated successfully, but these errors were encountered:

Zac-HD · 2022-10-30T21:59:31Z

A more detailed breakdown of the tasks involved:

Gt, Ge, Lt, Le:
- translate into filter predicates with partial and the operator module
- add filter-rewriting support for st.fractions() and st.decimals()
MultipleOf:
- annoyingly difficult to get right, especially for floats! Also ambiguous: modulo-based or division-based semantics?
- filtering is absurdly inefficient; probably needs special handling plumbed right through; at minimum apply this last
- can we get away with warning people off for now?
Timezone:
- resolve this to a timezones= strategy for st.times() or st.datetimes()
- register a callback fn for time and datetime types, which accepts this arg, and plumb it through
- explicit error if we see this constraint but the resolver func doesn't accept this arg
MinLen, MaxLen: as for timezones, but fall back to a warning and filter predicate rather than an error
Predicate: use as predicate to .filter() in the obvious way

Other constraints: unpack GroupedMetadata into the component parts, ignore unknown constraint types and log a warning at debug verbosity. The new code will replace logic invoked from:

hypothesis/hypothesis-python/src/hypothesis/strategies/_internal/types.py

Lines 369 to 377 in 47b35ce

    
           if is_annotated_type(thing):  # pragma: no cover 
        
               # This requires Python 3.9+ or the typing_extensions package 
        
               annotated_strategy = find_annotated_strategy(thing) 
        
               if annotated_strategy is not None: 
        
                   return annotated_strategy 
        
               args = thing.__args__ 
        
               assert args, "it's impossible to make an annotated type with no args" 
        
               annotated_type = args[0] 
        
               return st.from_type(annotated_type)

Viicos · 2023-10-27T07:00:27Z

I've started an implementation of this, and I have a couple questions:

Gt, Ge, Lt, Le:

translate into filter predicates with partial and the operator module

add filter-rewriting support for st.fractions() and st.decimals()

I'm don't have much knowledge on how filter rewriting works, but is there a way to check that integers().filter(partial(gt, 10)) is correctly rewritten by Hypothesis to integers(min_value=11)? How about integers().filter(partial(gt, 10)).filter(partial(lt, 20))?
Edit: nevermind, seems to work when accessing the wrapped strategy of the lazy strategy.

How should we handle constraints that are incompatible with each other/with the annotated type? e.g. Annotated[int, Gt(1), Timezone(tz.utc)], Annotated[UUID, Gt(1)]? Should we ignore and let a potential error be raised at some point when the resulting strategy is used?

Zac-HD · 2023-10-27T19:17:21Z

How to check that filter-rewriting works: trust that https://github.com/HypothesisWorks/hypothesis/blob/master/hypothesis-python/tests/cover/test_filter_rewriting.py would surface any problems 😁

How should we handle constraints that are incompatible with each other/with the annotated type? e.g. Annotated[int, Gt(1), Lt(1)], Annotated[str, Gt(1)]? Should we ignore and let a potential error be raised at some point when the resulting strategy is used?

Hmm, this is somewhat tricky - the decision about whether to return st.nothing() or raise an InvalidArgument exception is made on a case-by-case basis. We prefer to raise an error when we think there's something the user could reasonably do to avoid it; or use nothing() when that would be impractical.

I think in this case I'd just translate to filters, as the easy-to-implement approach. That means:

Annotated[str, Gt(1)] will raise a TypeError due to the comparison when an example is drawn. It's somewhat unfortunate that the error message won't mention that this came from an annotated type, but I think not worth the code required to improve on it.
Annotated[int, Gt(1), Lt(1)] will return nothing(), because that's equivalent to a strategy where all possible values are filtered out. In this case, I think we can add a few lines of code like if s.is_empty: raise InvalidArgument(f"There are no valid values for type {t!r}")

Viicos · 2023-10-29T21:06:28Z

Ok I see, I'll soon open a draft PR so that progress can be tracked.

register a callback fn for time and datetime types, which accepts this arg, and plumb it through

By register a callback you mean using this function?

hypothesis/hypothesis-python/src/hypothesis/strategies/_internal/types.py

Lines 703 to 706 in 058420a

    
           def register(type_, fallback=None, *, module=typing): 
        
               if isinstance(type_, str): 
        
                   # Use the name of generic types which are not available on all 
        
                   # versions, and the function just won't be added to the registry

Ideally I will try to build strategies in a "smart" way, and avoid using filters when no filter rewriting is available (e.g. text(min_size=20) instead of text().filter(lambda el: len(el) >= 20); applies to list/bytes/probably others). Even Gt/Ge/Lt/Le can apply to types where filter rewriting could be implemented but currently isn't (e.g. datetime), so I need to build the corresponding strategies in a smart way.

What's challenging is that I could manually build the strategies with the correct kwargs (e.g. min_size=..., max_size=...) for each known type, but that would be rewriting what from_type is currently doing with the extra logic from constraints.

Zac-HD · 2023-11-01T17:16:25Z

Let's start simple, and only include filters in the initial version. If sometimes that means users get Unsatisfiable rather than data which violates their annotation constraints, I think that's still an improvement!

After we've shipped the logically correct version, we can look at improving performance in common cases.

Zac-HD added the enhancement it's not broken, but we want it to be better label May 22, 2022

Zac-HD mentioned this issue Jul 14, 2022

🏃 Sprints meta-issue #3402

Closed

20 tasks

Zac-HD mentioned this issue Oct 30, 2022

V2: hypothesis plugin rewrite pydantic/pydantic#4682

Open

Zac-HD mentioned this issue May 25, 2023

Way to communicate more information between libraries annotated-types/annotated-types#37

Open

Zac-HD mentioned this issue Sep 13, 2023

Can hypothesis only generate data that pass validation of Pydantic #3739

Closed

guacs mentioned this issue Sep 26, 2023

Enhancement: support creation of Hypothesis strategies litestar-org/polyfactory#387

Open

Viicos mentioned this issue Nov 1, 2023

Support annotated_types in st.from_type #3780

Merged

Zac-HD closed this as completed in #3780 Nov 16, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Check for `annotated-types` constraints in `st.from_type(Annotated[T, ...])` #3356

Check for `annotated-types` constraints in `st.from_type(Annotated[T, ...])` #3356

Zac-HD commented May 22, 2022 •

edited

Zac-HD commented Oct 30, 2022

Viicos commented Oct 27, 2023 •

edited

Zac-HD commented Oct 27, 2023

Viicos commented Oct 29, 2023

Zac-HD commented Nov 1, 2023 •

edited

Check for annotated-types constraints in st.from_type(Annotated[T, ...]) #3356

Check for annotated-types constraints in st.from_type(Annotated[T, ...]) #3356

Comments

Zac-HD commented May 22, 2022 • edited

Zac-HD commented Oct 30, 2022

Viicos commented Oct 27, 2023 • edited

Zac-HD commented Oct 27, 2023

Viicos commented Oct 29, 2023

Zac-HD commented Nov 1, 2023 • edited

Check for `annotated-types` constraints in `st.from_type(Annotated[T, ...])` #3356

Check for `annotated-types` constraints in `st.from_type(Annotated[T, ...])` #3356

Zac-HD commented May 22, 2022 •

edited

Viicos commented Oct 27, 2023 •

edited

Zac-HD commented Nov 1, 2023 •

edited