Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Downcast numbers passed to floats() internally in constrained_complex() #3574

Merged
merged 4 commits into from
May 10, 2023

Conversation

honno
Copy link
Member

@honno honno commented Feb 7, 2023

Should resolve #3573, where we used the builtin width (i.e. 64) results of cathetus(), which don't validate in floats() for any width but 64. Currently this PR downcasts the output of cathetus(), but I wonder if some of those downcasts would need clipping... EDIT: yeah somethings wrong

@Zac-HD
Copy link
Member

Zac-HD commented Feb 7, 2023

I think we have existing but accepting-broken-behavior tests for this - we shouldn't be accepting epsilon errors here for example:

https://github.com/HypothesisWorks/hypothesis/pull/3570/files#diff-eadee76ceb9206621a9e39a80c3397360e623cf1459a429b642a411a331bdc31R206-R224

Going to be a annoying, but I think this should include a review of our existing tests for st.complex_numbers() 😥

@felixdivo
Copy link
Contributor

we shouldn't be accepting epsilon errors here for example

Oh no, what have I done in 3b4918f (part of #3570) ... 😅

As an explanation, these were just copied from here, so that should also be investigated:

@given(st.data(), st.integers(-5, 5).map(lambda x: 10**x))
def test_max_magnitude_respected(data, mag):
c = data.draw(complex_numbers(max_magnitude=mag))
assert abs(c) <= mag * (1 + sys.float_info.epsilon)
@given(complex_numbers(max_magnitude=0))
def test_max_magnitude_zero(val):
assert val == 0
@given(st.data(), st.integers(-5, 5).map(lambda x: 10**x))
def test_min_magnitude_respected(data, mag):
c = data.draw(complex_numbers(min_magnitude=mag))
assert (
abs(c.real) >= mag
or abs(c.imag) >= mag
or abs(c) >= mag * (1 - sys.float_info.epsilon)
)

As a note: One has to use the builtin abs() function instead of numpy.abs()/numpy.absolute() since the builtin one does not complain about overflows, which can happen when a comlex number has two very large real and imaginaly components.

@honno
Copy link
Member Author

honno commented Mar 6, 2023

(I'm taking a staycation in 2 weeks time where I'll be able work on reviewing the complex-related tests)

@honno honno force-pushed the complex-magnitudes-widths branch from 4c9f1a9 to 72624d2 Compare March 24, 2023 19:56
@honno
Copy link
Member Author

honno commented Mar 24, 2023

So uhh that staycation came and went 🙈

My exploration on

assert abs(c) <= mag * (1 + sys.float_info.epsilon)

amounts to the use of math.sqrt() used in cathetus() (and thus complex_numbers()) apparently allows for errors on (assumedly specific) systems?

The magnitude constraints are respected up to a relative error
of (around) floating-point epsilon, due to implementation via
the system ``sqrt`` function.

This function relies on the system ``sqrt`` function and so, like it,
may be inaccurate up to a relative error of (around) floating-point
epsilon.

Looking at #1154 (adding {min,max}_magnitude), JJ Green ported their own C cathetus implementation to Python for use in Hypothesis. The epsilon thing didn't seem to be discussed, and a quick google got me no prior art here, but CI is failing with my removal of the "boundary-off-by-epsilon" testing for OSX! Maybe we can rethink the cathetus implementation, or see if we can coerce it somehow.

Anywho happy to explore it all more thoroughly (seem to have no time for coding outside of work now so that might take a while 🙃), thought I'd just brain dump now incase @Zac-HD had any insight/direction.

@Zac-HD
Copy link
Member

Zac-HD commented Mar 24, 2023

...I suspect we might need to just slap on a filter and call it a day 😥

That's imperfect but still clearly better than the status quo, and so long as we document that it might be inefficient when there are very few allowed values I'll be happy with it.

@felixdivo
Copy link
Contributor

...I suspect we might need to just slap on a filter and call it a day 😥

I guess that's pragmatic. It is honestly quite nieche, so it's probably okay if it is not 100% efficient (my outsider's view on this).

@Zac-HD
Copy link
Member

Zac-HD commented May 8, 2023

@honno any updates here? I'd be keen to get something in ASAP.

@honno
Copy link
Member Author

honno commented May 8, 2023

@honno any updates here? I'd be keen to get something in ASAP.

Thanks for the shout, I'll have a go tomorrow! Been busy with local election campaigning stuff which just ended 😅

Remove erroneous tests as complex_numbers({min/max}_magnitude=1.8, width=64)
should validate
@honno honno force-pushed the complex-magnitudes-widths branch from 72624d2 to 67a9da9 Compare May 9, 2023 09:46
@honno honno force-pushed the complex-magnitudes-widths branch from 67a9da9 to 97dc464 Compare May 9, 2023 10:17
@honno
Copy link
Member Author

honno commented May 9, 2023

Hmm, I couldn't figure out how to filter out examples which are outside the magnitude range due to epsilon errors. Say we slapped a filter on the resulting constrained_complex() strategy

# hypothesis/strategies/_internal.py:1746
+    if max_magnitude is None:
+        filter = lambda n: abs(n) >= min_magnitude
+    else:
+        filter = lambda n: min_magnitude <= abs(n) <= max_magnitude

-    return constrained_complex()
+    return constrained_complex().filter(filter)

This is inadequate as abs() gives us overflow errors for otherwise valid complex numbers (e.g. tests/nocover/test_complex_numbers.py::test_magnitude_validates[min_magnitude-128]). If we ignored overflow errors, we wouldn't be catching all potential epsilon errors. Also, abs() might suffer the same epsilon error issues as sqrt() for OSX—my research on the matter came up short though.

Any alternative approach to robustly identify epsilon errors? I couldn't seem to do filtering internally in constrained_complex() either, as the boundaries to test the individual zr and zi complex components have to be calculated by the error-prone cathetus() anywho.

If there's no resolution, I suggest we at least merge this PR to fix #3573 (also leaving comments about the allowance of epsilon errors), and write an issue about finding some kind of resolution to this epsilon error issue. FWIW I hadn't found any other problem with the complex tests, and alongside the introduced test_magnitude_validates they seem adequate to cover the #3573 fix.


Ugh tests/cover/test_complex_numbers.py::test_min_magnitude_respected failing with "Data generation is extremely slow" on the check-coverage job, can't reproduce locally either. I assume the float_of() usage in constrained_complex() is causing that...

@Zac-HD
Copy link
Member

Zac-HD commented May 9, 2023

We could try

else:
    def pred(n):
        try:
            return min_magnitude <= abs(n) <= max_magnitude
        except OverflowError:
            return True  # at least we improved on the status quo, see #3574

which is at least a weak improvement on the status quo? But let's try to get this PR in with just the width bugfix first.

Copy link
Member

@Zac-HD Zac-HD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we need to look at the too-slow issue, but otherwise this looks good to me!

hypothesis-python/RELEASE.rst Outdated Show resolved Hide resolved
Copy link
Member

@Zac-HD Zac-HD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏 thanks again @honno, you're a hero for array users everywhere!

@Zac-HD Zac-HD merged commit b8edd73 into HypothesisWorks:master May 10, 2023
50 checks passed
@honno honno deleted the complex-magnitudes-widths branch February 28, 2024 11:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

hypothesis.extra.numpy.from_dtype is difficult to use with complex dtypes
3 participants