Different test results using Clang when enabling Debug or not on target RVV #2140

Open
wychlw opened this issue May 7, 2024 · 2 comments

@wychlw

wychlw commented May 7, 2024

With clang@d70267fb and highway@e9a2799, configuring CMake with -DCMAKE_BUILD_TYPE=Debug gives different test results than configuring without it.

Test results without Debug option:

99% tests passed, 4 tests failed out of 684

Total Test time (real) =   7.99 sec

The following tests FAILED:
        249 - HwyDemoteTestGroup/HwyDemoteTest.TestAllDemoteToFloat/RVV  # GetParam() = 137438953472 (Failed)
        283 - HwyFloatTestGroup/HwyFloatTest.TestAllCeil/RVV  # GetParam() = 137438953472 (Failed)
        285 - HwyFloatTestGroup/HwyFloatTest.TestAllFloor/RVV  # GetParam() = 137438953472 (Failed)
        651 - SortTestGroup/SortTest.TestAllPartition/RVV  # GetParam() = 137438953472 (Failed)
Errors while running CTest

Test results with Debug option:

98% tests passed, 12 tests failed out of 684

Total Test time (real) =  25.01 sec

The following tests FAILED:
        249 - HwyDemoteTestGroup/HwyDemoteTest.TestAllDemoteToFloat/RVV  # GetParam() = 137438953472 (Failed)
        571 - MatVecTestGroup/MatVecTest.TestAllMatVecBF16/RVV  # GetParam() = 137438953472 (Failed)
        645 - SortTestGroup/SortTest.TestAllFloatInf/RVV  # GetParam() = 137438953472 (Failed)
        651 - SortTestGroup/SortTest.TestAllPartition/RVV  # GetParam() = 137438953472 (Failed)
        655 - SortTestGroup/SortTest.TestAllSort/RVV  # GetParam() = 137438953472 (Failed)
        656 - SortTestGroup/SortTest.TestAllSort/EMU128  # GetParam() = 2305843009213693952 (Failed)
        657 - SortTestGroup/SortTest.TestAllSelect/RVV  # GetParam() = 137438953472 (Failed)
        658 - SortTestGroup/SortTest.TestAllSelect/EMU128  # GetParam() = 2305843009213693952 (Failed)
        659 - SortTestGroup/SortTest.TestAllPartialSort/RVV  # GetParam() = 137438953472 (Failed)
        660 - SortTestGroup/SortTest.TestAllPartialSort/EMU128  # GetParam() = 2305843009213693952 (Failed)
        663 - BenchSortGroup/BenchSort.BenchAllSort/RVV  # GetParam() = 137438953472 (Failed)
        664 - BenchSortGroup/BenchSort.BenchAllSort/EMU128  # GetParam() = 2305843009213693952 (Failed)
Errors while running CTest

Digging into one specific test, MatVecTest.TestAllMatVecBF16/RVV, the failure is at these lines:

const double tolerance =
    exp * 20 * 1.0 /
    (1ULL << HWY_MIN(MantissaBits<MatT>(), MantissaBits<VecT>()));
if (!(exp - tolerance <= act && act <= exp + tolerance)) {

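With Debug, the actual value is -1.993652 (as is the expected value in the log below), which results in a negative tolerance, so the range check fails even though the two values match. Without Debug, all values are positive, so the test passes.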

i16/f32 6 x 8, with add: mismatch at 4 -1.993652 -1.993652; tol -0.311508
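To illustrate why equal values can still be reported as a mismatch, here is a minimal scalar sketch of the quoted check. The function names are hypothetical, and 7 mantissa bits is an assumption chosen only because it reproduces the logged tolerance (1.993652 * 20 / 128 ≈ 0.311508). The second function is one possible way to make the check sign-independent, not the fix adopted by the project.

```cpp
#include <cassert>
#include <cmath>

// Range check as quoted above: the tolerance inherits the sign of `exp`.
bool WithinSignedTolerance(double exp, double act, int mantissa_bits) {
  const double tolerance = exp * 20 * 1.0 / (1ULL << mantissa_bits);
  return exp - tolerance <= act && act <= exp + tolerance;
}

// Hypothetical variant (not from the Highway source): scale by |exp| so the
// interval [exp - tolerance, exp + tolerance] is never empty.
bool WithinAbsTolerance(double exp, double act, int mantissa_bits) {
  const double tolerance = std::fabs(exp) * 20 * 1.0 / (1ULL << mantissa_bits);
  return exp - tolerance <= act && act <= exp + tolerance;
}

// Usage: the values from the log, with an assumed 7 mantissa bits.
void DemoTolerance() {
  const int kBits = 7;  // assumption, see lead-in
  // Positive expected value: the interval is valid and the check passes.
  assert(WithinSignedTolerance(1.993652, 1.993652, kBits));
  // Negative expected value: lower bound exceeds upper bound, check fails.
  assert(!WithinSignedTolerance(-1.993652, -1.993652, kBits));
  // With |exp| the same inputs pass.
  assert(WithinAbsTolerance(-1.993652, -1.993652, kBits));
}
```

This only addresses the symptom in the comparison; it does not explain why negative values appear in the Debug build in the first place.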

In SortTest, num is 24 while Constants::SampleLanes<T>() is 32, so this assert fires:

Abort at vqsort-inl.h:1208: Assert num >= Constants::SampleLanes<T>()
@jan-wassenberg
Member

Thanks for reporting. We have also seen issues with rounding mode on QEMU - is that how you are running the tests, or is it on real HW?

Debug, the actual would be -1.993652

Interesting. GenerateMod does, or should, generate numbers 0..15. Can you help us understand where the negative numbers come from? Would be good to also add an assert that inputs and outputs are non-negative.
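A non-negativity check of the kind suggested here could look roughly like the sketch below. Both functions are hypothetical: `MakeModInputs` is only a stand-in that produces values cycling through 0..15 and is not the actual GenerateMod, and `FirstNegative` is an illustrative helper, not part of the test suite.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the input generator: values cycle through 0..15.
std::vector<float> MakeModInputs(size_t count) {
  std::vector<float> v(count);
  for (size_t i = 0; i < count; ++i) {
    v[i] = static_cast<float>(i % 16);
  }
  return v;
}

// Returns the index of the first element that is negative or NaN, or `count`
// if every element is >= 0. Reporting the index helps locate the bad lane.
size_t FirstNegative(const float* p, size_t count) {
  for (size_t i = 0; i < count; ++i) {
    if (!(p[i] >= 0.0f)) return i;
  }
  return count;
}

// Usage: assert that a buffer of inputs (or outputs) is non-negative.
void CheckNonNegative(const std::vector<float>& v) {
  assert(FirstNegative(v.data(), v.size()) == v.size());
}
```

Running the same check on the inputs and on the outputs would show whether the negative value is generated or computed.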

For SortTest, the comment there says: "We have at least 2 chunks (x 64 bytes) because the base case handles anything up to 8 vectors (x 16 bytes)." It seems possible that this is breaking with LMUL<1. This is only 'breaking' in debug mode because it's a DASSERT which is only active in debug builds. Can you print N and d.Pow2() at the failing DASSERT?
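As a rough illustration of why a violated precondition only aborts in Debug builds, here is a sketch of a debug-only assert. The `SKETCH_` macros and `HasEnoughSamples` are hypothetical and are not Highway's definitions; keying on `NDEBUG` is an assumption made for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical switch: treat builds without NDEBUG as debug builds.
#ifndef SKETCH_IS_DEBUG
#ifdef NDEBUG
#define SKETCH_IS_DEBUG 0
#else
#define SKETCH_IS_DEBUG 1
#endif
#endif

#if SKETCH_IS_DEBUG
// Debug: evaluate the condition and abort with a message if it is false.
#define SKETCH_DASSERT(cond)                       \
  do {                                             \
    if (!(cond)) {                                 \
      std::fprintf(stderr, "Assert %s\n", #cond);  \
      std::abort();                                \
    }                                              \
  } while (0)
#else
// Release: the check compiles to nothing, so a violation goes unnoticed.
#define SKETCH_DASSERT(cond) \
  do {                       \
  } while (0)
#endif

// The precondition from the failing assert, as a plain predicate.
bool HasEnoughSamples(size_t num, size_t sample_lanes) {
  return num >= sample_lanes;
}
```

With the reported values, `HasEnoughSamples(24, 32)` is false in both build types; only the debug build turns that into an abort.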

@johnplatts
Contributor

There were bugs in RVV F64->F32 and F32->F16 DemoteTo, which are fixed in pull request #2164.

RVV Ceil and Floor have also been reimplemented in pull request #2164 to avoid changing the floating point rounding mode using inline assembly, which fixes issues with Ceil and Floor on RVV on Clang 16 and later.
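For readers unfamiliar with the idea, Ceil and Floor can be computed without touching the rounding mode by truncating toward zero and then adjusting. The scalar sketch below shows one well-known way to do that; it is not the code from the pull request, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Ceil via truncation: no change to the floating-point rounding mode.
float CeilNoRoundingMode(float v) {
  // Magnitudes >= 2^23 are already integers; NaN also takes this branch
  // because the comparison is false for NaN.
  if (!(std::fabs(v) < 8388608.0f)) return v;
  const int32_t truncated = static_cast<int32_t>(v);  // rounds toward zero
  const float f = static_cast<float>(truncated);
  // Truncation rounds positive non-integers down, so step up by one.
  const float result = (f < v) ? f + 1.0f : f;
  // Keep the sign of the input so that e.g. -0.5 yields -0.0.
  return std::copysign(result, v);
}

// Floor via truncation, mirroring the logic above.
float FloorNoRoundingMode(float v) {
  if (!(std::fabs(v) < 8388608.0f)) return v;
  const int32_t truncated = static_cast<int32_t>(v);
  const float f = static_cast<float>(truncated);
  // Truncation rounds negative non-integers up, so step down by one.
  const float result = (f > v) ? f - 1.0f : f;
  return std::copysign(result, v);
}
```

A vector implementation would apply the same steps per lane with a mask in place of the branches.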
