TST: add data-driven test #7
base: main
Conversation
Great initiative! I think it could be helpful to migrate the existing Python test cases first, and then add more test cases gradually.

The accuracy to test for would ideally depend on the accuracy guarantee of the function implementation. However, this is usually not specified, because it is usually much harder (and sometimes infeasible) to find out this guarantee than to write an implementation that "looks good". When the accuracy is not specified by the implementation, I guess you could try an increasing sequence of tolerances and use the first one that passes all the tests.

As for input selection, one possibility is to generate (uniformly distributed) bit patterns for the inputs, then remove those inputs where the correct output is "trivial" (e.g. inf, nan, underflowing zeros). Do this until you collect, say, 1000 non-trivial inputs and keep those. I expect this method to be more of a "stress test", because it would cover a very wide input domain.
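A minimal sketch of the bit-pattern idea described above, in C++20 (the function name, seed, and exact rejection criteria are illustrative, not from this project):

```cpp
#include <bit>
#include <cmath>
#include <cstdint>
#include <random>
#include <vector>

// Draw uniformly distributed 64-bit patterns, reinterpret them as doubles,
// and keep only inputs whose output under f is "non-trivial"
// (finite and nonzero), until n such inputs are collected.
std::vector<double> sample_nontrivial_inputs(double (*f)(double), std::size_t n) {
    std::mt19937_64 gen(42);  // produces uniform 64-bit values
    std::vector<double> kept;
    while (kept.size() < n) {
        double x = std::bit_cast<double>(static_cast<std::uint64_t>(gen()));
        if (!std::isfinite(x)) continue;               // skip NaN/±inf inputs
        double y = f(x);
        if (!std::isfinite(y) || y == 0.0) continue;   // trivial output: inf, nan, underflow to zero
        kept.push_back(x);
    }
    return kept;
}
```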
I had essentially the same idea. I've started working on code for random floating point generation on a branch; you can see it here: https://github.com/steppi/xsf/blob/test_case_gen/xsf_testing/src/xsf_testing/sampling.py. It's slightly more sophisticated than rejection sampling over random bit patterns, but the general idea is the same. It's also capable of randomly sampling floating point numbers from a particular interval. The idea is indeed to cover a very wide input domain.

As for tolerances, rather than trying to decide a priori on tolerances for all functions, I think we should just see what the relative errors are, and store the state of the relative error on main for each tested platform. The tests could then make sure the relative error doesn't get worse by some factor.

There's lots of room for improvement. I took some time off recently due to an elbow injury making typing difficult, so I'm a little delayed getting this rolled out. When test case generation is working properly, we can move forward with the data-driven tests in this PR.
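A minimal sketch of the regression check described here, assuming a per-platform baseline value recorded on main (the function name and the factor are illustrative assumptions, not taken from the linked branch):

```cpp
#include <algorithm>
#include <stdexcept>
#include <vector>

// Fail only when the worst relative error observed on this platform has
// regressed past `factor` times the value previously recorded on main.
void check_no_rel_err_regression(const std::vector<double>& rel_errs,
                                 double recorded_on_main,
                                 double factor = 4.0) {
    if (rel_errs.empty()) throw std::runtime_error("no relative errors measured");
    double worst = *std::max_element(rel_errs.begin(), rel_errs.end());
    if (worst > factor * recorded_on_main) {
        throw std::runtime_error("relative error regressed on this platform");
    }
}
```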
Get well soon!
Thanks @lucascolley! It's already improved a lot, and I'm back to work this week!
POC/RFC for data-driven test.
Based on #4
- `exp1` test data generated using Wolfram: `RandomReal[{0.0, 1.0}, 1000] + RandomReal[{0, 720}, 1000]`
- Tests `xsf::exp1(double)`
- Reads the test data with ben-strasser/csv.h

Some issues:
- Special values (NaN, ±Inf, ...) are not covered.
- This PR uses Wolfram; for `expint` you could also use MPFR, mpmath, or FLINT (Arb) as the reference.
- The `double` type has a precision of no more than 18 significant decimal digits; perhaps storing the data in scientific notation with 18 to 20 significant digits would be sufficient.
- Perhaps just make sure that line coverage is > 95%.
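For illustration, a minimal sketch of such a data-driven test loop using ben-strasser/csv.h (the CSV file name, column names, include path for `xsf::exp1`, and tolerance are assumptions, not taken from this PR):

```cpp
#include <cassert>
#include <cmath>

#include "csv.h"          // ben-strasser/fast-cpp-csv-parser
#include "xsf/expint.h"   // assumed header providing xsf::exp1

int main() {
    // Assumed layout: two columns, the input and the reference value.
    io::CSVReader<2> in("exp1_data.csv");
    in.read_header(io::ignore_extra_column, "x", "expected");

    const double rtol = 1e-15;  // illustrative; see the tolerance discussion above
    double x, expected;
    while (in.read_row(x, expected)) {
        double actual = xsf::exp1(x);
        // Data generation avoids trivial outputs, so expected is nonzero here.
        double rel_err = std::abs(actual - expected) / std::abs(expected);
        assert(rel_err <= rtol);
    }
}
```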