TST: add data-driven test #7
base: main
Conversation
Great initiative! I think it could be helpful to migrate the existing Python test cases first, and then add more test cases gradually.

The accuracy to test for would ideally depend on the accuracy guarantee of the function implementation. However, this is usually not specified, because it is usually much harder (and sometimes infeasible) to find out this guarantee than to write an implementation that "looks good". When the accuracy is not specified by the implementation, I guess you could try an increasing sequence of tolerances and use the first one that passes all the tests.

As for input selection, one possibility is to generate (uniformly distributed) bit patterns for the inputs, then remove those inputs where the correct output is "trivial" (e.g. inf, nan, underflowing zeros). Do this until you collect, say, 1000 non-trivial inputs and keep those. I expect this method to be more of a "stress test", because it would cover a very wide input domain.
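A minimal sketch of the bit-pattern idea described above, in C++20 (the function name, seed, and exact rejection criteria are illustrative, not from this project):

```cpp
#include <bit>
#include <cmath>
#include <cstdint>
#include <random>
#include <vector>

// Draw uniformly distributed 64-bit patterns, reinterpret them as doubles,
// and keep only inputs whose output under f is "non-trivial"
// (finite and nonzero), until n such inputs are collected.
std::vector<double> sample_nontrivial_inputs(double (*f)(double), std::size_t n) {
    std::mt19937_64 gen(42);  // produces uniform 64-bit values
    std::vector<double> kept;
    while (kept.size() < n) {
        double x = std::bit_cast<double>(static_cast<std::uint64_t>(gen()));
        if (!std::isfinite(x)) continue;               // skip NaN/±inf inputs
        double y = f(x);
        if (!std::isfinite(y) || y == 0.0) continue;   // trivial output: inf, nan, underflow to zero
        kept.push_back(x);
    }
    return kept;
}
```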
I had essentially the same idea. I've started working on code for random floating point generation on a branch; you can see it here: https://github.com/steppi/xsf/blob/test_case_gen/xsf_testing/src/xsf_testing/sampling.py. It's slightly more sophisticated than rejection sampling over random bit patterns, but the general idea is the same. It's also capable of randomly sampling floating point numbers from a particular interval. The idea is indeed to cover a very wide input domain.

As for tolerances, rather than trying to decide a priori on tolerances for all functions, I think we should just see what the relative errors are, and store the state of the relative error on main for each tested platform. The tests could then make sure the relative error doesn't get worse by some factor.

There's lots of room for improvement. I took some time off recently due to an elbow injury making typing difficult, so I'm a little delayed getting this rolled out. When test case generation is working properly, we can move forward with the data-driven tests in this PR.
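A minimal sketch of the regression check described here, assuming a per-platform baseline value recorded on main (the function name and the factor are illustrative assumptions, not taken from the linked branch):

```cpp
#include <algorithm>
#include <stdexcept>
#include <vector>

// Fail only when the worst relative error observed on this platform has
// regressed past `factor` times the value previously recorded on main.
void check_no_rel_err_regression(const std::vector<double>& rel_errs,
                                 double recorded_on_main,
                                 double factor = 4.0) {
    if (rel_errs.empty()) throw std::runtime_error("no relative errors measured");
    double worst = *std::max_element(rel_errs.begin(), rel_errs.end());
    if (worst > factor * recorded_on_main) {
        throw std::runtime_error("relative error regressed on this platform");
    }
}
```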
Get well soon!
Thanks @lucascolley! It's already improved a lot, and I'm back to work this week!
POC/RFC for data-driven test.
Based on #4
- `exp1` test data generated using Wolfram: `RandomReal[{0.0, 1.0}, 1000] + RandomReal[{0, 720}, 1000]`
- Tests `xsf::exp1(double)`
- Reads the test data with ben-strasser/csv.h

Some issues:
- Special values (NaN, ±Inf, ...) are not covered.
- This PR uses Wolfram; for `expint` you could also use MPFR, mpmath, or FLINT (Arb) as the reference.
- The `double` type has a precision of no more than 18 significant decimal digits; perhaps storing the data in scientific notation with 18 to 20 significant digits would be sufficient.
- Perhaps just make sure that line coverage is > 95%.
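For illustration, a minimal sketch of such a data-driven test loop using ben-strasser/csv.h (the CSV file name, column names, include path for `xsf::exp1`, and tolerance are assumptions, not taken from this PR):

```cpp
#include <cassert>
#include <cmath>

#include "csv.h"          // ben-strasser/fast-cpp-csv-parser
#include "xsf/expint.h"   // assumed header providing xsf::exp1

int main() {
    // Assumed layout: two columns, the input and the reference value.
    io::CSVReader<2> in("exp1_data.csv");
    in.read_header(io::ignore_extra_column, "x", "expected");

    const double rtol = 1e-15;  // illustrative; see the tolerance discussion above
    double x, expected;
    while (in.read_row(x, expected)) {
        double actual = xsf::exp1(x);
        // Data generation avoids trivial outputs, so expected is nonzero here.
        double rel_err = std::abs(actual - expected) / std::abs(expected);
        assert(rel_err <= rtol);
    }
}
```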