Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal usage of f64 in some f32 functions #118

Closed
korken89 opened this issue Jul 16, 2018 · 21 comments
Closed

Internal usage of f64 in some f32 functions #118

korken89 opened this issue Jul 16, 2018 · 21 comments

Comments

@korken89
Copy link

Hi, I had a look over the library and something struck me as odd.

For example in sinf and k_sinf the input arguments are being cast up into f64, which will incur a significant run-time cost on no_std systems which only has an single precision floating point unit.
Moreover, the calculation being done in, for example here:
https://github.com/japaric/libm/blob/3932e2ea8e7e8f782cea6b243c3032d415b8afeb/src/math/sinf.rs#L62
The f64 is only used for sums, which does not need the extra precision given by f64.

This looks like a copy/paste bug to me, but I would like to hear your reasoning about this.

@korken89
Copy link
Author

I saw that the kernel in k_sinf utilizes f64 and that high powers (x^5) are being used as well.
Is this what causes the use of f64? Has the precision loss vs f32 been benchmarked?

@korken89
Copy link
Author

korken89 commented Jul 16, 2018

I did a quick benchmark of the kernel over the [0, PI/4] interval (100k steps) utilizing this test code:
https://gist.github.com/korken89/fd6e2fa5f4ededfdd37cd5e846c7819d

The maximum error is 0.000000059604645.
A quick test seems to indicate that the use of f64 is not needed.

@korken89
Copy link
Author

korken89 commented Jul 16, 2018

I extended it for cos as well and added relative error (10M steps): https://gist.github.com/korken89/6a219e0cf87be244009086072816b524

Max error sin = 0.000000059604645
Max error cos = 0.00000011920929
Max relative error sin = 0.00000011920926
Max relative error cos = 0.000000168585

@burrbull
Copy link
Contributor

These implementations from musl library.
We could add other with cargo feature to select accuracy or performance.

@korken89
Copy link
Author

What I am trying to point out is that there is no difference when the max error is 0.000000168585, it is at the edge of precision for f32 with approx. 7 decimals of precision.
For the developer that wants precision, they should be using f64 to start with. :)

@japaric
Copy link
Member

japaric commented Jul 16, 2018

copy pasting some of my IRC comments here for posteriority / visibility:

  • the crate (this crate) is a 1:1 port of MUSL libm
  • yes some f32 functions (internally) use f64 for some reason
  • we can change the implementations to only use f32 but (a) the original code must be compatible with MIT / Apache 2.0 and (b) it needs to be thoroughly tested against the C implementation

@vks
Copy link

vks commented Jul 17, 2018

I would expect it is to avoid cancellation errors.

@korken89
Copy link
Author

From the internal kernel of the sinf/cosf, it is simple 5th order polynomial approximations over the minimal range needed to recreate any sin/cos value.
The only one that is not clear for me yet is the pi/2 modulo operation, which goes through a lot of work. I'm looking into it as well.

@porglezomp
Copy link
Contributor

@korken89 could your run your test again measuring the max error in ULPs or percent error instead of in absolute? Absolute error seems like it could be misleading here when the results are potentially small.

@korken89
Copy link
Author

Please see my previous response, take the relative error *100 and you have percent. And is on the order of 0.000017% worst case. :)

@porglezomp
Copy link
Contributor

Ah, my bad, I misread what you were doing, thank you.

@porglezomp
Copy link
Contributor

I slightly modified your test bench to get the error in ULPs (maybe someone check my work…): https://gist.github.com/porglezomp/3ec824f07d15ba5383992f573e6be120

The max error in sinf is 1 ULP, and the max error in cosf is 2 ULPs.

@korken89
Copy link
Author

Thank you for adding! ULPs really shows how close the kernels are writhin it's designed bounds.
I just have to give the pi/2 modulo function a look, it's the only thing in the way of a full f32 implementation.

@korken89
Copy link
Author

I found this look into sine approximation (in Rust) utilizing Chebychev polynomials, but over the [-pi, pi] domain.
Quite interesting and a good read: http://mooooo.ooo/chebyshev-sine-approximation/

However, this is not the current issue, but still a good read.

@japaric
Copy link
Member

japaric commented Jul 23, 2018

I looked at /usr/arm-none-eabi/lib/thumb/v7e-m+fp/hard/libm.a and, for example, asinf doesn't seem to be using any f64 code. This file is part of the Arch Linux arm-none-eabi-newlib package.

We could use the newlib implementations of some f32 functions but we would have to tweak the test system to generate the expected outputs using newlib.

@korken89
Copy link
Author

@japaric I think this would be a really good idea!
Fixing the test system should be no problem, I have to give the newlib code a look though.

@korken89
Copy link
Author

Hi, I have been looking into the argument reduction and why they need the immense amount of bits of pi. The math behind the implementation they have done, and the need for all the bits is described here: https://www.csee.umbc.edu/~phatak/645/supl/Ng-ArgReduction.pdf

More or less, to make an argument reduction correctly of very large inputs (let say sin(10^10)) is very sensitive to rounding error, and to get small errors they need enough bits in pi/xxx.

However, the implementation used here is for f64, there should be a second implementation for f32s. Perhaps the original implementers oversaw this issue. What would you say @japaric , just use newlib or try and rewrite the argument reduction into an f32? I would probably go with newlib.

@japaric
Copy link
Member

japaric commented Jul 24, 2018

@korken89 I think it would be easier to test the accuracy of a 1:1 port of newlib.

@burrbull
Copy link
Contributor

I already ported sinf and cosf from newlib with dependencies and can add them in 1 day.
Now I have some problems with inet.

@japaric
Copy link
Member

japaric commented Jul 25, 2018

@burrbull nice! Do you also have looked into adding testing support for newlib? If not, I can take a stab at it.

@japaric
Copy link
Member

japaric commented Jul 27, 2018

The test generator now supports testing against newlib. I have created several issues to track porting some of the newlib implementations. They are all under the newlib milestone. newlib's implementation of fmaf also uses f64 operations so we'll need to port some other implementation or write one ourselves -- I have also created an issue for that task in #140.

I'm going to close this issue in favor of the individual issues.

@japaric japaric closed this as completed Jul 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants