I was doing something totally simple, just calculating rotation angles by taking sine and cosine using the FPU, when I looked at the results I was getting in OllyDbg. At first I thought I was losing my mind: for the cosine of 90° (π/2 radians) I wasn't getting zero at all. WTF? When I took a second look I realized it was nonzero, but very close to zero (large negative exponent). Odd, since the cosine of 0° is exactly 1 and the sine exactly 0.
So I wrote a little testbed, taking angles between 0° and 360° (0 to 2π) and computing sine and cosine. Here's what I got:
Angle: 0 degrees Sin: 0 Cos: 1.00000000E+0000
Angle: 45 degrees Sin: 7.07106781E-0001 Cos: 7.07106766E-0001
Angle: 90 degrees Sin: 1.00000000E+0000 Cos: -4.37113900E-0008
Angle: 135 degrees Sin: 7.07106781E-0001 Cos: -7.07106785E-0001
Angle: 180 degrees Sin: 1.22460635E-0016 Cos: -1.00000000E+0000
Angle: 225 degrees Sin: -7.07106781E-0001 Cos: -7.07106830E-0001
Angle: 270 degrees Sin: -1.00000000E+0000 Cos: 1.19248805E-0008
Angle: 315 degrees Sin: -7.07106781E-0001 Cos: 7.07106679E-0001
Angle: 360 degrees Sin: -2.44921271E-0016 Cos: 1.00000000E+0000
Notice the values that should be zero: the cosines at 90° and 270° and the sines at 180° and 360°. (I'm using single-precision floats here, REAL4s.) Is this the best we can expect from the FPU with that data type?
I guess if you were checking for zero here, you'd have to compare the absolute value of the result against some small tolerance, rather than testing for exactly zero.
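A minimal sketch of that kind of zero check, written in Python rather than assembly just to keep it short. The helper name `is_zero` and the tolerance of 1e-6 are assumptions picked for illustration; anything comfortably above the roughly 4.4E-8 residual in the output above should do for REAL4 inputs.

```python
def is_zero(value, tol=1e-6):
    """Hypothetical helper: treat value as zero when its magnitude is below tol."""
    return abs(value) < tol

# Residuals copied from the test output above
print(is_zero(-4.37113900e-08))   # cosine of 90 degrees
print(is_zero(1.22460635e-16))    # sine of 180 degrees
print(is_zero(7.07106781e-01))    # sine of 45 degrees, genuinely nonzero
```

In x87 code the same idea would be an fabs followed by a compare against a small constant; the right tolerance depends on what the result is used for.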
I should run this test with larger floats, see what I get then.
David,
You have 64-bit precision (REAL8) with SSE2 scalar-double (SD) maths, or 80-bit precision (REAL10) with the old x87 FP instructions. If you need extra precision, these would be worth a try.
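For a rough feel of what the wider formats buy you, here is a small Python sketch that computes the gap between 1.0 and the next representable value for each format. It assumes the usual significand widths of 24, 53 and 64 bits for the 32-, 64- and 80-bit formats (REAL4, REAL8, REAL10 in MASM terms); the table and function names are made up for the example.

```python
# Significand widths in bits, counting the leading 1 bit
SIGNIFICAND_BITS = {"REAL4": 24, "REAL8": 53, "REAL10": 64}

def spacing_near_one(bits):
    """Gap between 1.0 and the next larger representable value."""
    return 2.0 ** (1 - bits)

for name, bits in SIGNIFICAND_BITS.items():
    gap = spacing_near_one(bits)
    print(f"{name:7s} {bits:2d} bits  spacing near 1.0: {gap:.3e}")
```

The gaps come out at roughly 1.2E-7, 2.2E-16 and 1.1E-19, which gives an idea of how small a rounding error each format can carry for values near 1.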
Sure, I'm aware that there are much more precise formats than dinky little REAL4. For my purposes, that format was more than precise enough. My purpose was to rotate some text 90°. I was just a little surprised at how far off some results were. (Or were they?)
Wonder if Raymond has anything to say about this?
Somebody know where I left the crystal ball?
No surprise: REAL4 has lousy precision. But for calculating screen positions or angles it's more than sufficient, which is why GdiPlus uses it.
include \masm32\MasmBasic\MasmBasic.inc
Init
Print cfm$("\n\nFP4:")
fld FP4(0.78539816339744830961566084581988) ; 45 degrees
fsincos
deb 4, "Cosine, sine", ST(0), ST(1), ST(2)
fstp st
fstp st
fld FP4(1.5707963267948966192313216916398) ; 90 degrees
fsincos
deb 4, "Cosine, sine", ST(0), ST(1), ST(2)
fstp st
fstp st
Print cfm$("\n\nFP10:")
fld FP10(0.78539816339744830961566084581988) ; 45 degrees
fsincos
deb 4, "Cosine, sine", ST(0), ST(1), ST(2)
fstp st
fstp st
fld FP10(1.5707963267948966192313216916398) ; 90 degrees
fsincos
deb 4, "Cosine, sine", ST(0), ST(1), ST(2)
fstp st
fstp st
EndOfCode
FP4:
Cosine, sine
ST(0) 0.7071067657322372128
ST(1) 0.7071067966408574982
ST(2) empty 0.0
Cosine, sine
ST(0) -4.371139000186443665e-08
ST(1) 0.9999999999999990447
ST(2) empty 0.0
FP10:
Cosine, sine
ST(0) 0.7071067811865475244
ST(1) 0.7071067811865475245
ST(2) empty 0.0
Cosine, sine
ST(0) -2.710505431213761085e-20
ST(1) 1.000000000000000000
ST(2) empty 0.0
Quote from: NoCforMe on October 01, 2022, 12:10:52 PM
Wonder if Raymond has anything to say about this?
For those who may not be aware of this detail, the input angle for the FPU trig functions must be expressed in radians and be within the -2^63 to +2^63 range.
I'm quite sure YOU were aware of this requirement unless you used someone else's macro based on using actual degrees.
Any input angle expressed in radians (other than 0) carries some rounding error, because fractions and multiples of π cannot be represented exactly, and that error is reflected in the result. The actual error then depends on the float format you use.
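To illustrate with the 90° case, here is a sketch in Python standing in for the FPU. `to_real4` is a hypothetical helper that rounds a value to single precision. Python's `math.cos` works in double precision (REAL8), so the trailing digits will not match fsincos exactly. Near π/2 the cosine is approximately (π/2 - x), so the "wrong" result is essentially the rounding error of the argument itself.

```python
import math
import struct

def to_real4(x):
    """Hypothetical helper: round a Python float (REAL8) to the nearest REAL4."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

half_pi = math.pi / 2            # REAL8 approximation of pi/2
half_pi_r4 = to_real4(half_pi)   # what a REAL4 constant actually holds

# Near pi/2, cos(x) is approximately (pi/2 - x), so the cosine of the
# rounded argument is very nearly the rounding error of that argument.
arg_error = half_pi - half_pi_r4
print(f"REAL4 argument error: {arg_error:.9e}")
print(f"cos(REAL4 pi/2):      {math.cos(half_pi_r4):.9e}")
print(f"cos(REAL8 pi/2):      {math.cos(half_pi):.9e}")
```

The first two lines should both come out near -4.371139E-8, the same figure as the REAL4 cosine reported for 90° earlier in the thread, while the REAL8 residual is many orders of magnitude smaller.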
Thanks, Raymond, for offering your informed opinion here.
Yes, I'm aware of all that. No macros, just plain vanilla X86/X87 code. My values are smack in the middle of that range, from 0 to 2π, which is why I was a little surprised. But not any longer; like any machine, the FPU has its limits, certainly in this least-precision mode (REAL4).