
    fild valA
    fild valB
    fsub
    fstp result

For simple math, sqrt and rsqrt, use SSE instead; it is easier to make the code fast in a loop:

    movss xmm0, val1
    subss xmm0, val2
    movss result, xmm0

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)
87 cycles for 100 * movss
59 cycles for 100 * fpu
80 cycles for 100 * movss
59 cycles for 100 * fpu
80 cycles for 100 * movss
59 cycles for 100 * fpu
84 cycles for 100 * movss
59 cycles for 100 * fpu
80 cycles for 100 * movss
60 cycles for 100 * fpu

24 bytes for movss
18 bytes for fpu

    movss xmm0, val1
    subss xmm0, val2
    movss result, xmm0

    fld val1
    fsub val2
    fstp result
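For anyone who wants to try the same load / subtract / store pattern from C, here is a minimal sketch. It assumes an x86 compiler that ships `<xmmintrin.h>` and falls back to plain C elsewhere. The names `sub_scalar` and `sub_arrays` are made up for illustration and are not from the posts above; the intrinsics only roughly mirror the `movss` / `subss` / `movss` sequence, and the compiler is free to schedule them differently.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Hypothetical helper: result = val1 - val2 in single precision. */
static float sub_scalar(float val1, float val2)
{
#if HAVE_SSE
    __m128 a = _mm_load_ss(&val1);            /* like: movss xmm0, val1   */
    __m128 b = _mm_load_ss(&val2);            /* load the second operand  */
    float result;
    _mm_store_ss(&result, _mm_sub_ss(a, b));  /* like: subss, then movss result, xmm0 */
    return result;
#else
    return val1 - val2;                       /* portable fallback, no SSE */
#endif
}

/* Hypothetical helper: the same operation applied in a loop. */
static void sub_arrays(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = sub_scalar(a[i], b[i]);
    }
}
```

Calling `sub_scalar(5.5f, 2.25f)` gives 3.25f. Whether this beats the FPU version depends on the CPU and the surrounding code, as the timings above suggest.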

    fld VALUE       ;// contains 0.523599
    fptan
    fstp RESULT     ;// should be 0.57735

Am I using fptan correctly?

The FPU:
ST(0) 1.000000000000000000
ST(1) 0.5773506065083982818

Having another small issue. I am trying to calculate the tangent of a value using the fp* commands.

Quote
    fld VALUE       ;// contains 0.523599
    fptan
    fstp st         ; remove 1.0 from st(0) <<-- See Simply FPU by Raymond
    fstp RESULT     ;// should be 0.57735

According to ConverterDD (http://masm32.com/board/index.php?topic=1819.0) the value in RESULT is NaN. ConverterDD has been displaying all results correctly so far.

Am I using fptan correctly? <<<--- NO, it seems you want tan(VALUE)=0.5773505683919327


What does the help file say?

    fld number      ;// contains 0.523599
    fptan
    fstp result     ;// contains 1 why?
    fstp result     ;// should be 0.57735

Why is it that two fstp calls are required?

According to ConverterDD (http://masm32.com/board/index.php?topic=1819.0) the value in RESULT is NaN.

This works though.

Quote
    fld number      ;// contains 0.523599
    fptan
    fstp resultX    ;// contains 1 why? <<<- because fptan gives 2 results, not 1
    fstp result     ;// should be 0.57735

Why is it that two fstp calls are required?


fstp st should be used; it is faster than storing to a dummy variable. It pops the current st(0) when we don't need it.