Floating point arithmetic question

Started by Lonewolff, April 16, 2018, 08:20:24 PM


Lonewolff

Hi guys,

I am trying to work out how to do a simple floating point subtraction but the result is incorrect.


fild valA
fild valB
fsub
fstp result


valA is a real4 of 1000.0 and valB is a real4 of 1.0

I was expecting a result of 999.0 but I get -8.34929E+07.

Any advice would be much appreciated  8)

Lonewolff

Ah! Worked it out. I should have been using fld not fild  :eusa_clap:
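
For anyone wondering where the odd number comes from: fild treats its memory operand as an integer, so the bit patterns of the two real4 values are loaded as 32-bit integers. The Python sketch below only illustrates that reinterpretation. Python stands in for the FPU here, and the helper name real4_bits_as_int32 is made up for this note. It reproduces the magnitude of the reported result; the sign depends on the operand order of the subtraction, which the sketch does not try to model.

```python
import struct

def real4_bits_as_int32(x: float) -> int:
    """Return the 32 bits of a single-precision float, read as a signed integer."""
    return struct.unpack("<i", struct.pack("<f", x))[0]

# What fild sees when it is pointed at a real4 variable
val_a = real4_bits_as_int32(1000.0)   # bit pattern 447A0000h
val_b = real4_bits_as_int32(1.0)      # bit pattern 3F800000h

fild_difference = val_a - val_b       # integer subtraction of the bit patterns
fld_difference = 1000.0 - 1.0         # what fld + fsub computes instead

print(f"{fild_difference:.5E}")       # same magnitude as the value in the first post
print(fld_difference)
```

With fld the operands are loaded as floating point values, so the subtraction gives the expected 999.0.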

daydreamer

For simple math, sqrt and rsqrt, use SSE instead; it's easier to make code faster in a loop:
movss xmm0,val1
subss xmm0,val2
movss result,xmm0
And when you need a speedup, use movaps, subps etc. instead.

jj2007

Quote from: daydreamer on April 16, 2018, 08:58:39 PM
For simple math, sqrt and rsqrt, use SSE instead; it's easier to make code faster in a loop:
movss xmm0,val1
subss xmm0,val2
movss result,xmm0

And if you are not absolutely sure, time it:

Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz (SSE4)

87      cycles for 100 * movss
59      cycles for 100 * fpu

80      cycles for 100 * movss
59      cycles for 100 * fpu

80      cycles for 100 * movss
59      cycles for 100 * fpu

84      cycles for 100 * movss
59      cycles for 100 * fpu

80      cycles for 100 * movss
60      cycles for 100 * fpu

24      bytes for movss
18      bytes for fpu


movss: movss xmm0, val1
subss xmm0, val2
movss result, xmm0


fpu: fld val1
fsub val2
fstp result

Lonewolff

Yep. Been timing routines heaps today.

Trying to make them as tight as possible as they are performance critical for my project.

Lots of testing, timing, and learning going on  :biggrin:

Lonewolff

Having another small issue.

I am trying to calculate the tangent of a value using the fp* commands.


fld VALUE ;// contains 0.523599
fptan
fstp RESULT ;// should be 0.57735


According to ConverterDD (http://masm32.com/board/index.php?topic=1819.0) the value in RESULT is NaN.

ConverterDD has been displaying all results correctly so far.

Am I using fptan correctly?

jj2007

Quote from: Lonewolff on April 16, 2018, 10:22:06 PM
Am I using fptan correctly?
What does the help file say?

include \masm32\MasmBasic\MasmBasic.inc         ; download
  Init
  fld FP4(0.523599)
  fptan
  deb 4, "The FPU:", ST(0), ST(1)
EndOfCode


The FPU:
ST(0)           1.000000000000000000
ST(1)           0.5773506065083982818

RuiLoureiro

Quote from: Lonewolff on April 16, 2018, 10:22:06 PM
Having another small issue.

I am trying to calculate the tangent of a value using the fp* commands.

Quote
   fld VALUE         ;// contains 0.523599
   fptan
   fstp  st            ; remove 1.0 from st(0)<<-- See Simply FPU by Raymond
   fstp RESULT      ;// should be 0.57735

According to ConverterDD (http://masm32.com/board/index.php?topic=1819.0) the value in RESULT is NaN.

ConverterDD has been displaying all results correctly so far.

Am I using fptan correctly?    <<<--- NO, it seems you want tan(VALUE)=0.5773505683919327
Hi,
    Could you post this simple example (the asm file)?

Lonewolff

Quote from: jj2007 on April 16, 2018, 10:43:20 PM
What does the help file say?

Help file says I am. The result says otherwise.


@RuiLoureiro - Will do  :t

Lonewolff

This works though.


fld number ;// contains 0.523599
fptan
fstp result ;// contains 1 why?
fstp result ;// should be 0.57735


Why is it that two fstp calls are required?

raymond

Quote
Why is it that two fstp calls are required?

If you read the recommended FPU tutorial (more specifically, the part on the fptan instruction at http://www.ray.masmcode.com/tutorial/fpuchap10.htm#fptan), you will find the answers to your questions, including the last one.

It may also give you a hint to explain one of your previous comments:
Quote
According to ConverterDD (http://masm32.com/board/index.php?topic=1819.0) the value in RESULT is NaN.
That may possibly be due to valid data already being in the ST(7) and/or the ST(0) register when attempting to compute the tangent. Otherwise, a value of "1" should have been returned by ConverterDD.

Lonewolff

Thanks for the link to the tutorial. Clears it up well  :t

RuiLoureiro

Quote from: Lonewolff on April 17, 2018, 10:12:49 AM
This works though.

Quote
   fld number         ;// contains 0.523599
   fptan
   fstp resultX         ;// contains 1 why?  <<<- because fptan gives 2 results, not 1
   fstp result         ;// should be 0.57735

Why is it that two fstp calls are required?
What do you get for the new resultX variable? Print it and see. It should be 1.0, as Raymond said.
              fstp  st should be used in this case, it is faster. It removes the current st(0) when we don't need it.
:t
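
The "two results" behaviour can be pictured with a small model. This is only a sketch in Python, not FPU code: the list stands in for the register stack with its last element playing ST(0), and the function name fptan is simply borrowed from the instruction. It assumes an in-range operand, for which fptan replaces ST(0) with the tangent and then pushes 1.0 on top.

```python
import math

def fptan(stack):
    """Model fptan on a list whose last element plays the role of ST(0)."""
    angle = stack.pop()
    stack.append(math.tan(angle))  # the tangent ends up in ST(1)
    stack.append(1.0)              # 1.0 is pushed and becomes the new ST(0)

stack = [0.523599]         # fld number
fptan(stack)
first_pop = stack.pop()    # first fstp: the constant 1.0
second_pop = stack.pop()   # second fstp: the tangent, about 0.57735

print(first_pop, second_pop)
```

In the model, leaving out the first pop would hand back 1.0 instead of the tangent and leave one value behind on the stack.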

jj2007

Quote from: RuiLoureiro on April 17, 2018, 10:41:22 PM
fstp  st should be used, it is faster. It removes the current st(0) when we don't need it.

Btw there is also fincstp, which at first sight has the same effect. But try a simple fldpi afterwards, and you'll see the difference. Olly has a section with the FPU regs; you must scroll down a little in the upper right pane to see it.
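
The difference between the two can be sketched with a toy model of the register stack. The Python below is an illustration under simplifying assumptions, not a description of any real tool: the class ToyX87Stack and its method names are invented for this note, the model tracks only a TOP pointer and an empty/valid tag per register, and it assumes the invalid-operation exception is masked, so a push onto a register that is still tagged valid yields a NaN.

```python
import math

class ToyX87Stack:
    """Toy model: 8 registers, a TOP pointer, and an empty/valid tag per register."""

    def __init__(self):
        self.regs = [0.0] * 8
        self.empty = [True] * 8
        self.top = 0

    def push(self, value):
        """Model fld/fldpi: decrement TOP, then write the value."""
        self.top = (self.top - 1) % 8
        if not self.empty[self.top]:
            value = math.nan          # target still tagged valid: stack overflow
        self.regs[self.top] = value
        self.empty[self.top] = False

    def pop(self):
        """Model fstp: read ST(0), tag it empty, then increment TOP."""
        value = self.regs[self.top]
        self.empty[self.top] = True
        self.top = (self.top + 1) % 8
        return value

    def fincstp(self):
        """Model fincstp: increment TOP only; the tag is left untouched."""
        self.top = (self.top + 1) % 8

    def st(self, i):
        """Return the value currently addressed as ST(i)."""
        return self.regs[(self.top + i) % 8]

# Variant 1: discard the value with fstp st, then fldpi
with_fstp = ToyX87Stack()
with_fstp.push(1.0)
with_fstp.pop()
with_fstp.push(math.pi)

# Variant 2: skip the value with fincstp, then fldpi
with_fincstp = ToyX87Stack()
with_fincstp.push(1.0)
with_fincstp.fincstp()
with_fincstp.push(math.pi)

print(with_fstp.st(0), with_fincstp.st(0))
```

In the model, fstp st frees the register, so the later push succeeds, while fincstp leaves it tagged as in use, so the push lands on an occupied register. The same mechanism is one possible reading of Raymond's remark about valid data in ST(7).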