Simple floating point macros.

Started by hutch--, August 13, 2018, 04:36:54 PM

hutch--

This is the set I will probably go with. They are all testing up OK and are faster than I remember FP code being long ago. I am indebted to Rui and HSE for their assistance, and to Ray for his reference work, which has been very useful. I have been eyeing off some SSE for the simple add, sub, mul and div operations, and it generally looks good.

    fpinit MACRO                ;; initialise the x87 co-processor
      fninit
      fldz
    ENDM

    fpdiv MACRO arg1,arg2       ;; divide arg1 by arg2
      fld arg1
      fld arg2
      fdivp
    ENDM

    fpmul MACRO arg1,arg2       ;; multiply arg1 and arg2 together
      fld arg1
      fld arg2
      fmulp
    ENDM

    fpadd MACRO arg             ;; add a number
      fld arg
      faddp
    ENDM

    fpsub MACRO arg             ;; subtract a number
      fld arg
      fsubp
    ENDM

    fpsqrt MACRO number         ;; square root of number
      fld number
      fsqrt
    ENDM

    fpsqrd MACRO number         ;; number squared
      fld number
      fld number
      fmulp
    ENDM

    fpercent MACRO num,pcnt     ;; Get Percentage of number
      fld num
      fld FLT8(100.0)
      fdivp
      fld pcnt
      fmulp
    ENDM

    ; ----------------------------
    ; assign result to FP variable
    ; ****************************
    ; fstp variable_name
    ; ****************************

RuiLoureiro

Hutch, about fpinit, or the problems with fpinit, take a look at this (reply #1):

                        http://masm32.com/board/index.php?topic=7352.0

Remember that we may start with fpmul or fpdiv and not with fpadd or fpsub.
If we start with fpinit and then use fpmul we have problems... Repeat some block of code
10 times. Is the result always correct? Yes, it is; the FPU simply has st(0)=0.0 left in it.

About the fpsqrd macro: we don't need to load the number twice, only once:

fld   number    ; load number
fld   st(0)        ; get/make a copy of number
fmulp             ; st(0)= number ^2

There is a second form of fadd/fsub/fmul/fdiv, with Dst,Src operands, as noted by Raymond.
So it should be:
fld  number
fmul  st,st      ; st means st(0)
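
The three ways of squaring discussed above differ in how many memory loads and FPU registers they use. The following is only a toy Python model of the register stack, not FPU code; the function names and the `(stack, memory_loads, peak_depth)` return shape are made up for this illustration, and the model says nothing about actual timing.

```python
# Toy model of the x87 register stack; st[0] stands for st(0).
# Each function returns (final stack, loads from memory, peak stack depth).

def square_two_loads(number):
    st = [number]                 # fld number      (memory load 1)
    st.insert(0, number)          # fld number      (memory load 2)
    peak = len(st)
    top = st.pop(0)               # fmulp: st(1) = st(1) * st(0), then pop
    st[0] = st[0] * top
    return st, 2, peak

def square_copy_top(number):
    st = [number]                 # fld number      (memory load 1)
    st.insert(0, st[0])           # fld st(0)       (register copy, no memory access)
    peak = len(st)
    top = st.pop(0)               # fmulp
    st[0] = st[0] * top
    return st, 1, peak

def square_in_place(number):
    st = [number]                 # fld number      (memory load 1)
    peak = len(st)
    st[0] = st[0] * st[0]         # fmul st,st      (no second register used)
    return st, 1, peak

for f in (square_two_loads, square_copy_top, square_in_place):
    print(f.__name__, f(3.0))
```

All three leave the same value in st(0); only the `fmul st,st` form gets there with one memory load and one register.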

hutch--

Rui,

I gather from the example that fld st(0) is faster than fld memory?

I will look at the other later; I am a bit too tired to write another example.

RuiLoureiro

Quote from: hutch-- on August 15, 2018, 03:13:30 AM
Rui,

I gather from the example that fld st(0) is faster than fld memory?

I will look at the other later; I am a bit too tired to write another example.

First, it was very nice to work with you :t
It seems to be faster because the number is already inside the FPU and is not loaded again.
But HSE may test it; he did so in another topic. When he reads this, I guess he will help us
find out which way is best.
See you

hutch--

Good catch Rui, here is the changed macro; it will take 1 or 2 arguments.

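    fpadd MACRO arg1, arg2      ;; add 1 or 2 numbers
      fld arg1
      IFNB <arg2>               ;; only when a second argument is supplied
        fld arg2
      ENDIF
      faddp                     ;; 2 args: arg1 + arg2, 1 arg: old st(0) + arg1
    ENDM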

This works fine.

    fpadd val1,val2
    fpadd val2
    fpsub val1
    fstp fpval
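
To follow what the four lines above do to the register stack, here is a toy Python trace. It is a sketch, not FPU code: the sample values `val1 = 3.0` and `val2 = 4.0` and the helper `run` are assumptions made for the illustration. It runs the sequence once with the 0.0 that fpinit leaves in st(0), and once starting from an empty stack.

```python
# Toy model of the x87 register stack; st[0] stands for st(0).

def fpadd(st, arg1, arg2=None):
    st.insert(0, arg1)            # fld arg1
    if arg2 is not None:          # IFNB <arg2>
        st.insert(0, arg2)        #   fld arg2
    top = st.pop(0)               # faddp: st(1) = st(1) + st(0), then pop
    st[0] = st[0] + top

def fpsub(st, arg):
    st.insert(0, arg)             # fld arg
    top = st.pop(0)               # fsubp: st(1) = st(1) - st(0), then pop
    st[0] = st[0] - top

def run(initial_stack):
    st = list(initial_stack)
    val1, val2 = 3.0, 4.0         # assumed sample values
    fpadd(st, val1, val2)         # pushes val1 + val2
    fpadd(st, val2)               # adds val2 to st(0)
    fpsub(st, val1)               # subtracts val1 from st(0)
    fpval = st.pop(0)             # fstp fpval
    return fpval, st              # stored result, what is left in the FPU

print(run([0.0]))                 # stack as left by fpinit (fninit + fldz)
print(run([]))                    # stack as left by a bare fninit
```

In this model the stored value is the same in both runs because the sequence starts with the two-argument form, which pushes its own operands. The one-argument form only works when something is already in st(0).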

HSE

 :biggrin: What a nightmare!

    fpinit MACRO                ;; initialise the x87 co-processor
      fninit
      fldz
      HutchsoniansFP = 1
    ENDM

    fpadd MACRO arg1, arg2
      fld arg1
      IFNB <arg2>
        fld arg2
        faddp                   ; this is arg1+arg2
      ENDIF
      IF HutchsoniansFP         ; if there is zero in st(0) [or something else]
        faddp                   ; this (arg1[+arg2]) + original st(0) [now in st(1)]
      ENDIF
    ENDM

    fpclose MACRO
      HutchsoniansFP = 0
    ENDM
Equations in Assembly: SmplMath

hutch--

 :biggrin:

Why do I get the impression that this last post was not all that serious ?  :P

raymond

Quote
fld   number    ; load number
fld   st(0)        ; get/make a copy of number
fmulp             ; st(0)= number ^2

Even simpler, there is no need to make a second copy in another FPU register:

fld   number    ; load number
fmul st,st        ; st(0)= number ^2


I still don't like the idea of leaving data in FPU registers with macros. In my opinion, the risk of generating garbage is too high for unaware users. Results should be stored immediately in a memory variable defined by the user in an additional arg.
Whenever you assume something, you risk being wrong half the time.
https://masm32.com/masmcode/rayfil/index.html
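
The risk Raymond describes can be sketched with a toy Python model. It assumes the default masked-exception state after fninit, where a load onto a full 8-register stack puts the "indefinite" NaN in st(0) instead of the value. The helper names and the operands 2.0 and 3.0 are made up for the illustration. The loop calls a two-argument fpmul repeatedly without ever storing the result.

```python
import math

# Toy model of the x87 register stack; st[0] stands for st(0), 8 slots max.

def fld(st, value):
    if len(st) == 8:                      # all 8 registers in use: stack overflow
        st.pop()                          # the old st(7) is overwritten...
        st.insert(0, math.nan)            # ...by the "indefinite" NaN
    else:
        st.insert(0, value)

def fpmul(st, arg1, arg2):
    fld(st, arg1)                         # fld arg1
    fld(st, arg2)                         # fld arg2
    top = st.pop(0)                       # fmulp: st(1) = st(1) * st(0), then pop
    st[0] = st[0] * top

def first_garbage_call(limit=20):
    st = []                               # fninit: all registers empty
    for call in range(1, limit + 1):
        fpmul(st, 2.0, 3.0)               # result left in the FPU, never stored
        if math.isnan(st[0]):
            return call                   # first call whose result is a NaN
    return None

print(first_garbage_call())
```

Each unstored result costs one register, so in this model the macro silently starts producing NaN once the earlier leftovers fill the stack.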

HSE

Quote from: hutch-- on August 15, 2018, 04:18:36 AM
Why do I get the impression that this last post was not all that serious ? 

You can change the names  :biggrin:

But there is a problem. I will change it.

hutch--

Ray,

The target market for 64 bit MASM is different to that of the 32 bit version; it is not recommended for beginners at all, but for folks who already know how to write 32 bit MASM code. The difference with macros is the reference material, and it's easy enough to specify an "fstp variable" when the data needs to be placed in a variable, but the more efficient form without redundant loads and stores is the direction that many who use legacy code like this want.

I am just about clapped out and ready to sleep but I will have a look at your suggestion when I get up later today.

RuiLoureiro

Quote from: hutch-- on August 15, 2018, 04:18:36 AM
:biggrin:

Why do I get the impression that this last post was not all that serious ?  :P

Hutch,
I guess that HSE is kidding about your idea of fpinit. It is not usual to start the FPU with finit and load 0 into st(0). What happens if we use it, then fpmul, and then fstp var? We exit and the FPU is not cleaned: 0.0 is still in st(0). It is only this; there is no other problem, and all the macros work correctly, it seems.
Note: when I have my new i7 I will test all possible cases.

HSE

Hi Rui!

Quote from: RuiLoureiro on August 15, 2018, 06:47:23 AM
         I guess that HSE is kidding with your idea of fpinit.

Just trying to guess what Hutch is doing.

fldz in fpinit is a problem if you don't need it.

There are two types of macros:

1) Those that don't need a non-empty st(0) and leave an additional non-empty st()

    fpmul MACRO arg1,arg2       ;; multiply arg1 and arg2 together
      fld arg1
      fld arg2
      fmulp
    ENDM

2) Those that need a non-empty st(0) and don't modify the number of non-empty st()

    fpadd MACRO arg             ;; add a number
      fld arg
      faddp
    ENDM
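
The two categories can be checked mechanically by counting what each instruction needs and leaves on the register stack. The sketch below is a toy Python model; the table `EFFECT`, the function `stack_signature`, and the encoding of the macro bodies as instruction lists are all assumptions made for this illustration.

```python
# (registers needed, registers left) per instruction, for the forms used
# in the macros of this thread (memory-operand fld, no-operand fxxxp).
EFFECT = {
    "fld":   (0, 1),
    "fldz":  (0, 1),
    "fsqrt": (1, 1),
    "faddp": (2, 1), "fsubp": (2, 1), "fmulp": (2, 1), "fdivp": (2, 1),
    "fstp":  (1, 0),
}

def stack_signature(body):
    """Return (values required on entry, net change in stack depth)."""
    depth = 0                     # depth relative to macro entry
    required = 0
    for op in body:
        needs, gives = EFFECT[op]
        required = max(required, needs - depth)
        depth += gives - needs
    return required, depth

MACROS = {
    "fpmul":    ["fld", "fld", "fmulp"],
    "fpdiv":    ["fld", "fld", "fdivp"],
    "fpsqrt":   ["fld", "fsqrt"],
    "fpsqrd":   ["fld", "fld", "fmulp"],
    "fpercent": ["fld", "fld", "fdivp", "fld", "fmulp"],
    "fpadd":    ["fld", "faddp"],     # one-argument form
    "fpsub":    ["fld", "fsubp"],
}

for name, body in MACROS.items():
    print(name, stack_signature(body))
```

A signature of (0, 1) is type 1: nothing needed on entry, one extra value left behind. A signature of (1, 0) is type 2: st(0) must already hold a value and the depth is unchanged.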

Making two sets of macros is a possible solution, mmm

Meanwhile I'm trying to solve some problems calculating the adaptation value of vectors for a Genetic Algorithm; it is really slow with so many debugging messages.    :(

RuiLoureiro

Hi HSE !
No problem at all. We may joke about these things. It is fun!
Good luck with your work  :t

hutch--

The reason I have used "fldz" is testing. When I have a simple test piece that I know works correctly, I place another calculation before it and test whether the second calculation still gets the same result. You can simply turn this on and off with "fldz" between them.

calculation
  fldz          ; affect the following calculation here by commenting in or out
calculation

If the first calculation is sound, the second does not change. If it does change when fldz is commented in or out, then the first calculation has a stack error.

RE: The fpinit macro, I am yet to see what the problem is with using fldz. If I comment out fldz I get incorrect results.

hutch--

Now as far as the use of fldz goes, I can't find a test that shows any problem unless it's the 2 bytes that it takes up. Here is the code from a 64 bit MASM test piece that tests with the fldz in or out, and I can find no difference apart from the 2 bytes, which I don't lose any sleep over.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    LOCAL fvar1 :REAL8
    LOCAL fvar2 :REAL8
    LOCAL pbuf  :QWORD
    LOCAL buff[32]:BYTE
    LOCAL rslt  :REAL8

    mov pbuf, ptr$(buff)

    fninit
    fldz ; 2 bytes

    mrm fvar1, FLT8(100.0)
    mrm fvar2, FLT8(100.0)

    fld fvar2
    fmul fvar1
    fstp rslt

    invoke fptoa,rslt,pbuf      ; convert rslt to string
    conout pbuf,lf              ; display at console

    waitkey
    .exit

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤