Simple floating point macros.

Started by hutch--, August 13, 2018, 04:36:54 PM


hutch--

I need to add at least some simple support for floating point in 64 bit MASM and since there are at least a few people who actually understand how the old co-processor works, I wondered if there was a better way to do these simple calculations. The criterion is to perform each calculation and leave the result popped into st(0) for further calculations. The input data has to be valid and as FP does not support immediate values, the old macros from the 32 bit version of MASM handle that OK with only some minor alignment changes.

Win64 does not specify co-processor performance or FP register usage but I have tried to keep this available for folks who want to do calculations for maths rather than video tasks.

    fpadd MACRO arg1,arg2
      fld arg1
      fld arg2
      faddp
    ENDM

    fpsub MACRO arg1,arg2
      fld arg1
      fld arg2
      fsubp
    ENDM

    fpdiv MACRO arg1,arg2
      fld arg1
      fld arg2
      fdivp
    ENDM

    fpmul MACRO arg1,arg2
      fld arg1
      fld arg2
      fmulp
    ENDM

    fpsqrt MACRO number,target
      fld number
      fsqrt
      fstp target
    ENDM

daydreamer

I like to use the old FPU sometimes. Maybe you should follow the old FPU mnemonic conventions by having both FPMUL and FPMULP macros.
But if you already have content on the FPU stack, you might also need support macros that take only one argument to make it work.

I also want to suggest an FPRCP reciprocal macro. It speeds up both FPU math and SIMD math and helps your coding, so you don't need to spend much time with a calculator typing in long reciprocal numbers.
FSQRT is great both for students writing a Pythagoras calculation program and for adding the simplest round light function to a bitmap.
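
A minimal sketch of what such a reciprocal macro could look like, in the style of the macros above. The name fprcp is hypothetical, and the sketch assumes arg is a REAL4 or REAL8 memory variable:

```asm
    fprcp MACRO arg
      fld1                  ;; push 1.0 onto the FPU stack
      fdiv arg              ;; st(0) = 1.0 / arg
    ENDM
```

Like the other macros it leaves the result in st(0). The division itself is still a full fdiv; any saving would come from computing the reciprocal once and multiplying by it afterwards.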
my none asm creations
https://masm32.com/board/index.php?topic=6937.msg74303#msg74303
I am an Invoker
"An Invoker is a mage who specializes in the manipulation of raw and elemental energies."
Like SIMD coding

hutch--

What I have done with these is to pop the result back into st(0) so that it is a consistent interface where one function can follow after the other.

    fpadd st(0), addme
    fpadd st(0), addme
    fpadd st(0), addme

RuiLoureiro

Quote from: hutch-- on August 13, 2018, 04:36:54 PM

    fpadd MACRO arg1,arg2
      fld arg1
      fld arg2
      faddp
    ENDM

    fpadd  st(0), addme
    fpadd  st(0), addme
    fpadd  st(0), addme
Hutch,
          I don't know what you want to do, but
          the last code means this:

          fld   st(0)     ; get a copy to st(0)  <<<- why to load the first st(0) ?
          fld   addme  ; load addme
          faddp           ; do st(1)+st(0) = st(0)
          ;
          fld    st(0)    ; get a copy to st(0)    <<<- why to get a copy ?
          fld    addme ; load addme
          faddp           ; do st(1)+st(0) = st(0)
          ;
          fld    st(0)      ; get a copy to st(0)    <<<- why to get a copy again ?
          fld    addme  ; load addme
          faddp             ; do st(1)+st(0) = st(0)
          ;
          ; So inside the FPU we have st(0)=st(0)+addme+addme+addme, and because each
          ; fld st(0) leaves a copy behind, the three older values are still on the
          ; stack below it: the first st(0) is now st(3). Do we need to preserve it ?
          ;----------------------------------------------------------------------------------
          ;
          ; another way
          ;--------------
          fld   firstst0   ; first st(0)
          fld   addme   ; load addme
          faddp           ; do st(1)+st(0) = st(0)= firstst0+addme
          ;
          fld    addme ; load addme
          faddp           ; do st(1)+st(0) = st(0)=firstst0+addme+addme
          ;
          fld    addme ; load addme
          faddp           ; do st(1)+st(0) = st(0)=firstst0+addme+addme+addme
          ;
          ; So we have only st(0) inside FPU
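
To make the stack growth explicit, here is a hedged trace of one expansion of fpadd st(0), addme. It writes x for the value that was in st(0) before the call and a for addme; both are just labels for this trace:

```asm
          fld   st(0)     ; st(0)=x     st(1)=x
          fld   addme     ; st(0)=a     st(1)=x     st(2)=x
          faddp           ; st(0)=x+a   st(1)=x
```

Each expansion leaves the stack one entry deeper than it found it, which matters because the FPU has only eight registers.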

HSE

I don't know what you mean by old co-processors, but for the last 20 years you have been able to write:
    fpadd MACRO arg1,arg2
      fld arg1
      fadd arg2
    ENDM

    fpsub MACRO arg1,arg2
      fld arg1
      fsub arg2
    ENDM

    fpdiv MACRO arg1,arg2
      fld arg1
      fdiv arg2
    ENDM

    fpmul MACRO arg1,arg2
      fld arg1
      fmul arg2
    ENDM
Equations in Assembly: SmplMath

hutch--

 :biggrin:

He he, my first processor with a co-processor was an i486 that cost me a fortune in about 1990. The co-processor has been around that long, so 28 years says it's old, but for folks who want maths rather than just video processing, it is still a very useful capacity.

Rui,

You are right but in the first place I wanted each macro to be complete dumping the result into st(0). You are correct that in continuous code you would not keep loading st(0). What I am after is making the use simple.

HSE

Quote from: hutch-- on August 13, 2018, 11:48:05 PM
He he, my first processor with a co-processor was in a i486 that cost me a fortune in about 1990.
Yes, I used a software FPU emulator before the 486DX but, believe me, I don't remember if it was possible to do "fadd memory" at that time  :biggrin:.  (That was my GW-BASIC time; only some specific programs required an FPU.)

RuiLoureiro

Quote from: hutch-- on August 13, 2018, 11:48:05 PM
:biggrin:
...
Rui,

You are right but in the first place I wanted each macro to be complete dumping the result into st(0). You are correct that in continuous code you would not keep loading st(0). What I am after is making the use simple.
Hutch,
           That set of macros, I think they are useful, but now add another set that takes only 1 argument plus the operation instruction. That way it doesn't copy st(0). Something like this:

fpadd1   macro  arg1
             fld      arg1
             fadd   
endm

So we start with
                           finit
                           fpadd   addme0, addme
and then we use
                           fpadd1  addme
                           fpadd1  addme      ; st(0)=addme0+addme+addme+addme

and we do not have any st(1) inside the FPU.

When the macro name ends with 1, we know that it combines the value already in st(0) with the new argument.
                                    fpadd    A,B
                                    fpmul1   C          ; st(0)= C*(A+B)  and no st(1) inside

When we need to preserve the last st(0) we use the macro with 2 arguments.
It is only my opinion.
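
Following the same naming idea, here is a sketch of the other one-argument macros. The names fpsub1, fpmul1 and fpdiv1 are hypothetical. The sketch assumes arg1 is a REAL4 or REAL8 memory variable, so the memory-operand instruction forms can be used and there is nothing to load or pop:

```asm
fpsub1   macro  arg1
             fsub   arg1        ;; st(0) = st(0) - arg1
endm

fpmul1   macro  arg1
             fmul   arg1        ;; st(0) = st(0) * arg1
endm

fpdiv1   macro  arg1
             fdiv   arg1        ;; st(0) = st(0) / arg1
endm
```

With these, fpadd A,B followed by fpdiv1 C would leave (A+B)/C in st(0), again with no st(1) inside.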

hutch--

Rui,

Is this what you mean? It produces the correct result and only loads st(0) at the end, as you suggested.

    .data
      fpbuff REAL8 0.0
      addme REAL8 111.111

    .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    LOCAL pbuf  :QWORD
    LOCAL buff[32]:BYTE

    mov pbuf, ptr$(buff)

    fld fpbuff

    fld addme
    faddp
    fld addme
    faddp
    fld addme
    faddp
    fld st(0)
    fstp fpbuff
   
    invoke fptoa,fpbuff,pbuf
    conout pbuf,lf

    waitkey
    .exit

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

This is the disassembly.

.text:0000000140001016 DD0540100000               fld qword ptr [0x14000205c]
.text:000000014000101c DD0542100000               fld qword ptr [0x140002064]
.text:0000000140001022 DEC1                       faddp st(1)
.text:0000000140001024 DD053A100000               fld qword ptr [0x140002064]
.text:000000014000102a DEC1                       faddp st(1)
.text:000000014000102c DD0532100000               fld qword ptr [0x140002064]
.text:0000000140001032 DEC1                       faddp st(1)
.text:0000000140001034 D9C0                       fld st(0)
.text:0000000140001036 DD1D20100000               fstp qword ptr [0x14000205c]

RuiLoureiro

Quote from: hutch-- on August 14, 2018, 02:44:12 AM
Rui,

Is this what you mean? It produces the correct result and only loads st(0) at the end, as you suggested.

;----------------------------------------------------------------------
fpadd       MACRO  arg1, arg2
               fld     arg1
               fld     arg2
               faddp
ENDM
;----------------------------------------------------------------------
fpadd1     MACRO  arg1    ; let me say that i prefer fpadd and fpadd1
               fld    arg1
               faddp
ENDM
;+++++++++++++++++++++++++++++++++++++++
    .data
      fpbuff REAL8 0.0          ; we don't need to load this variable. If we want 0.0 we use fldz
                                        ; <---  this is only the output variable

      addme REAL8 111.111 ; <<<--- this is the input variable

    .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    LOCAL pbuf  :QWORD
    LOCAL buff[32]:BYTE

    mov pbuf, ptr$(buff)
    finit

                         ;fld fpbuff
                         ;fld addme
                         ;faddp


   ; start and add 2 arguments
   ;------------------------------
    fpadd   addme, addme   ; st(0)= 2*addme


                         ;fld addme
                         ;faddp


    ; now add another argument
    ;-------------------------------
    fpadd1  addme               ; st(0)= 3*addme

                         ;fld addme
                         ;faddp
                         ;fld st(0)    <<<<- THIS CREATES a new st(1)
                         ;              we should do it only if we need to go on with another operation
                         ;               that needs this value


    ;----remove st(0) to fpbuff and the FPU is cleaned------
    fstp    fpbuff
   
    invoke fptoa,fpbuff, pbuf  ; <<<--- convert to pbuf
    conout pbuf,lf

    waitkey
    .exit

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

No, it was not my suggestion. Try it now.

jj2007

fpadd MACRO arg1,arg2
  ifdifi <arg1>, <ST(0)>
if type(arg1) eq REAL4 or type(arg1) eq REAL8 or type(arg1) eq REAL10
fld arg1
else
fild arg1
endif
  endif
  if type(arg2) eq REAL4 or type(arg2) eq REAL8
fadd arg2              ; <<<<<<<<<<<<<< see remark by HSE above
  elseif type(arg2) eq REAL10
.err <REAL10 not allowed for second arg>
  else
fiadd arg2
  endif
ENDM


Only ST(7) is being used. Test code (MyDD are dwords, MyR4 is 1000.0):

  fpadd MyR8a, MyR8b
  Print Str$("MyR8a+MyR8b=%f\n", ST(0)v)

  fpadd MyR8a, MyDDb
  Print Str$("MyR8a+MyDDb=%f\n", ST(0)v)

  fpadd MyDDa, MyR8b
  Print Str$("MyDDa+MyR8b=%f\n", ST(0)v)

  fpadd MyDDa, MyDDb
  Print Str$("MyDDa+MyDDb=%f\n", ST(0))

  fpadd ST(0), MyR4
  Print Str$("ST(0)+MyR4= %f\n", ST(0)v)


Results:
MyR8a+MyR8b=777.7770
MyR8a+MyDDb=777.4560
MyDDa+MyR8b=777.3210
MyDDa+MyDDb=777.0000
ST(0)+MyR4= 7777.778


Testbed attached - MasmBasic, sorry.

Quote from: hutch-- on August 13, 2018, 05:25:12 PM
    fpadd st(0), addme

Implemented but the syntax is longer than fadd addme - a matter of taste maybe ;)

hutch--

Rui,

I tend to try things incrementally, first try was to remove the unnecessary st(0) loads which was your suggestion. Just remember I have not used this stuff for many years.

I will try out more of your suggestions as I find the instructions.

JJ,

WTF ?

hutch--

Here is the next try. Removal of the redundant variable, fninit to reset the FPU (I did not see the point of the fwait), and fldz as it is more efficient than loading the memory operand. The dot prefix in the two macros is only so they do not clash with existing ones.

    .ldst0 MACRO var
      fld st(0)                   ;; load st(0)
      fstp var                    ;; store it in variable
    ENDM

    .fpadd MACRO arg1
      fld arg1
      faddp
    ENDM

    .data
      addme REAL8 111.111

    .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    LOCAL pbuf  :QWORD
    LOCAL buff[32]:BYTE
    LOCAL fpval :REAL8

    mov pbuf, ptr$(buff)        ; get buffer pointer

    fninit                      ; clear FPU registers and flags
    fldz                        ; push 0.0 as the new st(0)

    .fpadd addme                ; add value to st(0)
    .fpadd addme
    .fpadd addme

    .ldst0 fpval                ; copy st(0) into variable
   
    invoke fptoa,fpval,pbuf     ; convert fpval to string
    conout pbuf,lf              ; display at console

    waitkey
    .exit

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    end

comment #

.text:0000000140001016 DBE3                       fninit
.text:0000000140001018 D9EE                       fldz
.text:000000014000101a DD053C100000               fld qword ptr [0x14000205c]
.text:0000000140001020 DEC1                       faddp st(1)
.text:0000000140001022 DD0534100000               fld qword ptr [0x14000205c]
.text:0000000140001028 DEC1                       faddp st(1)
.text:000000014000102a DD052C100000               fld qword ptr [0x14000205c]
.text:0000000140001030 DEC1                       faddp st(1)
.text:0000000140001032 D9C0                       fld st(0)
.text:0000000140001034 DD9D70FFFFFF               fstp qword ptr [rbp-0x90]

#
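
A possible alternative to .ldst0, sketched under the assumption that the target is a REAL4 or REAL8 variable: fst stores st(0) to memory without popping, so the extra fld st(0) is not needed. The macro name stst0 is hypothetical:

```asm
    stst0 MACRO var
      fst var                     ;; copy st(0) to var, stack unchanged
    ENDM
```

fst has no REAL10 form, so for a REAL10 target the fld st(0) / fstp pair would still be required.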

RuiLoureiro

Quote from: hutch-- on August 14, 2018, 03:46:13 AM
Rui,

I tend to try things incrementally, first try was to remove the unnecessary st(0) loads which was your suggestion. Just remember I have not used this stuff for many years.

I will try out more of your suggestions as I find the instructions.

JJ,

WTF ?
:biggrin:
No problem, I am only trying to give suggestions, nothing more.

About fpsqrt: sometimes we need to take the sqrt of st(0).
For example, to solve the equation A.x^2+B.x+C=0 we need to compute sqrt(B^2-4.A.C). So once we have st(0)=B^2-4.A.C we need to take the sqrt of st(0).
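
As a rough sketch of that sequence: coefA, coefB, coefC and four are hypothetical REAL8 variables, with four holding 4.0, and the stack is assumed to have at least two free registers:

```asm
    fld   coefB           ; st(0) = B
    fmul  coefB           ; st(0) = B*B
    fld   coefA           ; st(0) = A        st(1) = B*B
    fmul  coefC           ; st(0) = A*C      st(1) = B*B
    fmul  four            ; st(0) = 4*A*C    st(1) = B*B
    fsubp                 ; st(0) = B*B - 4*A*C
    fsqrt                 ; st(0) = sqrt(B*B - 4*A*C)
```

A negative discriminant gives a NaN from fsqrt when the invalid-operation exception is masked (the default after finit), so the sign would need checking before the fsqrt.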

HSE

@Hutch:

Why not:
    fld fpbuff

    fadd addme
    fadd addme
    fadd addme
    fld st(0)       ;   here you are also pushing st(0) to st(1)
    fstp fpbuff
    fstp st(0)      ;   <-- you forgot this (it removes the value that was in st(1))


@JJ:
I agree. It is possible to replace
elseif type(arg2) eq REAL10
.err <REAL10 not allowed for second arg>

with
elseif type(arg2) eq REAL10
fld arg2
        faddp     ;  or fsubp

But it is not recommended to use REAL10 for variables. The idea is to use it only to store FPU state.
