Simple floating point macros.

Started by hutch--, August 13, 2018, 04:36:54 PM


RuiLoureiro

Quote from: hutch-- on August 14, 2018, 04:24:42 AM
Here is the next try. Removal of the redundant variable, fninit to reset the FPU (I did not see the point of the fwait) and fldz, as it is more efficient than loading the memory operand. The dot prefix in the two macros is only so they do not clash with the existing ones.

    .ldst0 MACRO var
      fld st(0)                   ;; load st(0)  <<<- makes a copy of st(0);
                                  ;;                  the previous st(0) is now st(1)
      fstp var                    ;; store it in variable <<<-- and remove st(0);
                                  ;;                  the previous st(1) is st(0) again
    ENDM

    .fpadd MACRO arg1
      fld arg1
      faddp
    ENDM

    .data
      addme REAL8 111.111

    .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    LOCAL pbuf  :QWORD
    LOCAL buff[32]:BYTE
    LOCAL fpval :REAL8

    mov pbuf, ptr$(buff)        ; get buffer pointer           <<<< YES

    fninit                      ; clear FPU registers and flags    <<<< YES
    fldz                        ; zero st(0)                               <<<< YES

    .fpadd addme                ; add value to st(0)            <<<< YES
    .fpadd addme
    .fpadd addme
   ;-----------------------------------------------------------
   ; This is "get a copy of st(0)" and store it into variable
   ;-----------------------------------------------------------
    .ldst0 fpval                ; load st(0) into variable

    ;----------------------------------------------------------
    ; At this point the FPU still holds the original value in st(0).
    ; To remove it, do:   fstp   st
    ;----------------------------------------------------------
    invoke fptoa,fpval,pbuf     ; convert fpval to string         <<<< YES
    conout pbuf,lf              ; display at console                    <<<< YES

    waitkey
    .exit

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    end

comment #

.text:0000000140001016 DBE3                       fninit
.text:0000000140001018 D9EE                       fldz
.text:000000014000101a DD053C100000               fld qword ptr [0x14000205c]
.text:0000000140001020 DEC1                       faddp st(1)
.text:0000000140001022 DD0534100000               fld qword ptr [0x14000205c]
.text:0000000140001028 DEC1                       faddp st(1)
.text:000000014000102a DD052C100000               fld qword ptr [0x14000205c]
.text:0000000140001030 DEC1                       faddp st(1)
.text:0000000140001032 D9C0                       fld st(0)
.text:0000000140001034 DD9D70FFFFFF               fstp qword ptr [rbp-0x90]

#

That's OK, with one small problem  :t

HSE

Hutch!! Why are you not sleeping? :biggrin:

    .ldst0 MACRO var
      fld st(0)                   ;; load st(0)
      fstp var                    ;; store it in variable
    ENDM

It's just:
             fst var    ; store st(0) in variable
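
A sketch of why the single fst is equivalent, with the register stack traced in the comments. The labels x and y are only illustrative stand-ins for whatever the stack holds, and var is assumed to be a REAL4 or REAL8 variable.

```asm
    ; on entry:          st(0)=x  st(1)=y

    ; two-instruction form
    fld st(0)          ; st(0)=x  st(1)=x  st(2)=y   (a copy is pushed)
    fstp var           ; var=x, then pop:  st(0)=x  st(1)=y

    ; one-instruction form, same end state
    fst var            ; var=x, no pop:    st(0)=x  st(1)=y
```

One caveat: fst has no 80-bit memory form, so for a REAL10 variable the fld st(0) / fstp var pair is still how a copy gets stored. The two-instruction form also needs one free register for the temporary copy.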

Equations in Assembly: SmplMath

hutch--

 :biggrin:

He he, I will be shortly.

Rui,

Is this what you meant ?

    .ldst0 MACRO var
      fld st(0)                   ;; load st(0)
      fstp var                    ;; store it in variable
      fstp st(0)                  ;; pop st(0)
    ENDM

RuiLoureiro

Quote from: hutch-- on August 14, 2018, 04:56:03 AM
:biggrin:

He he, I will be shortly.

Rui,

Is this what you meant ?

    .ldst0 MACRO var
      fld st(0)               ;; load st(0)   <<<<<-- better described as "get a copy of st(0)":
                              ;; the current st(0) is copied to a new st(0), and the previous
                              ;; one becomes st(1), equal to st(0)
      fstp var                ;; store it in variable
      fstp st(0)              ;; pop st(0)   <<<<-- that is, remove the current st(0)
    ENDM

No, we only need fstp var, not .ldst0. There is no need to add fld st(0) and then do fstp st(0) at the end.
So use only fstp var.

fld st(0) is a trap, because we think we are loading st(0). But at the time we write fld st(0) there is already an st(0), and fld st(0) makes a copy of it in a new st(0). The previous st(0) is now st(1), and it is equal to st(0).
Remember that when we need a copy of st(1), st(2) etc., we do fld st(1), fld st(2) etc.
The copy becomes the new st(0), the previous st(0) is now st(1), st(1) is now st(2), st(2) is now st(3) and so on.

About the code: we don't need to start with fldz. A better way is to start with the two-argument
fpadd addme, addme and then continue with .fpadd addme.
fldz is for when we want to compare st(0) with 0.0 (or with another register).

Also don't forget that when we remove st(0), the previous st(1) becomes the new st(0). If there was no previous st(1), the FPU stack is now empty (no values inside).
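
The renumbering described above can be traced step by step. This is only a sketch: a, b and c stand for three values assumed to be on the stack already, and result is a hypothetical REAL8 variable.

```asm
    ; on entry:          st(0)=a  st(1)=b  st(2)=c
    fld st(2)          ; st(0)=c  st(1)=a  st(2)=b  st(3)=c   (copy of c pushed)
    fstp st(0)         ; st(0)=a  st(1)=b  st(2)=c            (copy discarded)
    faddp              ; st(0)=a+b  st(1)=c                   (st(1)=st(1)+st(0), pop)
    faddp              ; st(0)=a+b+c
    fstp result        ; result=a+b+c, the stack is now empty
```

Every push moves each existing value one position up, and every pop moves each one back down, so the three loads are balanced by the two faddp and the final fstp.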

HSE

There is no point in making a macro to load or to store st(0). It makes a little more sense if you want to retrieve another FPU register:

    .ldst MACRO register, var
      fld st(&register)           ;; load st(?)
      fstp var                    ;; store it in variable
    ENDM

Use:

    fst var1          ; better than  .ldst 0, var1
    .ldst 1, var2
    .ldst 2, var3

The final fstp is not in the macro.

Normal code:

    fld fpbuff
    fadd addme
    fadd addme
    fadd addme
    fstp fpbuff

Not normal code (but no error):

    fld fpbuff
    fadd addme
    fadd addme
    fadd addme
    fst fpbuff
    fstp


Strange Hutch idea:
    fld fpbuff
    fadd addme
    fadd addme
    fadd addme
    .ldst 0, fpbuff
    fstp ; most assemblers know that this is fstp st(0)


jj2007

Quote from: hutch-- on August 14, 2018, 03:46:13 AM
WTF ?

Test them. Same syntax as the original version, just a bit more versatile because you can use REAL* and/or dword variables.

hutch--

I think I have got the swing of most of it. Effectively the x87 register stack functions like a circular buffer, and the trick is to ensure that you do not unbalance the FPU stack. When I am a little more awake I will add the version of the test code that looks close to being reasonably efficient for what need to be simple-to-use macros.

To Rui and HSE, thank you both for your assistance in blowing out the cobwebs. I looked at the date of the last FP stuff I did and it's around the year 2000, so it really has been a long time since I wrote any x87 code.

raymond

Hutch,

Sorry for being a bit harsh, but unfortunately you are playing with fire trying to "use" the FPU without knowing what you are doing.

One of the biggest traps in doing so is that FPU registers are very different from the ALU registers: they CANNOT BE OVERWRITTEN with new data except in very specific circumstances. Trying to load new data when all 8 registers are full results in GARBAGE. That is why keeping track of register usage is extremely important, and offering macros which leave results on the FPU may not be the best tool for FPU newbies.
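
A minimal sketch of the overflow case, assuming a fresh fninit (which masks all exceptions) so that the failure is silent. REPEAT is the standard MASM repeat block; nothing here comes from the SDK macros.

```asm
    fninit                      ; empty the stack, mask all exceptions
    REPEAT 8
      fldz                      ; fill all 8 registers with 0.0
    ENDM
    fld1                        ; ninth push: stack overflow. st(0) is now the
                                ; "indefinite" QNaN, not 1.0, and one of the
                                ; eight zeros has been overwritten
    fninit                      ; reset before doing anything else
```

With the invalid-operation exception masked, as it is after fninit, nothing traps; the bad value simply propagates through every later calculation.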

Emulating what HLLs do with floats would be a better idea, i.e. do calculations on data from memory and return the result immediately to a memory variable, leaving all FPU registers EMPTY at the termination of each macro. The other option is to insert a 'slow' finit at the start of each macro to ensure that the user will never have any problem.
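
One way the memory-to-memory approach could look, as a sketch only. The macro name fpaddm and its argument names are hypothetical and not part of the SDK; src1 and dest may be REAL4, REAL8 or REAL10, while src2 must be REAL4 or REAL8 because fadd has no 80-bit memory form.

```asm
    ; hypothetical macro: dest = src1 + src2, all three are memory variables
    fpaddm MACRO dest, src1, src2
      fld src1                    ;; st(0) = src1
      fadd src2                   ;; st(0) = src1 + src2
      fstp dest                   ;; store the result and pop
    ENDM
```

The macro needs one free register while it runs and leaves the stack exactly as deep as it found it, so a call such as fpaddm total, total, addme (with total a REAL8 variable) can be repeated without tracking register usage.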

If you intend to continue with this project, you may want to:
i) have at least a quick glance at the tutorial you had asked me to prepare many moons ago,
ii) consider including the use of floats interacting with integers,
iii) design macros which will cater for the multitude of combinations of the size of each variable.
Whenever you assume something, you risk being wrong half the time.
http://www.ray.masmcode.com

hutch--

Hi Ray,

I have already taken option 1 and have the link set up in my browser. The only 64 bit conversions I could find in the C runtime are set at REAL8, so at the moment I have 2 functions, atofp and fptoa, that successfully convert REAL8 in both directions. As the x87 capacity is not specified in Win64, my choice is to try and use it OR simply ignore it, and as x87 capacity is useful for people who want floating point maths rather than video, it's probably worth a try to get at least some simple macros going.

I am already using the 8 MMX registers for other tasks, as MMX is a redundant technology and the methods of using the shared registers are straightforward enough, but with a choice of "play with fire" or ignore x87 code, I will at least give it a try as no one else is going to do it in 64 bit MASM. For what it's worth, the macros and support code are testing up OK at the moment. Both Rui and HSE have been very helpful in tidying up the test pieces I have posted, and once I have a bit more code up and running I will give it a serious bashing to make sure it works correctly.

hutch--

This is the first test piece with the later macros. It cannot be built as there are macros that have not been published yet but it works fine and handles 500 million iterations with no problems so the stack does not go BANG.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include64\masm64rt.inc

    fpinit MACRO
      fninit                      ;; clear FPU registers and flags
      fldz                        ;; zero st(0)
    ENDM

  ; -------------------------------

    fpdiv MACRO arg1,arg2
      fld arg1
      fld arg2
      fdivp
    ENDM

    fpmul MACRO arg1,arg2
      fld arg1
      fld arg2
      fmulp
    ENDM

    fpadd MACRO arg1
      fld arg1
      faddp
    ENDM

    fpsub MACRO arg1
      fld arg1
      fsubp
    ENDM

  ; -------------------------------

    fpsqrt MACRO number,target
      fld number
      fsqrt
      fstp target
    ENDM

    .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    LOCAL fpval :REAL8
    LOCAL pbuf  :QWORD
    LOCAL buff[32]:BYTE

    mov pbuf, ptr$(buff)        ; get buffer pointer

    addme = FLT8(1.0)           ; statement form

    fpinit                      ; initialise FPU & set st(0) to 0.0

  ; -----------------------------

    mov r11, 100000000          ; 100 million iterations, 500 million macro calls
  @@:
    fpadd addme                 ; add value to st(0)
    fpadd addme
    fpadd addme
    fpadd addme
    fpadd addme
    sub r11, 1
    jnz @B

  ; -----------------------------

    fpsub FLT8(1.0)             ; function form

    fstp fpval                  ; store st(0) in variable and pop it
   
    invoke fptoa,fpval,pbuf     ; convert fpval to string
    conout pbuf,lf              ; display at console

    waitkey
    .exit

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    end

comment #

.text:0000000140001016 DBE3                       fninit
.text:0000000140001018 D9EE                       fldz
.text:000000014000101a 49C7C300E1F505             mov r11, 0x5f5e100
.text:0000000140001021
.text:0000000140001021 0x140001021:
.text:0000000140001021 DD0539100000               fld qword ptr [0x140002060]
.text:0000000140001027 DEC1                       faddp st(1)
.text:0000000140001029 DD0531100000               fld qword ptr [0x140002060]
.text:000000014000102f DEC1                       faddp st(1)
.text:0000000140001031 DD0529100000               fld qword ptr [0x140002060]
.text:0000000140001037 DEC1                       faddp st(1)
.text:0000000140001039 DD0521100000               fld qword ptr [0x140002060]
.text:000000014000103f DEC1                       faddp st(1)
.text:0000000140001041 DD0519100000               fld qword ptr [0x140002060]
.text:0000000140001047 DEC1                       faddp st(1)
.text:0000000140001049 4983EB01                   sub r11, 0x1
.text:000000014000104d 75D2                       jne 0x140001021
.text:000000014000104d
.text:000000014000104f DD0513100000               fld qword ptr [0x140002068]
.text:0000000140001055 DEE9                       fsubp st(1)
.text:0000000140001057 DD5D98                     fstp qword ptr [rbp-0x68]

#

raymond

Quote
    fpdiv MACRO arg1,arg2
      fld arg1
      fld arg2
      fdivp
    ENDM

How will the assembler know the actual size of arg1 and/or arg2 (i.e. REAL4, REAL8 or REAL10) in order to produce the appropriate code?
Or is it your intention to limit the use to only one specific size? And where would that be specified in clear terms for someone who may know absolutely nothing about floats apart from that it contains a decimal point followed by some decimal digits?

hutch--

I thought that would be obvious: FLD works on REAL4, REAL8 and REAL10, so the macro does not need the size information. The data item size is determined by how the argument is produced. Given that x87 instructions do not support immediate values, an immediate value must be written to a data variable in the initialised data section, where you must specify the size.

You can write everything else as LOCAL values, "LOCAL var :REAL10".

The only data size limitation I have at the moment is the C runtime conversions that only handle REAL4 and REAL8 but have no 80 bit support.
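
To illustrate the point about operand size, a small sketch: the assembler picks the encoding from the declared type of each variable. The variable names are made up for the example.

```asm
    .data
      val4  REAL4  1.5
      val8  REAL8  1.5
      val10 REAL10 1.5

    .code
      fld val4                  ; D9 /0   32-bit load
      fld val8                  ; DD /0   64-bit load
      fld val10                 ; DB /5   80-bit load
      faddp                     ; register form, operand size no longer matters
      faddp                     ; st(0) = 4.5
      fstp val10                ; DB /7   80-bit store and pop, stack balanced
```

This holds for fld and fstp only. fst and the memory forms of fadd, fsub, fmul and fdiv accept REAL4 and REAL8 but not REAL10, which is one reason a load followed by faddp, as in the macros, works for all three sizes.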

RuiLoureiro

Quote from: hutch-- on August 14, 2018, 01:23:41 PM
This is the first test piece with the later macros. It cannot be built as there are macros that have not been published yet but it works fine and handles 500 million iterations with no problems so the stack does not go BANG.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include64\masm64rt.inc

    fpinit MACRO
      fninit                      ;; clear FPU registers and flags
      fldz                        ;; zero st(0)
    ENDM

  ; -------------------------------

    fpdiv MACRO arg1,arg2
      fld arg1
      fld arg2
      fdivp
    ENDM

    fpmul MACRO arg1,arg2
      fld arg1
      fld arg2
      fmulp
    ENDM

    fpadd MACRO arg1,arg2
      fld arg1
      fld arg2
      faddp
    ENDM

    fpsub MACRO arg1,arg2
      fld arg1
      fld arg2
      fsubp
    ENDM

    .fpdiv MACRO arg1
      fld arg1
      fdivp
    ENDM

    .fpmul MACRO arg1
      fld arg1
      fmulp
    ENDM

    .fpadd MACRO arg1
      fld arg1
      faddp
    ENDM

    .fpsub MACRO arg1
      fld arg1
      fsubp
    ENDM
  ; -------------------------------

    fpsqrt MACRO number,target
      fld number
      fsqrt
      fstp target
    ENDM

    .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    LOCAL fpval :REAL8
    LOCAL pbuf  :QWORD
    LOCAL buff[32]:BYTE

    mov pbuf, ptr$(buff)        ; get buffer pointer

    addme = FLT8(1.0)           ; statement form

    ;fpinit                      ; initialise FPU & set st(0) to 0.0
     finit                        ; initialise FPU
  ; -----------------------------

    mov r11, 100000000          ; 100 million iterations, 500 million macro calls

    fpadd addme,addme           ; st(0)=addme+addme=2*addme

@@:
    .fpadd addme
    .fpadd addme
    .fpadd addme
;    .fpadd addme
    sub r11, 1
    jnz @B

  ; -----------------------------

    .fpsub FLT8(1.0)             ; function form

    fstp fpval                  ; store st(0) in variable and pop it
   
    invoke fptoa,fpval,pbuf     ; convert fpval to string
    conout pbuf,lf              ; display at console

    waitkey
    .exit

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    end

comment #

.text:0000000140001016 DBE3                       fninit
.text:0000000140001018 D9EE                       fldz
.text:000000014000101a 49C7C300E1F505             mov r11, 0x5f5e100
.text:0000000140001021
.text:0000000140001021 0x140001021:
.text:0000000140001021 DD0539100000               fld qword ptr [0x140002060]
.text:0000000140001027 DEC1                       faddp st(1)
.text:0000000140001029 DD0531100000               fld qword ptr [0x140002060]
.text:000000014000102f DEC1                       faddp st(1)
.text:0000000140001031 DD0529100000               fld qword ptr [0x140002060]
.text:0000000140001037 DEC1                       faddp st(1)
.text:0000000140001039 DD0521100000               fld qword ptr [0x140002060]
.text:000000014000103f DEC1                       faddp st(1)
.text:0000000140001041 DD0519100000               fld qword ptr [0x140002060]
.text:0000000140001047 DEC1                       faddp st(1)
.text:0000000140001049 4983EB01                   sub r11, 0x1
.text:000000014000104d 75D2                       jne 0x140001021
.text:000000014000104d
.text:000000014000104f DD0513100000               fld qword ptr [0x140002068]
.text:0000000140001055 DEE9                       fsubp st(1)
.text:0000000140001057 DD5D98                     fstp qword ptr [rbp-0x68]

#

Hi Hutch,
This code works and it is correct, no doubt about it. But we never need to start with fldz, so we are adding that instruction for nothing.
If we define fpadd and fpsub the same way as fpmul and fpdiv, and add the one-argument versions .fpadd, .fpsub, .fpmul and .fpdiv, we can start with a two-argument macro and continue with the one-argument macros. That is what I did above. You may try to run it. It should work correctly, and the FPU is clean at the end.



hutch--

Thanks Rui, I will give it a blast a bit later. This is the latest one. I think the first one is the better of the two, but the commented out one works OK as well. I have found another use for "fldz": if the results of following calculations are turning out wrong, place a fldz before them, and if that corrects the following result, the calculation before it needs to be fixed.

    fpercent MACRO num,pcnt     ;; Get Percentage of number
     fld num                    ;; load the number
     fld FLT8(100.0)            ;; load the 100 divider
     fdivp                      ;; divide num by 100
     fld pcnt                   ;; load required percentage
     fmulp                      ;; multiply by percentage
    ENDM

;     fpercent MACRO num,pcnt     ;; Get Percentage of number
;      fld num                    ;; load the number
;      fld pcnt                   ;; load required percentage
;      fmulp                      ;; multiply by percentage
;      fld FLT8(100.0)            ;; load the 100 divider
;      fdivp                      ;; divide the product by 100
;     ENDM

RuiLoureiro

fpercent seems to be OK, it doesn't give any problem.
Let me say that when we want to use an integer constant, we can load it this way:

fpconst    macro   cst
push       cst                   ; e.g. 100, pushed as a 64-bit value in 64 bit code
fild       dword ptr [rsp]       ; load the low 32 bits as an integer into st(0)
pop        rax                   ; remove from stack (rax is overwritten)
endm

so we may do this also

fld num
fpconst 100
fdivp
fld   pcnt
fmulp    ; st(0)= (num/100)*pcnt

Another way for fpercent:

fld pcnt            ; load pcnt            st(0)=pcnt
fld num             ; load num             st(0)=num    st(1)=pcnt
fld FLT8(100.0)     ; load constant 100    st(0)=100.0  st(1)=num  st(2)=pcnt
fdivp               ; st(0)=num/100.0      st(1)=pcnt   (100.0 removed)
fmulp               ; st(0)=(num/100.0)*pcnt            (pcnt removed)

Note: this last code seems to be better because it first loads the 3 factors and then does the operations. The macros you have written will be efficient code!  :biggrin:
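
A worked instance of that last sequence with concrete numbers, as a sketch. The variables hundred and result are assumptions added for the example (a plain data variable is used in place of FLT8), and the values were chosen so that every step is exact.

```asm
    .data
      num     REAL8 250.0
      pcnt    REAL8 20.0
      hundred REAL8 100.0
      result  REAL8 0.0

    .code
      fld pcnt                  ; st(0)=20.0
      fld num                   ; st(0)=250.0  st(1)=20.0
      fld hundred               ; st(0)=100.0  st(1)=250.0  st(2)=20.0
      fdivp                     ; st(0)=2.5    st(1)=20.0   (250.0/100.0)
      fmulp                     ; st(0)=50.0                (2.5*20.0)
      fstp result               ; result=50.0, the stack is empty again
```

Three loads are matched by two popping operations and one popping store, so the sequence leaves the FPU as it found it: 20 percent of 250 ends up in result.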