Simple floating point macros.

Started by hutch--, August 13, 2018, 04:36:54 PM

hutch--

OK, that makes sense, I do it differently, single GLOBAL scope for when I need a global named variable and the FLT4/8/10 for when I need a unique variable attached to a local variable with local scope.

Siekmanski


fpercent MACRO num,pcnt     ;; Get Percentage of number
local F8rcp_100
    .const
    F8rcp_100 real8 0.01
    .code
    fld     pcnt            ;; load required percentage
    fld     num             ;; load the number
    fmul    F8rcp_100
    fmulp                   ;; multiply by percentage
    EXITM   <st(0)>
    ENDM
Creative coders use backward thinking techniques as a strategy.

hutch--

Looks good Marinus, I confess being a maths illiterate that I don't know how reciprocals work.

Siekmanski

A reciprocal is 1/value

This is useful if you want to divide by a fixed number, say 100.0
Then the reciprocal is 1.0/100.0 = 0.01

Instead of dividing by 100.0 you multiply with 0.01
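
A minimal sketch of the idea above, assuming the same masm32rt.inc setup used elsewhere in this thread. The names amount, hundred, rcp100 and result are made up for illustration only. It computes the same quotient once with fdiv and once with fmul by the reciprocal.

```asm
include \masm32\include\masm32rt.inc

.data
amount  REAL8 250.0         ; hypothetical test value
hundred REAL8 100.0
rcp100  REAL8 0.01          ; reciprocal of 100.0, fixed at assembly time
result  REAL8 ?

.code
start:
  fld   amount              ; st(0) = 250.0
  fdiv  hundred             ; st(0) = 250.0 / 100.0
  fstp  result              ; store and pop, FPU stack is empty again
  printf("250/100  = %f\n", result)

  fld   amount              ; st(0) = 250.0
  fmul  rcp100              ; st(0) = 250.0 * 0.01
  fstp  result
  printf("250*0.01 = %f\n", result)

  exit

end start
```

Note that 0.01 cannot be represented exactly in binary floating point, so the two results may differ in the last bits even though they print the same with %f.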

RuiLoureiro

Quote from: hutch-- on August 16, 2018, 08:34:39 AM
Looks good Marinus, I confess being a maths illiterate that I don't know how reciprocals work.
:biggrin:
fld1          ; st(0) = 1.0
fld   num     ; st(0) = num, st(1) = 1.0
fdivp         ; st(0) = 1.0/num

The reciprocal of a fraction 4/5 is 5/4 (n/m <-> m/n).
The reciprocal of x^n is x^-n,
and the reciprocal of x^-n is x^n.
What is the reciprocal of 1/(1/2)? Since 1/(1/2) = 2, it is 1/2.

hutch--

 :biggrin:

It just happens to be that my formal logic is much better than my maths, I can in fact do most of the normal things and get reliable results but I don't really have a feel for maths, I just see it as a cipher for crunching numbers. The reciprocals look interesting in that with FP maths you can calculate the fractional sized numbers with a reasonably high degree of precision. In older integer code a mul is a lot faster than a div but I am not sure of the difference with later hardware where you have very fast FP processing units which I would imagine would be used for the old integer code.

I long ago used FP for some simple integer calculations related to the startup size of a window and they have been in the library since about 1998 but have had very little use of FP since then. Laziness meant if I wanted precision maths I had a compiler that routinely did 80 bit floating point calculations.

daydreamer

Quote from: hutch-- on August 16, 2018, 03:24:19 PM
:biggrin:

It just happens to be that my formal logic is much better than my maths, I can in fact do most of the normal things and get reliable results but I don't really have a feel for maths, I just see it as a cipher for crunching numbers. The reciprocals look interesting in that with FP maths you can calculate the fractional sized numbers with a reasonably high degree of precision. In older integer code a mul is a lot faster than a div but I am not sure of the difference with later hardware where you have very fast FP processing units which I would imagine would be used for the old integer code.

I do this because I have been influenced by RCPPS/MULPS coding.
Don't forget the other reasons: you avoid a divide-by-zero error, and one FDIV before the inner loop to calculate the reciprocal, followed by millions of FMULs, can matter.
Or use a combination of one FDIV reciprocal and an inner loop that uses lots of SIMD SSE/AVX code.
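
A rough sketch of the "one FDIV before the inner loop" pattern, assuming plain Masm32 with masm32rt.inc. The array values and the names values, divisor are hypothetical. The reciprocal is computed once, stays in the FPU, and every element is then scaled with FMUL.

```asm
include \masm32\include\masm32rt.inc

.data
values  REAL8 10.0, 20.0, 30.0, 40.0    ; hypothetical data, scaled in place
divisor REAL8 8.0

.code
start:
  fld1
  fdiv  divisor             ; st(0) = 1.0/divisor, the only division
  xor   ecx, ecx            ; ecx = element index
@@:
  fld   values[ecx*8]       ; st(0) = element, st(1) = reciprocal
  fmul  st(0), st(1)        ; st(0) = element * (1.0/divisor)
  fstp  values[ecx*8]       ; store and pop, reciprocal is st(0) again
  inc   ecx
  cmp   ecx, 4
  jb    @B
  fstp  st(0)               ; discard the reciprocal, FPU stack is empty

  printf("%f %f %f %f\n", values[0], values[8], values[16], values[24])
  exit

end start
```

The printing is kept outside the loop so that no library call runs while the reciprocal is still sitting on the FPU stack.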
my none asm creations
https://masm32.com/board/index.php?topic=6937.msg74303#msg74303
I am an Invoker
"An Invoker is a mage who specializes in the manipulation of raw and elemental energies."
Like SIMD coding

jj2007

Plain Masm32:

include \masm32\include\masm32rt.inc

_1byX macro num
  fld1
  if type(num) eq DWORD
    fidiv num
  else
    fdiv num
  endif
endm

.data
x100 dd 100
result REAL8 ?

.code
start:

  _1byX FP8(0.25)
  fstp result
  printf("1/0.25=%f\n", result)

  _1byX x100
  fstp result
  printf("1/100=%f\n", result)

  exit

end start

(if your IDE doesn't know that the console must be kept open to see the result, insert an inkey before the exit)

hutch--

This is the form for sequential additions, shown three ways: the 1-argument macro, manual mnemonics, and the 2-argument macro. The versions with fldz are how I would write loop code; for a single addition, the 2-argument macro version has an extra load but 1 less add.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    include \masm32\include64\masm64rt.inc

    fpinit MACRO                ;; initialise the x87 co-processor
      fninit
      fldz
    ENDM

    fpadd MACRO arg1, arg2      ;; add a number
      fld arg1
        IFNB <arg2>
          fld arg2
        ENDIF
      faddp
    ENDM

    .code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

    LOCAL buff[32]:BYTE
    LOCAL pbuf  :QWORD
    LOCAL addval:QWORD ; REAL8  ; either will do as a LOCAL
    LOCAL rslt  :QWORD ; REAL8

    mrm addval, FLT8(111.111)   ; get a pseudo immediate
    mov pbuf, ptr$(buff)        ; get buffer pointer

  ; -----------------------------
  ; start macro code
  ; -----------------------------
    fpinit

    fpadd addval                ; sequential additions
    fpadd addval
    fpadd addval
    fpadd addval

    fstp rslt                   ; store result & pop

    invoke fptoa,rslt,pbuf      ; convert rslt to string
    conout pbuf,lf              ; display at console

  ; -----------------------------
  ; identical manual mnemonic code
  ; -----------------------------
    fldz                        ; with FLDZ

    fld addval
    faddp
    fld addval
    faddp
    fld addval
    faddp
    fld addval
    faddp

    fstp rslt                   ; store result & pop

    invoke fptoa,rslt,pbuf      ; convert rslt to string
    conout pbuf,lf              ; display at console

  ; -----------------------------
  ; alternate macro code - no fldz
  ; -----------------------------
    fpadd addval, addval        ; sequential additions
    fpadd addval
    fpadd addval

    fstp rslt                   ; store result & pop

    invoke fptoa,rslt,pbuf      ; convert rslt to string
    conout pbuf,lf              ; display at console

    waitkey
    .exit

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

    end

RuiLoureiro

 :biggrin: 
Hi Hutch
    seems OK, all things are correct and FPU is cleaned at the end and at the end of each step (or case)  :eusa_clap:

jj2007

    fpinit MACRO                ;; initialise the x87 co-processor
      fninit
      fldz
    ENDM


The fldz does nothing useful other than wasting a register. No need for a macro, a simple finit does the job (fninit btw doesn't exist, ML.exe encodes it exactly as finit).

RuiLoureiro

Quote from: jj2007 on August 17, 2018, 07:03:11 AM
    fpinit MACRO                ;; initialise the x87 co-processor
      fninit
      fldz
    ENDM


The fldz does nothing useful other than wasting a register. No need for a macro, a simple finit does the job (fninit btw doesn't exist, ML.exe encodes it exactly as finit).
Hi Jochen,
For you and me it «does nothing», because we don't need to start a sum with 0, or with a register holding 0, and then add the next argument, but for someone that starts with sum=0 and then he wants to add the next argument in a loop , it does.
To multiply [a  b  c  d] by [A  E  I  M] element by element and sum the products, I start with sum=a.A, then I take b and E,
compute b.E and get sum=a.A+b.E, etc., inside a loop (sum=st(0)), but the first product is not inside the loop. This is the problem. If we don't know how to do the sum with a loop, we need to start with sum=0.
That seems to be the problem. So we have 2 solutions, and one of them is to start with 0.
Note: see my procedures to multiply matrices of any size with the FPU.
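
A small sketch of the "start with 0" solution for the sum of products described above, assuming plain Masm32 with masm32rt.inc. The vectors and the names vecA, vecB, dotprod are invented for the example. Because the running sum starts at 0.0, the first product goes through the same loop body as all the others.

```asm
include \masm32\include\masm32rt.inc

.data
vecA    REAL8 1.0, 2.0, 3.0, 4.0    ; hypothetical [a b c d]
vecB    REAL8 5.0, 6.0, 7.0, 8.0    ; hypothetical [A E I M]
dotprod REAL8 ?

.code
start:
  fldz                      ; st(0) = running sum = 0.0
  xor   ecx, ecx            ; ecx = element index
@@:
  fld   vecA[ecx*8]         ; st(0) = a, st(1) = sum
  fmul  vecB[ecx*8]         ; st(0) = a*A
  faddp st(1), st(0)        ; sum += a*A, pop the product
  inc   ecx
  cmp   ecx, 4
  jb    @B
  fstp  dotprod             ; 1*5 + 2*6 + 3*7 + 4*8 = 70, FPU stack is empty

  printf("sum of products = %f\n", dotprod)
  exit

end start
```

The other solution, computing a.A before the loop and looping over the remaining three pairs, saves one add at the cost of a special first step.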

jj2007

Quote from: RuiLoureiro on August 17, 2018, 07:46:51 AM
for someone that starts with sum=0 and then he wants to add the next argument in a loop , it does.

Yes, that's correct. And for somebody who starts a loop with multiplications, 1 should be in ST(0) at the beginning. But these cases have nothing to do with a macro called "fpinit". If you want to add several numbers in a loop, start with fldz before the loop but don't use a generic "fpinit" macro.
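
A sketch of the multiplication case mentioned here, assuming plain Masm32 with masm32rt.inc. The data and the names factors, prod are made up. The neutral element 1.0 is loaded with fld1 right before the loop instead of inside a generic init macro.

```asm
include \masm32\include\masm32rt.inc

.data
factors REAL8 1.5, 2.0, 4.0     ; hypothetical data
prod    REAL8 ?

.code
start:
  fld1                      ; st(0) = 1.0, neutral element for multiplication
  xor   ecx, ecx            ; ecx = element index
@@:
  fmul  factors[ecx*8]      ; st(0) = st(0) * factors[ecx]
  inc   ecx
  cmp   ecx, 3
  jb    @B
  fstp  prod                ; 1.5 * 2.0 * 4.0 = 12, FPU stack is empty

  printf("product = %f\n", prod)
  exit

end start
```

For a sum, the same loop works with fldz in place of fld1 and fadd in place of fmul.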

RuiLoureiro

Quote from: jj2007 on August 17, 2018, 08:05:58 AM
Quote from: RuiLoureiro on August 17, 2018, 07:46:51 AM
for someone that starts with sum=0 and then he wants to add the next argument in a loop , it does.

Yes, that's correct. And for somebody who starts a loop with multiplications, 1 should be in ST(0) at the beginning. But these cases have nothing to do with a macro called "fpinit". If you want to add several numbers in a loop, start with fldz before the loop but don't use a generic "fpinit" macro.
Yes, the best way is to write rules to do sums, multiplications, etc. etc. inside loops ...

hutch--

 :biggrin:

> The fldz does nothing useful other than wasting a register. No need for a macro, a simple finit does the job (fninit btw doesn't exist, ML.exe encodes it exactly as finit).

"fldz" does do something useful, it loads 0.0, I imagine that is why Intel provide the instruction.

> (fninit btw doesn't exist, ML.exe encodes it exactly as finit).

He he, you will have to upgrade to ML64, it produces the correct opcode DBE3h.


.text:0000000140001027 DBE3                       fninit     <<<< HERE !!!!
.text:0000000140001029 D9EE                       fldz
.text:000000014000102b DD8568FFFFFF               fld qword ptr [rbp-0x98]
.text:0000000140001031 DEC1                       faddp st(1)
.text:0000000140001033 DD8568FFFFFF               fld qword ptr [rbp-0x98]
.text:0000000140001039 DEC1                       faddp st(1)
.text:000000014000103b DD8568FFFFFF               fld qword ptr [rbp-0x98]
.text:0000000140001041 DEC1                       faddp st(1)
.text:0000000140001043 DD8568FFFFFF               fld qword ptr [rbp-0x98]
.text:0000000140001049 DEC1                       faddp st(1)
.text:000000014000104b DD9D60FFFFFF               fstp qword ptr [rbp-0xa0]
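
For reference, a tiny sketch that puts both mnemonics side by side, assuming plain Masm32 with masm32rt.inc. The byte values in the comments are the encodings documented in the Intel manual, not the output of any particular assembler, so what ML.exe or ML64 actually emits should be checked in a listing or disassembly like the one above.

```asm
include \masm32\include\masm32rt.inc

.code
start:
  finit                     ; documented encoding: 9B DB E3 (wait, then fninit)
  fninit                    ; documented encoding: DB E3 (no wait prefix)
  exit

end start
```

Both forms reset the FPU to the same state. The difference is that finit first checks for pending unmasked floating-point exceptions.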