Simple floating point macros.

hutch-- · August 16, 2018, 06:51:32 AM

OK, that makes sense, I do it differently, single GLOBAL scope for when I need a global named variable and the FLT4/8/10 for when I need a unique variable attached to a local variable with local scope.

Siekmanski · August 16, 2018, 07:11:01 AM

fpercent MACRO num,pcnt     ;; Get Percentage of number
local F8rcp_100
    .const
    F8rcp_100 real8 0.01
    .code
    fld     pcnt            ;; load required percentage
    fld     num             ;; load the number
    fmul    F8rcp_100 
    fmulp                   ;; multiply by percentage
    EXITM   <st(0)>
    ENDM

hutch-- · August 16, 2018, 08:34:39 AM

Looks good Marinus, I confess being a maths illiterate that I don't know how reciprocals work.

Siekmanski · August 16, 2018, 08:44:29 AM

A reciprocal is 1/value

This is useful if you want to divide by a fixed number, say 100.0
Then the reciprocal is 1.0/100.0 = 0.01

Instead of dividing by 100.0 you multiply with 0.01
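
A minimal sketch of the same idea in plain MASM32, assuming the masm32rt.inc setup and the printf macro used elsewhere in this thread. The names numVal, hundred, rcp100, byDiv and byMul are made up for the illustration. Note that 0.01 has no exact binary representation, so the two results can differ in the last bit.

```masm
include \masm32\include\masm32rt.inc

.data
numVal   REAL8 250.0
hundred  REAL8 100.0
rcp100   REAL8 0.01          ; reciprocal of 100.0, rounded to REAL8
byDiv    REAL8 ?
byMul    REAL8 ?

.code
start:
    fld   numVal             ; st(0) = 250.0
    fdiv  hundred            ; st(0) = 250.0 / 100.0
    fstp  byDiv              ; store and pop

    fld   numVal             ; st(0) = 250.0
    fmul  rcp100             ; st(0) = 250.0 * 0.01
    fstp  byMul              ; store and pop

    printf("div: %f  mul: %f\n", byDiv, byMul)
    exit

end start
```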

RuiLoureiro · August 16, 2018, 08:48:22 AM

Quote from: hutch-- on August 16, 2018, 08:34:39 AM
Looks good Marinus, I confess being a maths illiterate that I don't know how reciprocals work.

fld1 ; st(0) = 1.0
fld num ; st(0) = num, st(1) = 1.0
fdivp ; st(0) = 1.0/num

The reciprocal of a fraction 4/5 is 5/4 (n/m <-> m/n).
The reciprocal of x^n is x^-n,
and the reciprocal of x^-n is x^n.
What is the reciprocal of 1/(1/2)? It is 1/2, because 1/(1/2) = 2.

hutch-- · August 16, 2018, 03:24:19 PM

It just happens to be that my formal logic is much better than my maths, I can in fact do most of the normal things and get reliable results but I don't really have a feel for maths, I just see it as a cipher for crunching numbers. The reciprocals look interesting in that with FP maths you can calculate the fractional sized numbers with a reasonably high degree of precision. In older integer code a mul is a lot faster than a div but I am not sure of the difference with later hardware where you have very fast FP processing units which I would imagine would be used for the old integer code.

I long ago used FP for some simple integer calculations related to the startup size of a window and they have been in the library since about 1998 but have had very little use of FP since then. Laziness meant if I wanted precision maths I had a compiler that routinely did 80 bit floating point calculations.

daydreamer · August 16, 2018, 06:26:12 PM

Quote from: hutch-- on August 16, 2018, 03:24:19 PM

It just happens to be that my formal logic is much better than my maths, I can in fact do most of the normal things and get reliable results but I don't really have a feel for maths, I just see it as a cipher for crunching numbers. The reciprocals look interesting in that with FP maths you can calculate the fractional sized numbers with a reasonably high degree of precision. In older integer code a mul is a lot faster than a div but I am not sure of the difference with later hardware where you have very fast FP processing units which I would imagine would be used for the old integer code.

I do this because I have been influenced by RCPPS/MULPS coding.
Don't forget the other reasons: you avoid a divide-by-zero error inside the loop, and one FDIV before the inner loop to calculate the reciprocal, followed by millions of FMULs, can matter.
Or use a combination of one FDIV reciprocal and an inner loop that uses lots of SIMD SSE/AVX code.
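
A rough x87 sketch of the "one FDIV before the inner loop" pattern, assuming the masm32rt.inc setup and printf macro used elsewhere in this thread. The names divisor, values, COUNT, rcp and quot are hypothetical. The divisor 8.0 is chosen because its reciprocal 0.125 is exact in binary; with other divisors the multiply may differ from a true divide in the last bit.

```masm
include \masm32\include\masm32rt.inc

.data
divisor  REAL8 8.0
values   REAL8 16.0, 40.0, 100.0, 2.0
COUNT    equ 4
rcp      REAL8 ?
quot     REAL8 ?

.code
start:
    fld1
    fdiv  divisor            ; st(0) = 1.0 / divisor, the only division
    fstp  rcp                ; keep the reciprocal in memory

    mov   esi, offset values
    mov   ebx, COUNT         ; ebx and esi survive the printf call
nextval:
    fld   REAL8 PTR [esi]    ; load the next value
    fmul  rcp                ; value * (1/divisor) instead of value / divisor
    fstp  quot               ; store and pop, FPU stack is empty again
    printf("%f\n", quot)
    add   esi, 8             ; advance to the next REAL8
    dec   ebx
    jnz   nextval

    exit

end start
```

If the divisor can be zero, it only needs to be checked once, before the reciprocal is calculated, rather than on every pass through the loop.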

jj2007 · August 16, 2018, 06:40:54 PM

Plain Masm32:

include \masm32\include\masm32rt.inc

_1byX macro num
  fld1
  if type(num) eq DWORD
	fidiv num
  else
	fdiv num
  endif
endm

.data
x100	dd 100
result	REAL8 ?

.code
start:

  _1byX FP8(0.25)
  fstp result
  printf("1/0.25=%f\n", result)

  _1byX x100
  fstp result
  printf("1/100=%f\n", result)

  exit

end start

(if your IDE doesn't know that the console must be kept open to see the result, insert an inkey before the exit)

hutch-- · August 17, 2018, 03:42:18 AM

This is the form for sequential additions: a 1-argument macro version, a manual version and a 2-argument macro version. The versions with fldz are how I would write loop code; for a single addition, the 2-argument macro version has an extra load but one less add.

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

include \masm32\include64\masm64rt.inc

fpinit MACRO ;; initialise the x87 co-processor
fninit
fldz
ENDM

fpadd MACRO arg1, arg2 ;; add a number
fld arg1
IFNB <arg2>
fld arg2
ENDIF
faddp
ENDM

.code

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

entry_point proc

LOCAL buff[32]:BYTE
LOCAL pbuf :QWORD
LOCAL addval:QWORD ; REAL8 ; either will do as a LOCAL
LOCAL rslt :QWORD ; REAL8

mrm addval, FLT8(111.111) ; get a pseudo immediate
mov pbuf, ptr$(buff) ; get buffer pointer

; -----------------------------
; start macro code
; -----------------------------
fpinit

fpadd addval ; sequential additions
fpadd addval
fpadd addval
fpadd addval

fstp rslt ; store result & pop

invoke fptoa,rslt,pbuf ; convert rslt to string
conout pbuf,lf ; display at console

; -----------------------------
; identical manual mnemonic code
; -----------------------------
fldz ; with FLDZ

fld addval
faddp
fld addval
faddp
fld addval
faddp
fld addval
faddp

fstp rslt ; store result & pop

invoke fptoa,rslt,pbuf ; convert rslt to string
conout pbuf,lf ; display at console

; -----------------------------
; alternate macro code - no fldz
; -----------------------------
fpadd addval, addval ; sequential additions
fpadd addval
fpadd addval

fstp rslt ; pop stack

invoke fptoa,rslt,pbuf ; convert rslt to string
conout pbuf,lf ; display at console

waitkey
.exit

entry_point endp

; ¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

end

RuiLoureiro · August 17, 2018, 04:01:58 AM

Hi Hutch
seems OK, all things are correct and FPU is cleaned at the end and at the end of each step (or case)

jj2007 · August 17, 2018, 07:03:11 AM

    fpinit MACRO                ;; initialise the x87 co-processor
      fninit
      fldz
    ENDM

The fldz does nothing useful other than wasting a register. No need for a macro, a simple finit does the job (fninit btw doesn't exist, ML.exe encodes it exactly as finit).

RuiLoureiro · August 17, 2018, 07:46:51 AM

Quote from: jj2007 on August 17, 2018, 07:03:11 AM
fpinit MACRO ;; initialise the x87 co-processor
fninit
fldz
ENDM

The fldz does nothing useful other than wasting a register. No need for a macro, a simple finit does the job (fninit btw doesn't exist, ML.exe encodes it exactly as finit).

Hi Jochen,
For me and for you it «does nothing», because we don't need to start a sum with 0, or with one register holding 0, and then add the next argument. But for someone who starts with sum=0 and then wants to add the next argument in a loop, it does.
To multiply [a b c d] by [A E I M] element by element and sum the products, I start with sum=a.A, then I get b and E, then b.E, and sum=a.A+b.E etc. inside a loop (sum=st(0)), but the first term is not inside the loop. This is the problem: if we don't know how to do the sum with a loop, we need to start with sum=0.
So we may have 2 solutions, and one of them is to start with 0.
Note: see my procedures to multiply matrices of any size with the FPU.
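
A small sketch of the first solution described above, with the first product a.A computed before the loop and the remaining terms added inside it. It assumes the masm32rt.inc setup and printf macro used elsewhere in this thread; the names rowv, colv, COUNT and dotp and the sample values are made up for the illustration and are not taken from the matrix procedures mentioned.

```masm
include \masm32\include\masm32rt.inc

.data
rowv     REAL8 1.0, 2.0, 3.0, 4.0     ; a b c d
colv     REAL8 5.0, 6.0, 7.0, 8.0     ; A E I M
COUNT    equ 4
dotp     REAL8 ?

.code
start:
    mov   esi, offset rowv
    mov   edi, offset colv

    fld   REAL8 PTR [esi]    ; st(0) = a
    fmul  REAL8 PTR [edi]    ; st(0) = a.A, the first term, outside the loop

    mov   ecx, COUNT-1       ; remaining terms
term:
    add   esi, 8             ; next row element
    add   edi, 8             ; next column element
    fld   REAL8 PTR [esi]    ; st(0) = b, st(1) = sum
    fmul  REAL8 PTR [edi]    ; st(0) = b.E
    faddp st(1), st(0)       ; sum = sum + b.E, pop
    dec   ecx
    jnz   term

    fstp  dotp               ; store and pop, FPU stack is empty again

    printf("dot=%f\n", dotp)
    exit

end start
```

The second solution would load 0.0 with fldz first and put all four products inside the loop, at the cost of one extra addition.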

jj2007 · August 17, 2018, 08:05:58 AM

Quote from: RuiLoureiro on August 17, 2018, 07:46:51 AM
for someone who starts with sum=0 and then wants to add the next argument in a loop, it does.

Yes, that's correct. And for somebody who starts a loop with multiplications, 1 should be in ST(0) at the beginning. But these cases have nothing to do with a macro called "fpinit". If you want to add several numbers in a loop, start with fldz before the loop but don't use a generic "fpinit" macro.
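
As an illustration of that point, here is a short sketch with fldz before an addition loop and fld1 before a multiplication loop, and no generic init macro. It assumes the masm32rt.inc setup and printf macro used elsewhere in this thread; the names vals, COUNT, fpSum and fpProd are hypothetical.

```masm
include \masm32\include\masm32rt.inc

.data
vals     REAL8 1.5, 2.0, 4.0
COUNT    equ 3
fpSum    REAL8 ?
fpProd   REAL8 ?

.code
start:
    fldz                     ; running sum starts at 0.0
    mov   esi, offset vals
    mov   ecx, COUNT
addloop:
    fadd  REAL8 PTR [esi]    ; st(0) = st(0) + vals[i]
    add   esi, 8
    dec   ecx
    jnz   addloop
    fstp  fpSum              ; store and pop

    fld1                     ; running product starts at 1.0
    mov   esi, offset vals
    mov   ecx, COUNT
mulloop:
    fmul  REAL8 PTR [esi]    ; st(0) = st(0) * vals[i]
    add   esi, 8
    dec   ecx
    jnz   mulloop
    fstp  fpProd             ; store and pop

    printf("sum=%f product=%f\n", fpSum, fpProd)
    exit

end start
```

With these sample values the sum should be 7.5 and the product 12.0.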

RuiLoureiro · August 17, 2018, 08:18:10 AM

Quote from: jj2007 on August 17, 2018, 08:05:58 AM
Quote from: RuiLoureiro on August 17, 2018, 07:46:51 AM
for someone who starts with sum=0 and then wants to add the next argument in a loop, it does.

Yes, that's correct. And for somebody who starts a loop with multiplications, 1 should be in ST(0) at the beginning. But these cases have nothing to do with a macro called "fpinit". If you want to add several numbers in a loop, start with fldz before the loop but don't use a generic "fpinit" macro.

Yes, the best way is to write rules to do sums, multiplications, etc. etc. inside loops ...

hutch-- · August 17, 2018, 09:37:21 AM

> The fldz does nothing useful other than wasting a register. No need for a macro, a simple finit does the job (fninit btw doesn't exist, ML.exe encodes it exactly as finit).

"fldz" does do something useful, it loads 0.0, I imagine that is why Intel provide the instruction.

> (fninit btw doesn't exist, ML.exe encodes it exactly as finit).

He he, you will have to upgrade to ML64, it produces the correct opcode DBE3h.

.text:0000000140001027 DBE3 fninit <<<< HERE !!!!
.text:0000000140001029 D9EE fldz
.text:000000014000102b DD8568FFFFFF fld qword ptr [rbp-0x98]
.text:0000000140001031 DEC1 faddp st(1)
.text:0000000140001033 DD8568FFFFFF fld qword ptr [rbp-0x98]
.text:0000000140001039 DEC1 faddp st(1)
.text:000000014000103b DD8568FFFFFF fld qword ptr [rbp-0x98]
.text:0000000140001041 DEC1 faddp st(1)
.text:0000000140001043 DD8568FFFFFF fld qword ptr [rbp-0x98]
.text:0000000140001049 DEC1 faddp st(1)
.text:000000014000104b DD9D60FFFFFF fstp qword ptr [rbp-0xa0]

The MASM Forum
