Simple floating point macros.

Started by hutch--, August 13, 2018, 04:36:54 PM

hutch--

This is the version I am adding to the macro file.

    fpavrg MACRO arg1, args:VARARG  ;; average a list of arguments
      LOCAL cnt,var
      cnt = argcount(args)
      cnt = cnt + 1
      fld arg1
      FOR arg, <args>
        fadd arg
      ENDM
      .data
        var dq cnt
      .code
      fild var
      fdivp
    ENDM
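
For illustration, a call such as "fpavrg a, b, c" should expand to roughly the following. This is a hand-written sketch, not assembler output: a, b and c are hypothetical REAL4 or REAL8 memory variables, and ??0000 stands in for whatever unique name the assembler generates for the LOCAL var.

    ; hand expansion of: fpavrg a, b, c   (a, b, c are assumed names)
    fld a              ; ST(0) = a
    fadd b             ; ST(0) = a + b
    fadd c             ; ST(0) = a + b + c
    .data
      ??0000 dq 3      ; cnt = argcount(b, c) + 1 = 3, stored as a 64-bit integer
    .code
    fild ??0000        ; ST(0) = 3.0, ST(1) = sum
    fdivp              ; ST(0) = sum / 3, one stack slot left in use

The macro therefore needs two free FPU stack slots while it runs and leaves the average in ST(0).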

HSE

An option I would prefer, for no specific reason:

    fpavrg MACRO args:VARARG  ;; average a list of arguments
      LOCAL cnt,var
      cnt = 0
      FOR arg, <args>
        if cnt eq 0
          fld arg
        else
          fadd arg
        endif
        cnt = cnt + 1
      ENDM
      .data
        var dq cnt
      .code
      fild var
      fdivp
    ENDM


Is it possible to measure the time required for macro preprocessing?

A minute later: perhaps I just don't like making two loops when it is possible to make only one.
Equations in Assembly: SmplMath

jj2007

Quote from: HSE on August 19, 2018, 10:00:51 AM
Is it possible to measure the time required for macro preprocessing?

In theory, yes. In practice, it's pretty irrelevant if your 10,000 lines take 300 or 301 milliseconds to build. A single macro will add nanoseconds only.

Proposal:

    fpavrg MACRO arg1, args:VARARG  ;; average a list of arguments
      LOCAL cnt,var
      cnt = 1
      fld arg1
      FOR arg, <args>
        cnt = cnt + 1
        fadd arg
      ENDM
      .data
        var dw cnt
      .code
      fild var
      fdivp
    ENDM


- doesn't depend on the argcount macro
- the trick with var d? cnt is nice, but a WORD is enough; ML64.exe chokes already if you arrive at column 200, i.e. at around 100 arguments if they are all single letters. Snippet for testing:

x REAL4 1.1111
.data?
result REAL8 ?
.code
  fpavrg x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x, x
  fstp result

hutch--

> ML64.exe chokes already if you arrive at column 200

You will find that it's the command line limit that stops more arguments. If the code design is that bad, the simple solution is to use more than one macro. If you are using very large lists, you would use a procedure and feed the list to it as an array.

I have never really cared about assembly time. Unlike the old compilers, the assemblers are not slow anyway. What I do look at is the generated code, which is what matters.

jj2007

Quote from: hutch-- on August 19, 2018, 11:47:00 AM
> ML64.exe chokes already if you arrive at column 200

You will find that it's the command line limit that stops more arguments.

UAsm stops at roughly column 900, but in any case it's irrelevant; no sane person would squeeze so many arguments into a macro. That's why the WORD is definitely enough.

hutch--

> That's why the WORD is definitely enough.

Except that you put the .data section out of alignment, and since the next item in the .data section then needs to be aligned, you don't gain anything.

K_F

Another point to consider is that if you're using a lot of these macros across modules/procedures, you might want to save/restore the FPU state.
I always do this with a long sequence of FPU operations and/or when crossing modules/procedures, otherwise strict manual monitoring of registers used.
;)
'Sire, Sire!... the peasants are Revolting !!!'
'Yes, they are.. aren't they....'

jj2007

Quote from: hutch-- on August 19, 2018, 03:19:56 PM
Except that you put the .data section out of alignment

Does that mean you systematically use, in all your code
.DATA
align 16
bla db "oops", 0

?

Quote from: K_F on August 19, 2018, 04:36:54 PM
Another point to consider is that if you're using a lot of these macros across modules/procedures, you might want to save/restore the FPU state.
I always do this with a long sequence of FPU operations and/or when crossing modules/procedures, otherwise strict manual monitoring of registers used.
;)
Do any of the macros above not restore the FPU state for ST(0)...ST(6)? Please show, because that would imply they are buggy. What is "strict manual monitoring of registers" btw?

hutch--

> Does that mean you systematically use, in all your code .....

in 64 bit YES.

  align 16
  item dq 0    ; aligned
  item dq 0    ; aligned
  witm dw 0    ; aligned by at least 2
  item dq 0    ; mis aligned

The magic rule: if you are going to use short .data(?) components, as MASM allows,

.data
align 16
dataitem dq 0
.code

Ensure you align it.

K_F

Quote from: jj2007 on August 19, 2018, 05:42:44 PM
Do any of the macros above not restore the FPU state for ST(0)...ST(6)? Please show, because that would imply they are buggy. What is "strict manual monitoring of registers" btw?
??
Bad morning for you ? :)

As an extreme example, say you have all 7 registers loaded with valid values.. one more load would corrupt the barrel.
This can happen across procedures and/or modules.

From what I see, none of the macros has FSAVE or FRSTOR type instructions. This is not necessary for short sequences, but it is advisable to have a start and an end macro that one uses for a batch of FPU operations.
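
A minimal sketch of such a start/end pair, assuming 32-bit code and a single global save area. FpuBegin, FpuEnd and fpustate are hypothetical names, not part of the MASM32 macro file.

    ; Sketch only: FpuBegin, FpuEnd and fpustate are assumed names.
    .data?
      fpustate db 108 dup (?)   ; FSAVE image is 108 bytes in 32-bit protected mode
    .code

    FpuBegin MACRO
      fsave fpustate            ;; store the complete FPU state, then reinitialise the FPU
    ENDM

    FpuEnd MACRO
      frstor fpustate           ;; reload the state stored by FpuBegin
    ENDM

It would be used around a batch like this (a, b, c and result are assumed variables):

    FpuBegin
    fpavrg a, b, c
    fstp result                 ; results must go to memory before FpuEnd
    FpuEnd

Since FSAVE leaves the FPU empty and FRSTOR overwrites the whole register stack, anything still on the stack at FpuEnd is lost. The single global buffer also means the pair cannot be nested.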
;)

jj2007

Quote from: hutch-- on August 19, 2018, 08:58:20 PM
> Does that mean you systematically use, in all your code .....

in 64 bit YES.

  align 16
  item1 dq 0    ; aligned
  item2 dq 0    ; MIS-aligned for movaps & friends
  witm dw 0    ; aligned by at least 2
  item dq 0    ; mis aligned

Data alignment is important, of course. But for a movaps xmm0 to fail, you need an OWORD. Your second item is already misaligned, in SIMD speak, but it wouldn't matter because movlps xmm0, item2 does not need 16-byte alignment.

In any case, in my own code I wouldn't add align 16 before each and every BYTE-sized variable. I do that on a needs basis.

hutch--

 :biggrin:

    .avxdata SEGMENT align(64)
      avx2a YMMWORD ?
      avx2b YMMWORD ?
      avx2c YMMWORD ?
      avx2d YMMWORD ?
      avx2e YMMWORD ?
      avx2f YMMWORD ?
      avx2g YMMWORD ?
      avx2h YMMWORD ?
    .avxdata ENDS


RuiLoureiro

Quote from: jj2007 on August 20, 2018, 12:18:13 AM
Quote from: hutch-- on August 19, 2018, 08:58:20 PM
> Does that mean you systematically use, in all your code .....

in 64 bit YES.

  align 16
  item1 dq 0    ; aligned
  item2 dq 0    ; MIS-aligned for movaps & friends
  witm dw 0    ; aligned by at least 2
  item dq 0    ; mis aligned

Data alignment is important, of course. But for a movaps xmm0 to fail, you need an OWORD. Your second item is already misaligned, in SIMD speak, but it wouldn't matter because movlps xmm0, item2 does not need 16-byte alignment.

In any case, in my own code I wouldn't add align 16 before each and every BYTE-sized variable. I do that on a needs basis.
BYTE-sized variables? I prefer dword even if the value is only 0 or 1. :P
I have never used OWORD; it's too long. I prefer dd,dd,dd,dd,...  :biggrin:

hutch--

Funny part is that 64 bit MASM used XMMWORD and YMMWORD.  :P

daydreamer

I have a question for Hutch, JJ and other macro experts:

    cnt = 1
    fld arg1
    FOR arg, <args>
      cnt = cnt + 1
      fadd arg

Are these variables in macros restricted to 32-bit integers, or can you write macros that do floating-point math inside them too?
Or do you need to solve it by doing the calculation in fixed point? I was thinking of ways of creating a LUT at assembly time.
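
As far as I know, MASM's assembly-time variables (name = expression) hold integers only, so a fixed-point table is one way to do it. A minimal sketch of what I mean, where sqtab, STEPS and the 16.16 format are assumptions chosen for illustration:

    ; Sketch only: sqtab, STEPS and the 16.16 fixed-point format are assumed.
    STEPS = 16
    .data
    align 4
    sqtab LABEL DWORD
    i = 0
    REPEAT STEPS
      dd (i * i * 65536) / (STEPS * STEPS)   ; (i/STEPS)^2 as 16.16 fixed point
      i = i + 1
    ENDM
    .code

Each entry is built from integer arithmetic during assembly, so nothing is computed at run time. An entry can be read with something like "mov eax, sqtab[ecx*4]" and scaled by 1/65536 if a floating-point value is needed.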
my non-asm creations
https://masm32.com/board/index.php?topic=6937.msg74303#msg74303
I am an Invoker
"An Invoker is a mage who specializes in the manipulation of raw and elemental energies."
Like SIMD coding