ZEROLOCALES in 64 bits

Started by TouEnMasm, July 18, 2020, 03:30:31 AM

TouEnMasm

Hello,
I am surprised that it seems to work on the first try with UASM.
Improvements are welcome.

Quote
ZEROLOCALES MACRO dernierelocale:REQ
   mov r10,rsp      ;
   mov r11,rcx      ;preserve rcx
   lea rcx,dernierelocale
   sub rcx,r10
   shr rcx,3      ;divide by 8 for rep
   mov r10,rdi      ;preserve rdi
   lea rdi,dernierelocale
   mov rax,0      ;zero for stosq
   rep stosq
   mov rdi,r10
   mov rcx,r11
ENDM   

Quote
WinMain proc FRAME hInst:HINSTANCE,hPrevInst:HINSTANCE,CmdLine:LPSTR,CmdShow:DWORD
   LOCAL wc:WNDCLASSEX
   LOCAL msg:MSG
   LOCAL hwnd:HWND
   ZEROLOCALES hwnd
   comment µ
   ;mov rbp,rsp
   ;sub rsp,0F0h
   lea rax,hwnd         ;[rbp-88h]
   lea r10,wc            ;[rbp-50h]
   mov r11,rsp
   µ
   mov eax,wc.cbSize
   mov r10,hwnd
   or rax,r10
   .if rax == 0
      invoke MessageBox,NULL,TXT("reussite"),TXT("ZEROLOCALES"),MB_OK
   .endif
Fa is a musical note to play with CL

TouEnMasm


There are not only qwords on the stack, so:
Quote
ZEROLOCALES MACRO dernierelocale:REQ
   mov r10,rsp      ;
   mov r11,rcx      ;preserve rcx
   lea rcx,dernierelocale
   sub rcx,r10
   ;shr rcx,3      ;divide by 8 for rep
   mov r10,rdi      ;preserve rdi
   lea rdi,dernierelocale
   mov rax,0      ;zero for stosb
   rep stosb
   mov rdi,r10
   mov rcx,r11
ENDM         

jj2007

So r10 and r11 will be trashed, right?

hutch--

This is something that is done in some dialects of BASIC, but I wonder about its value in 64-bit assembler. It is simply more efficient to write the required values into each local variable, and it also allows you to write different sized values to different sized local variables. There are a few basic rules to keep in mind if you need the variables to remain aligned: start with the biggest ones, then put the smaller ones after them.

This way all the variables remain aligned for their respective data sizes.
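As a rough illustration of writing the required value into each local instead of clearing the whole frame, here is a minimal sketch. It assumes the WinMain locals from the first post (wc, hwnd) and the usual Windows constants from the include files; the values chosen are only examples.

```asm
; Fragment for the body of WinMain, after the LOCAL declarations.
; Each store matches the size of the field or local it writes.
   mov wc.cbSize,SIZEOF WNDCLASSEX         ; DWORD field, DWORD immediate
   mov wc.style,CS_HREDRAW or CS_VREDRAW   ; DWORD field
   mov hwnd,0                              ; QWORD local, imm32 is sign-extended to 64 bits
```

No scratch registers are needed this way, but every local that must start at zero needs its own instruction.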

jj2007

I use ClearLocals very often, about 20 times in my library. Size-wise it's almost always useful:
SayHi proc arg
Local v1, v2, v3, v4, rc:RECT, wc:WNDCLASSEX
  ClearLocals ; 5 bytes, clears all variables
  mov v1, 0 ; 7 bytes for clearing a dword
  and v1, 0 ; 4 bytes, clears one dword
  ret
SayHi endp


Performance-wise, clearing a fat 8k local buffer is possible (and still costs only 5 bytes), but I wouldn't do it for a procedure in an innermost loop.

For 64-bit Assembly, I would implement it in the PROLOGUE macro, and make sure that no registers are being trashed.

TouEnMasm


Quote
There are a few basic rules you need to keep in mind if you need the variables to remain aligned, start with the biggest ones, then smaller ones after them.
I don't see how writing to the locals could change their alignment. I need some enlightenment.

TouEnMasm

Hello,
I am persisting: after further tests that did not go well, here is a solution.
The macro works out the exact space used by the locals and needs to know the first and last local.
Quote
ZEROLOCAL64 MACRO firstlocal:REQ,lastlocal:REQ
       Local etiquette1,etiquette2
   Lea r10,firstlocal      ;462f95ee90
   add r10,sizeof firstlocal
   lea r11,lastlocal         ;462f95ee80
   sub r10,r11            ;R10 the count of bytes
   ;R10 count in QWORD,R11 in byte --- invoke memset,addr lastlocal,0,r10d   
   mov R11,R10
   AND R10,0FFFFFFFFFFFFFFF8h       ;R11 max 7
   SUB R11,R10
   shr R10,3 ;div 8
   lea rax,lastlocal
   .if R10 != 0
      etiquette1:
      mov qword ptr [rax],0
      add rax,8
      dec R10
      jnz etiquette1
   .endif
   .if R11 != 0
      etiquette2:
      mov byte ptr [rax],0
      inc rax
      dec R11
      jnz etiquette2
   .endif
   
ENDM
Small modifications: replaced @@: with etiquette1 and etiquette2.
AND R10,0FFFFFFFFFFFFFFF8h instead of AND R10,0FFFFFFFFFFFFFFF0h, which gains one more qword.
After more tests, it just works. firstlocal and lastlocal can be the same if there is only one local.
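A possible call, sketched with the WinMain locals from the first post. It assumes the assembler places locals in declaration order at descending addresses, as the [rbp-50h] and [rbp-88h] comments in that post suggest, so the first declared local is firstlocal and the last declared one is lastlocal.

```asm
; Fragment only: the rest of WinMain is omitted.
WinMain proc FRAME hInst:HINSTANCE,hPrevInst:HINSTANCE,CmdLine:LPSTR,CmdShow:DWORD
   LOCAL wc:WNDCLASSEX      ; first declared local, highest address
   LOCAL msg:MSG
   LOCAL hwnd:HWND          ; last declared local, lowest address
   ZEROLOCAL64 wc,hwnd      ; firstlocal = wc, lastlocal = hwnd
   ; rax, r10 and r11 are modified by the macro; rcx and rdi are untouched
```

With wc at [rbp-50h] and hwnd at [rbp-88h], the macro would clear 88h bytes, that is 17 qwords with no remaining bytes.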

hutch--

Yves,

LOCAL v1 :YMMWORD
LOCAL v2 :XMMWORD
LOCAL v3 :QWORD
LOCAL v4 :DWORD
LOCAL v5 :WORD
LOCAL v6 :BYTE

If the first LOCAL is aligned correctly, all of the following are as well.
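The arithmetic behind this can be worked through on paper. The sketch below assumes a hypothetical address "top" that is a multiple of 32 and assumes each local is packed directly below the previous one with no padding; a real assembler may lay the frame out differently.

```asm
; Hypothetical layout, "top" is assumed to be 32-byte aligned.
LOCAL v1 :YMMWORD   ; starts at top-32, 32 bytes, 32 = 1*32
LOCAL v2 :XMMWORD   ; starts at top-48, 16 bytes, 48 = 3*16
LOCAL v3 :QWORD     ; starts at top-56,  8 bytes, 56 = 7*8
LOCAL v4 :DWORD     ; starts at top-60,  4 bytes, 60 = 15*4
LOCAL v5 :WORD      ; starts at top-62,  2 bytes, 62 = 31*2
LOCAL v6 :BYTE      ; starts at top-63,  1 byte
```

Each start offset is a multiple of the size of the variable, so every local is naturally aligned. Under the same packing assumption, declaring a BYTE first and a QWORD after it would put the QWORD at top-9, which is not a multiple of 8.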

jj2007

Quote from: TouEnMasm on April 21, 2021, 01:30:50 AM
need to know the first and last local.

If you write your own PROLOGUE, you don't need the first and last local. Even if you rely on the standard frame, only the last local is needed.

Attached is a testbed, no MasmBasic, just plain Masm64 SDK as developed by Hutch (I get the impression that uses ... does not work, though - see last line of the proc).

mytest proc uses rsi rdi rbx arg1, arg2
Local rc:RECT
Local v1, v2
Local b1:BYTE

hutch--

I don't have a use for doing it, but to clear locals I doubt that you can improve on REP STOSB. All you need to know is the start address and length. You should be able to do this in a stack frame design.
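A minimal sketch of that idea, given only a start address and a byte count. The macro name ZEROBYTES and both parameters are hypothetical, and it assumes the direction flag is clear, as the Win64 calling convention expects on entry to a procedure.

```asm
; Hypothetical macro: zero bytecount bytes starting at startaddr.
ZEROBYTES MACRO startaddr:REQ,bytecount:REQ
   mov r10,rdi            ; rdi is non-volatile in Win64, keep it in volatile r10
   mov r11,rcx            ; rcx may still hold the first argument
   lea rdi,startaddr      ; destination for stosb
   mov ecx,bytecount      ; writing ecx zero-extends into rcx
   xor eax,eax            ; al = 0 is the fill byte
   rep stosb              ; store rcx zero bytes, rdi moves upward
   mov rdi,r10            ; restore rdi
   mov rcx,r11            ; restore rcx
ENDM
```

A call such as ZEROBYTES wc,SIZEOF WNDCLASSEX would clear one structure. rax, r10 and r11 are modified, which is allowed for volatile registers but matters if they hold live values.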

jj2007

Quote from: hutch-- on April 28, 2021, 09:08:17 PM
I don't have a use for doing it, but to clear locals I doubt that you can improve on REP STOSB. All you need to know is the start address and length. You should be able to do this in a stack frame design.

I use and qword ptr [rax], see above. The advantage is that rcx (i.e. the first argument) is not needed, unlike with rep stosb (and it might be a tick faster).
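For readers without the attachment, a loop built on and qword ptr [rax] could look roughly like the sketch below. This is not the code from the testbed: the macro name CLEARQWORDS and its parameters are hypothetical, and it assumes the count is at least 1 and that r10 is free to use.

```asm
; Hypothetical macro: clear qwords quadwords starting at the lowest local.
CLEARQWORDS MACRO lowest:REQ,qwords:REQ
   LOCAL clearloop
   lea rax,lowest            ; address of the lowest local
   mov r10d,qwords           ; loop counter, must be at least 1
clearloop:
   and qword ptr [rax],0     ; 4 bytes, versus 7 for mov qword ptr [rax],0
   add rax,8
   dec r10d
   jnz clearloop
ENDM
```

rcx and rdi are never touched, so the first argument survives without being saved and restored.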

If uses is not implemented, can you issue a warning?