Title: ZEROLOCALES in 64 bits
Post by: TouEnMasm on July 18, 2020, 03:30:31 AM

Hello,
I am surprised that it seems to work on the first try with UASM.
Further improvements are welcome.

Quote
ZEROLOCALES MACRO dernierelocale:REQ
   mov r10,rsp      ;r10 = current stack pointer
   mov r11,rcx      ;preserve rcx
   lea rcx,dernierelocale
   sub rcx,r10      ;rcx = bytes between rsp and the last local
   shr rcx,3      ;divide by 8 for rep stosq
   mov r10,rdi      ;preserve rdi
   lea rdi,dernierelocale
   mov rax,0      ;fill value for stosq (rax is not preserved)
   rep stosq
   mov rdi,r10
   mov rcx,r11
ENDM

Quote
WinMain proc FRAME hInst:HINSTANCE,hPrevInst:HINSTANCE,CmdLine:LPSTR,CmdShow:DWORD
   LOCAL wc:WNDCLASSEX
   LOCAL msg:MSG
   LOCAL hwnd:HWND
   ZEROLOCALES hwnd
   comment µ
   ;mov rbp,rsp
   ;sub rsp,0F0h
   lea rax,hwnd         ;[rbp-88h]
   lea r10,wc            ;[rbp-50h]
   mov r11,rsp
   µ
   mov eax,wc.cbSize
   mov r10,hwnd
   or rax,r10
   .if rax == 0
      invoke MessageBox,NULL,TXT("reussite"),TXT("ZEROLOCALES"),MB_OK
   .endif
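
Editorial note: the commented-out lines above give enough numbers to check by hand what the macro clears. A minimal sketch in C, assuming the generated prologue really is mov rbp,rsp / sub rsp,0F0h and that hwnd sits at [rbp-88h], as those comments say. The function names are hypothetical and only model the address arithmetic, not the macro itself.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes written by "rep stosq" in the first ZEROLOCALES:
   (address of last local - rsp), rounded down to whole qwords.
   Both arguments are distances below rbp. */
static size_t cleared_bytes(size_t rsp_below_rbp, size_t last_local_below_rbp)
{
    assert(rsp_below_rbp >= last_local_below_rbp);
    return ((rsp_below_rbp - last_local_below_rbp) >> 3) << 3;
}

/* 1 if the byte at [rbp - below_rbp] lies in the cleared range, which
   starts at the last local and runs upward for cleared_bytes(). */
static int is_cleared(size_t rsp_below_rbp, size_t last_local_below_rbp,
                      size_t below_rbp)
{
    size_t n = cleared_bytes(rsp_below_rbp, last_local_below_rbp);
    return below_rbp <= last_local_below_rbp &&
           last_local_below_rbp - below_rbp < n;
}
```

With the values from the comments, cleared_bytes(0xF0, 0x88) is 0x68 (104 bytes). If wc occupies [rbp-50h, rbp), the locals span 0x88 (136) bytes, so wc.cbSize at [rbp-50h] is cleared but the bytes from [rbp-20h] up to rbp are not. Under these assumptions the test can pass without every local being zeroed.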

Title: Re: ZEROLOCALES in 64 bits
Post by: TouEnMasm on August 23, 2020, 04:24:57 PM

There are not only qwords on the stack, so:

Quote
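ZEROLOCALES MACRO dernierelocale:REQ
mov r10,rsp ;r10 = current stack pointer
mov r11,rcx ;preserve rcx
lea rcx,dernierelocale
sub rcx,r10 ;rcx = bytes between rsp and the last local
;shr rcx,3 ;no longer needed: the count stays in bytes
mov r10,rdi ;preserve rdi
lea rdi,dernierelocale
mov rax,0 ;fill value for stosb (only al is used)
rep stosb
mov rdi,r10
mov rcx,r11
ENDM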

Title: Re: ZEROLOCALES in 64 bits
Post by: jj2007 on August 23, 2020, 06:22:19 PM

So r10 and r11 will be trashed, right?

Title: Re: ZEROLOCALES in 64 bits
Post by: hutch-- on August 23, 2020, 08:55:17 PM

This is something that is done in some dialects of BASIC, but I wonder about its value in 64-bit assembler. It is simply more efficient to write the required values into each local variable, and it also allows you to write different-sized values to different-sized local variables. There are a few basic rules you need to keep in mind if you need the variables to remain aligned: start with the biggest ones, then smaller ones after them.

This way all the variables remain aligned for their respective data sizes.

Title: Re: ZEROLOCALES in 64 bits
Post by: jj2007 on August 24, 2020, 12:24:20 AM

I use ClearLocals (http://www.jj2007.eu/MasmBasicQuickReference.htm#Mb1254) very often, about 20 times in my library. Size-wise it's almost always useful:

Code

SayHi proc arg
Local v1, v2, v3, v4, rc:RECT, wc:WNDCLASSEX
  ClearLocals	; 5 bytes, clears all variables
  mov v1, 0	; 7 bytes for clearing a dword
  and v1, 0	; 4 bytes, clears one dword
  ret
SayHi endp

Performance-wise, clearing a fat 8k local buffer is possible (and still costs only 5 bytes), but I wouldn't do it for a procedure in an innermost loop.

For 64-bit Assembly, I would implement it in the PROLOGUE macro, and make sure that no registers are being trashed.

Title: Re: ZEROLOCALES in 64 bits
Post by: TouEnMasm on August 24, 2020, 03:09:04 PM

Quote
There are a few basic rules you need to keep in mind if you need the variables to remain aligned, start with the biggest ones, then smaller ones after them.

I don't see how writing to the locals could change their alignment. Could someone enlighten me?

Title: Re: ZEROLOCALES in 64 bits
Post by: TouEnMasm on April 21, 2021, 01:30:50 AM

Hello,
I am persisting. After further tests that did not go well, here is a solution.
The macro works out the exact space used by the locals, and needs to know the first and last local.

Quote
ZEROLOCAL64 MACRO firstlocal:REQ,lastlocal:REQ
Local etiquette1,etiquette2
   lea r10,firstlocal      ;e.g. 462f95ee90
   add r10,sizeof firstlocal   ;r10 = end of the first local
   lea r11,lastlocal         ;e.g. 462f95ee80
   sub r10,r11            ;r10 = count of bytes
   ;alternative at this point: invoke memset,addr lastlocal,0,r10d
   mov r11,r10
   and r10,0FFFFFFFFFFFFFFF8h ;r10 = count rounded down to a multiple of 8
   sub r11,r10            ;r11 = leftover bytes, 7 at most
   shr r10,3 ;divide by 8: r10 = count of qwords
   lea rax,lastlocal
   .if r10 != 0
      etiquette1:
      mov qword ptr [rax],0
      add rax,8
      dec r10
      jnz etiquette1
   .endif
   .if r11 != 0
      etiquette2:
      mov byte ptr [rax],0
      inc rax
      dec r11
      jnz etiquette2
   .endif

ENDM

Small modifications: replaced @@: with etiquette1 and etiquette2.
Used AND R10,0FFFFFFFFFFFFFFF8h instead of AND R10,0FFFFFFFFFFFFFFF0h, which lets the qword loop handle one more qword.
After more tests, it just works. firstlocal and lastlocal can be the same if there is only one local.
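
Editorial note: the split into qwords and leftover bytes can be modelled outside the assembler. A minimal sketch in C, not the macro itself; zero_span and tail_with_f0_mask are hypothetical names, and the buffer stands in for the span from lastlocal to the end of firstlocal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Zero byte_count bytes starting at lowest: whole qwords first, then
   the 0..7 leftover bytes, mirroring the two loops of ZEROLOCAL64. */
static void zero_span(unsigned char *lowest, size_t byte_count)
{
    size_t qwords = (byte_count & ~(size_t)7) >> 3; /* AND ...F8h, SHR 3 */
    size_t tail   = byte_count & 7;                 /* count - rounded   */
    const uint64_t zero = 0;

    assert(lowest != NULL || byte_count == 0);
    while (qwords-- > 0) {
        memcpy(lowest, &zero, sizeof zero);  /* mov qword ptr [rax],0 */
        lowest += 8;
    }
    while (tail-- > 0)
        *lowest++ = 0;                       /* mov byte ptr [rax],0  */
}

/* Leftover bytes if the mask were ...F0h instead: up to 15, all of
   them handled one at a time by the byte loop. */
static size_t tail_with_f0_mask(size_t byte_count)
{
    return byte_count & 15;
}
```

For a 28-byte span, the ...F8h mask leaves 4 bytes for the byte loop, while the ...F0h mask would leave 12.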

Title: Re: ZEROLOCALES in 64 bits
Post by: hutch-- on April 28, 2021, 08:19:06 PM

Yves,

LOCAL v1 :YMMWORD
LOCAL v2 :XMMWORD
LOCAL v3 :QWORD
LOCAL v4 :DWORD
LOCAL v5 :WORD
LOCAL v6 :BYTE

If the first LOCAL is aligned correctly, all of the following are as well.
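
Editorial note: the ordering rule can be checked with a little arithmetic. A minimal sketch in C, under simplifying assumptions that are not from the thread: locals are packed back to back with no padding, each one below the previous, starting from a frame base aligned for the largest type. all_aligned is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every local lands on a multiple of its own size.
   sizes[] lists the locals in declaration order; local i ends up at
   (base - offset), so with an aligned base it is aligned exactly
   when offset is a multiple of its size. */
static int all_aligned(const size_t *sizes, size_t n)
{
    size_t offset = 0;                 /* distance below the frame base */
    for (size_t i = 0; i < n; i++) {
        assert(sizes[i] != 0);
        offset += sizes[i];
        if (offset % sizes[i] != 0)
            return 0;
    }
    return 1;
}
```

With the sizes of the declarations above, {32, 16, 8, 4, 2, 1}, the offsets are 32, 48, 56, 60, 62 and 63, and each is a multiple of the matching size. Reversing the order puts the WORD at offset 3, which is misaligned.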

Title: Re: ZEROLOCALES in 64 bits
Post by: jj2007 on April 28, 2021, 08:57:40 PM

Quote from: TouEnMasm on April 21, 2021, 01:30:50 AM
need to know the first and last local.

If you write your own PROLOGUE, you don't need the first and last local. Even if you rely on the standard frame, only the last local is needed.

Attached is a testbed, no MasmBasic, just plain Masm64 SDK as developed by Hutch (I get the impression that uses ... does not work, though - see last line of the proc).

Code

mytest proc uses rsi rdi rbx arg1, arg2
Local rc:RECT
Local v1, v2
Local b1:BYTE

Title: Re: ZEROLOCALES in 64 bits
Post by: hutch-- on April 28, 2021, 09:08:17 PM

I don't have a use for it, but for clearing locals I doubt that you can improve on REP STOSB. All you need to know is the start address and the length. You should be able to do this in a stack frame design.

Title: Re: ZEROLOCALES in 64 bits
Post by: jj2007 on April 28, 2021, 09:11:43 PM

Quote from: hutch-- on April 28, 2021, 09:08:17 PM
I don't have a use for it, but for clearing locals I doubt that you can improve on REP STOSB. All you need to know is the start address and the length. You should be able to do this in a stack frame design.

I use and qword ptr [rax], see above. The advantage is that rcx (i.e. the first argument) is not needed, whereas rep stosb needs it for the count (and it might be a tick faster).

If uses is not implemented, can you issue a warning?

The MASM Forum