Hello,
I am surprised that it seems to work on the first try with UASM.
Further improvements are welcome.
Quote
ZEROLOCALES MACRO dernierelocale:REQ
mov r10,rsp
mov r11,rcx ;preserve rcx
lea rcx,dernierelocale
sub rcx,r10
shr rcx,3 ;divide by 8 for rep stosq
mov r10,rdi ;preserve rdi
lea rdi,dernierelocale
mov rax,0 ;zero value for stosq
rep stosq
mov rdi,r10
mov rcx,r11
ENDM
Quote
WinMain proc FRAME hInst:HINSTANCE,hPrevInst:HINSTANCE,CmdLine:LPSTR,CmdShow:DWORD
LOCAL wc:WNDCLASSEX
LOCAL msg:MSG
LOCAL hwnd:HWND
ZEROLOCALES hwnd
comment µ
;mov rbp,rsp
;sub rsp,0F0h
lea rax,hwnd ;[rbp-88h]
lea r10,wc ;[rbp-50h]
mov r11,rsp
µ
mov eax,wc.cbSize
mov r10,hwnd
or rax,r10
.if rax == 0
invoke MessageBox,NULL,TXT("reussite"),TXT("ZEROLOCALES"),MB_OK
.endif
There are not only qwords on the stack, so:
Quote
ZEROLOCALES MACRO dernierelocale:REQ
mov r10,rsp
mov r11,rcx ;preserve rcx
lea rcx,dernierelocale
sub rcx,r10
;shr rcx,3 ;not needed: rep stosb counts bytes
mov r10,rdi ;preserve rdi
lea rdi,dernierelocale
mov rax,0 ;zero value for stosb
rep stosb
mov rdi,r10
mov rcx,r11
ENDM
So r10 and r11 will be trashed, right?
This is something that is done in some dialects of BASIC, but I wonder about its value in 64-bit assembler. It is simply more efficient to write the required values into each local variable, and it also allows you to write different-sized values to different-sized local variables. There are a few basic rules you need to keep in mind if you need the variables to remain aligned, start with the biggest ones, then smaller ones after them.
This way all the variables remain aligned for their respective data sizes.
I use ClearLocals (http://www.jj2007.eu/MasmBasicQuickReference.htm#Mb1254) very often, about 20 times in my library. Size-wise it's almost always useful:
SayHi proc arg
Local v1, v2, v3, v4, rc:RECT, wc:WNDCLASSEX
ClearLocals ; 5 bytes, clears all variables
mov v1, 0 ; 7 bytes for clearing a dword
and v1, 0 ; 4 bytes, clears one dword
ret
SayHi endp
Performance-wise, clearing a fat 8k local buffer is possible (and still costs only 5 bytes), but I wouldn't do it for a procedure in an innermost loop.
For 64-bit Assembly, I would implement it in the PROLOGUE macro, and make sure that no registers are being trashed.
Quote
There are a few basic rules you need to keep in mind if you need the variables to remain aligned, start with the biggest ones, then smaller ones after them.
I don't see how writing to the locals could change their alignment. I need some enlightenment.
Hello,
I persisted: after further tests that did not go well, here is a solution.
The macro works out the exact space used by the locals, and needs to be given the first and the last local.
Quote
ZEROLOCAL64 MACRO firstlocal:REQ,lastlocal:REQ
Local etiquette1,etiquette2
lea r10,firstlocal ;e.g. 462f95ee90
add r10,sizeof firstlocal
lea r11,lastlocal ;e.g. 462f95ee80
sub r10,r11 ;R10 = total count of bytes
;below: R10 = count of qwords, R11 = count of remaining bytes --- alternative: invoke memset,addr lastlocal,0,r10d
mov R11,R10
AND R10,0FFFFFFFFFFFFFFF8h ;round down to a multiple of 8
SUB R11,R10 ;R11 = remainder, at most 7
shr R10,3 ;divide by 8
lea rax,lastlocal
.if R10 != 0
etiquette1:
mov qword ptr [rax],0
add rax,8
dec R10
jnz etiquette1
.endif
.if R11 != 0
etiquette2:
mov byte ptr [rax],0
inc rax
dec R11
jnz etiquette2
.endif
ENDM
Small modifications: replaced @@: with etiquette1 and etiquette2,
and used AND R10,0FFFFFFFFFFFFFFF8h instead of AND R10,0FFFFFFFFFFFFFFF0h, which gains one more qword.
After more tests, it just works. firstlocal and lastlocal can be the same if there is only one local.
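To make the arithmetic in ZEROLOCAL64 easier to check, here is a minimal sketch in C rather than assembly. The function name zero_region is hypothetical and not part of the macro. It assumes only what the macro does: split the byte count into whole qwords plus a tail of at most 7 bytes, then clear the qwords first and the tail bytes after.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper mirroring the two loops of ZEROLOCAL64:
   clear total_bytes bytes starting at start. */
void zero_region(unsigned char *start, size_t total_bytes)
{
    size_t qwords = total_bytes >> 3; /* like AND ...F8h followed by SHR 3 */
    size_t tail   = total_bytes & 7;  /* like R11 - R10, at most 7 */
    const uint64_t zero = 0;

    /* The two parts must add up to the original count. */
    assert(qwords * 8 + tail == total_bytes);

    while (qwords != 0) {             /* first loop: one qword per pass */
        memcpy(start, &zero, sizeof zero);
        start += 8;
        qwords--;
    }
    while (tail != 0) {               /* second loop: one byte per pass */
        *start++ = 0;
        tail--;
    }
}
```

With a count of 21 bytes, for example, this gives two qword stores followed by five byte stores, and nothing beyond the 21st byte is touched.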
Yves,
LOCAL v1 :YMMWORD
LOCAL v2 :XMMWORD
LOCAL v3 :QWORD
LOCAL v4 :DWORD
LOCAL v5 :WORD
LOCAL v6 :BYTE
If the first LOCAL is aligned correctly, all of the following are as well.
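The rule can be checked with a small model in C. This is only a sketch under stated assumptions: locals are packed downward from the frame base with no padding inserted by the assembler, every size is a power of two, and the function name first_misaligned is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: place each local directly below the previous one,
   starting at frame_base, without padding.
   Returns the index of the first local whose address is not a multiple
   of its own size, or -1 if every local is naturally aligned. */
int first_misaligned(uintptr_t frame_base, const size_t *sizes, size_t count)
{
    uintptr_t addr = frame_base;
    for (size_t i = 0; i < count; i++) {
        /* Assumption: sizes are non-zero powers of two. */
        assert(sizes[i] != 0 && (sizes[i] & (sizes[i] - 1)) == 0);
        addr -= sizes[i];             /* the stack grows downward */
        if (addr % sizes[i] != 0)
            return (int)i;
    }
    return -1;
}
```

With a frame base of 1000h and the sizes 32, 16, 8, 4, 2, 1 from the list above, every local lands on a multiple of its own size. With the order 1, 8 instead, the qword ends up at an address that is not a multiple of 8.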
Quote from: TouEnMasm on April 21, 2021, 01:30:50 AM
need to know the first and last local.
If you write your own PROLOGUE, you don't need the first and last local. Even if you rely on the standard frame, only the last local is needed.
Attached is a testbed, no MasmBasic, just plain Masm64 SDK as developed by Hutch (I get the impression that uses ... does not work, though - see last line of the proc).
mytest proc uses rsi rdi rbx arg1, arg2
Local rc:RECT
Local v1, v2
Local b1:BYTE
I don't have a use for doing it, but for clearing locals I doubt that you can improve on REP STOSB. All you need to know is the start address and length. You should be able to do this in a stack frame design.
Quote from: hutch-- on April 28, 2021, 09:08:17 PM
I don't have a use for doing it, but for clearing locals I doubt that you can improve on REP STOSB. All you need to know is the start address and length. You should be able to do this in a stack frame design.
I use and qword ptr [rax], see above. The advantage is that rcx (i.e. the first argument) is not needed, as in rep stosb (and it might be a tick faster).
If uses is not implemented, can you issue a warning?