Hi all,
I'm building (with a macro) a procedure with 14 arguments and 2 local variables, but 40h is written instead of 80h.
It should be this (or so):
00007FF8721237EE | 48 89 4C 24 08 | mov qword ptr ss:[rsp+8],rcx | ;ResGuard.asm:819
00007FF8721237F3 | 48 89 54 24 10 | mov qword ptr ss:[rsp+10],rdx | ;[rsp+10]:RichEditANSIWndProc
00007FF8721237F8 | 4C 89 44 24 18 | mov qword ptr ss:[rsp+18],r8 |
00007FF8721237FD | 4C 89 4C 24 20 | mov qword ptr ss:[rsp+20],r9 |
00007FF872123802 | 48 83 EC 80 | sub rsp,80 |
00007FF872123806 | 48 8D 6C 24 80 | lea rbp,qword ptr ss:[rsp+80] |
*********
00007FF872123A63 | 5D | pop rbp |
but the result is:
00007FF8721237EE | 48 89 4C 24 08 | mov qword ptr ss:[rsp+8],rcx | ;ResGuard.asm:819
00007FF8721237F3 | 48 89 54 24 10 | mov qword ptr ss:[rsp+10],rdx | ;[rsp+10]:RichEditANSIWndProc
00007FF8721237F8 | 4C 89 44 24 18 | mov qword ptr ss:[rsp+18],r8 |
00007FF8721237FD | 4C 89 4C 24 20 | mov qword ptr ss:[rsp+20],r9 |
00007FF872123802 | 48 83 EC 40 | sub rsp,40 |
00007FF872123806 | 48 8D 6C 24 40 | lea rbp,qword ptr ss:[rsp+40] |
*********
00007FF872123A63 | 5D | pop rbp |
Apparently an optimization is at work because the arguments are not used directly by invoke.
Is it possible to prevent that, or is the only way to write my own prologue and epilogue?
Thanks in advance, HSE.
Hi HSE
Quote from: HSE on August 19, 2023, 06:24:22 AM
Apparently an optimization is at work because the arguments are not used directly by invoke.
Just as an idea, try to reference all the arguments and see if anything changes.
Biterider
Hi HSE,
What are your UASM options at the beginning of your code? For example:
Option Stackbase:rsp
Option Frame:auto
Option Win64:11
Hi Vortex,
Option Stackbase:rbp
Option Frame:auto
Option Win64:8
Quote from: HSE on August 19, 2023, 06:24:22 AM
I'm building (with a macro) a procedure with 14 arguments and 2 local variables, but 40h is written instead of 80h.
40h is more than enough for two local variables. Btw your code is inefficient:
invoke MyTest, 11111111h, 22222222h, 33333333h, 44444444h, 55555555h, 66666666h
...
MyTest proc arg1, arg2, arg3, arg4, arg5, arg6
Local loc1, loc2, loc3, loc4, loc5, loc6
mov eax, loc1
mov eax, loc6
mov edx, arg1
mov edx, arg6
ret
MyTest endp
int3 |
mov [rsp+28],66666666 |
mov [rsp+20],55555555 |
mov r9d,44444444 |
mov r8d,33333333 |
mov edx,22222222 |
mov ecx,11111111 |
call 1400011E8 |
...
push rbp |
mov rbp,rsp | this should be on top, because
mov [rbp+10],rcx | mov [rbp+x] is systematically one
mov [rbp+18],rdx | byte shorter than mov [rsp+x]
mov [rbp+20],r8 |
mov [rbp+28],r9 |
sub rsp,A0 | exaggerated ;-)
mov eax,[rbp-4] |
mov eax,[rbp-18] |
mov edx,[rbp+10] |
mov edx,[rbp+38] |
leave |
ret |
Hi Biterider,
Quote from: Biterider on August 19, 2023, 06:43:51 AM
Just as an idea, try to reference all the arguments and see if anything changes.
Ok
Quote from: jj2007 on August 19, 2023, 07:47:33 AM
40h is more than enough for two local variables. Btw your code is inefficient:
But not for arguments.
The model is:
00007FF6DF084C82 | 48 89 4C 24 08 | mov qword ptr ss:[rsp+8],rcx | ;test1.asm:74
00007FF6DF084C87 | 48 89 54 24 10 | mov qword ptr ss:[rsp+10],rdx |
00007FF6DF084C8C | 4C 89 44 24 18 | mov qword ptr ss:[rsp+18],r8 |
00007FF6DF084C91 | 4C 89 4C 24 20 | mov qword ptr ss:[rsp+20],r9 |
00007FF6DF084C96 | 48 55 | push rbp |
00007FF6DF084C98 | 48 81 EC 80 00 00 00 | sub rsp,80 |
00007FF6DF084C9F | 48 8D AC 24 80 00 00 00 | lea rbp,qword ptr ss:[rsp+80] |
00007FF6DF084CA7 | 48 8B 45 30 | mov rax,qword ptr ss:[rbp+30] | ;test1.asm:78
00007FF6DF084CAB | 48 89 45 F8 | mov qword ptr ss:[rbp-8],rax | ;test1.asm:79
00007FF6DF084CAF | 48 8B 4D 10 | mov rcx,qword ptr ss:[rbp+10] | ;test1.asm:81
00007FF6DF084CB3 | 48 8B 55 18 | mov rdx,qword ptr ss:[rbp+18] |
00007FF6DF084CB7 | 4C 8B 45 20 | mov r8,qword ptr ss:[rbp+20] |
00007FF6DF084CBB | 4C 8B 4D 28 | mov r9,qword ptr ss:[rbp+28] |
00007FF6DF084CBF | 48 8B 45 30 | mov rax,qword ptr ss:[rbp+30] |
00007FF6DF084CC3 | 48 89 44 24 20 | mov qword ptr ss:[rsp+20],rax |
00007FF6DF084CC8 | 48 8B 45 38 | mov rax,qword ptr ss:[rbp+38] |
00007FF6DF084CCC | 48 89 44 24 28 | mov qword ptr ss:[rsp+28],rax |
00007FF6DF084CD1 | 48 8B 45 40 | mov rax,qword ptr ss:[rbp+40] |
00007FF6DF084CD5 | 48 89 44 24 30 | mov qword ptr ss:[rsp+30],rax |
00007FF6DF084CDA | 48 8B 45 48 | mov rax,qword ptr ss:[rbp+48] |
00007FF6DF084CDE | 48 89 44 24 38 | mov qword ptr ss:[rsp+38],rax |
00007FF6DF084CE3 | 48 8B 45 50 | mov rax,qword ptr ss:[rbp+50] |
00007FF6DF084CE7 | 48 89 44 24 40 | mov qword ptr ss:[rsp+40],rax |
00007FF6DF084CEC | 48 8B 45 58 | mov rax,qword ptr ss:[rbp+58] |
00007FF6DF084CF0 | 48 89 44 24 48 | mov qword ptr ss:[rsp+48],rax |
00007FF6DF084CF5 | 48 8B 45 60 | mov rax,qword ptr ss:[rbp+60] |
00007FF6DF084CF9 | 48 89 44 24 50 | mov qword ptr ss:[rsp+50],rax |
00007FF6DF084CFE | 48 8B 45 68 | mov rax,qword ptr ss:[rbp+68] |
00007FF6DF084D02 | 48 89 44 24 58 | mov qword ptr ss:[rsp+58],rax |
00007FF6DF084D07 | 48 8B 45 70 | mov rax,qword ptr ss:[rbp+70] |
00007FF6DF084D0B | 48 89 44 24 60 | mov qword ptr ss:[rsp+60],rax |
00007FF6DF084D10 | 48 8B 45 78 | mov rax,qword ptr ss:[rbp+78] |
00007FF6DF084D14 | 48 89 44 24 68 | mov qword ptr ss:[rsp+68],rax |
00007FF6DF084D19 | E8 56 FF FF FF | call <test1.test4A> |
00007FF6DF084D1E | 48 8D 65 00 | lea rsp,qword ptr ss:[rbp] | ;test1.asm:82
00007FF6DF084D22 | 5D | pop rbp |
00007FF6DF084D23 | C3 | ret |
but this uses UASM's invoke.
Quote from: HSE on August 19, 2023, 06:24:22 AM
I'm building (with a macro) a procedure with 14 arguments and 2 local variables, but 40h is written instead of 80h.
It should be this (or so):
00007FF8721237EE | 48 89 4C 24 08 | mov qword ptr ss:[rsp+8],rcx | ;ResGuard.asm:819
00007FF8721237F3 | 48 89 54 24 10 | mov qword ptr ss:[rsp+10],rdx | ;[rsp+10]:RichEditANSIWndProc
00007FF8721237F8 | 4C 89 44 24 18 | mov qword ptr ss:[rsp+18],r8 |
00007FF8721237FD | 4C 89 4C 24 20 | mov qword ptr ss:[rsp+20],r9 |
00007FF872123802 | 48 83 EC 80 | sub rsp,40 |
00007FF872123806 | 48 8D 6C 24 80 | lea rbp,qword ptr ss:[rsp+80] |
Perhaps you should be a bit more explicit. To me, this looks like the beginning of the called procedure, and sub rsp, 40 is more than enough because your arguments all sit above the stack pointer, not below.
Quote from: jj2007 on August 19, 2023, 08:08:57 AM
To me, this looks like the beginning of the called procedure, and sub rsp, 40 is more than enough because your arguments all sit above the stack pointer, not below.
But the saved rbp is stored above that. With "sub rsp, 40h", the arguments of a new call overwrite the saved rbp (and eventually the registers saved by USES).
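The overlap described here can be checked with a little arithmetic. The following is only a sketch in C, assuming the frame layout seen in the listings (after "sub rsp, frame" and "lea rbp, [rsp+frame]" the locals sit just below rbp, the saved rbp sits at [rbp], and an invoke writes its arguments upward from [rsp]). The function name and its parameters are made up for illustration and are not part of UASM.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of UASM.
   Assumed layout: locals occupy [rsp+frame-locals, rsp+frame) and the
   saved rbp follows at [rsp+frame]. An invoke stores its arguments at
   [rsp], [rsp+8], ...; the first four slots are the shadow space. */
bool outgoing_args_fit(size_t frame, size_t locals, size_t n_args)
{
    if (locals > frame)
        return false;                          /* frame cannot even hold the locals */
    size_t slots = n_args < 4 ? 4 : n_args;    /* shadow space is always 4 slots */
    size_t args_end = slots * 8;               /* first byte past the argument area */
    size_t locals_start = frame - locals;      /* offset of the lowest local from rsp */
    return args_end <= locals_start;           /* true when nothing is overwritten */
}
```

With frame = 40h and 10h bytes of locals, five outgoing arguments fit (28h <= 30h). Forwarding 14 arguments reaches 70h, which runs over the locals and the saved rbp at 40h; with frame = 80h the same 14 arguments just fit.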
Ok. I figured out what UASM is (perhaps) doing:
40h = local variables + maximum number of arguments used by any invoke inside the procedure.
With inspiration in rrr314159's NVK macros (https://masm32.com/board/index.php?topic=3988.0), I inserted:
locals = 2 *@WordSize           ; bytes for the two local variables
maxargins = 5                   ; most arguments passed by any invoke inside the procedure
if ArgCount
    ; save the register arguments in their home (shadow) slots
    if ArgCount ge 1
        mov qword ptr [rsp+8],rcx
    endif
    if ArgCount ge 2
        mov qword ptr [rsp+16],rdx
    endif
    if ArgCount ge 3
        mov qword ptr [rsp+24],r8
    endif
    if ArgCount ge 4
        mov qword ptr [rsp+32],r9
    endif
    push xbp
    IF ArgCount GT 4
        stackadjust = ArgCount
    ELSE
        stackadjust = 4         ; never less than the 4 shadow slots
    ENDIF
    if maxargins gt stackadjust
        stackadjust = maxargins
    endif
    stackadjust = ((stackadjust+1)/2)*2 ; round up to an even slot count (a multiple of 16 bytes)
    sub xsp, stackadjust*@WordSize+locals
    lea xbp, [xsp][stackadjust*@WordSize+locals]
endif
and
IF ArgCount GT 0
    lea xsp, qword ptr [xbp]
ENDIF
pop xbp
So far, this is by hand. I don't know how a prologue macro can access the maximum number of arguments used by any invoke inside the procedure.
Thanks, HSE
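The arithmetic of the hand-written prologue above can be replayed outside the assembler. This is a sketch in C that mirrors the stackadjust computation; the function name frame_bytes and the constant WORD_SIZE are illustrative stand-ins (WORD_SIZE for @WordSize in 64-bit code), not anything UASM provides.

```c
#include <stddef.h>

enum { WORD_SIZE = 8 };  /* stands in for @WordSize in 64-bit code */

/* Hypothetical C mirror of the prologue arithmetic.
   arg_count    : arguments of the procedure itself (ArgCount)
   max_arg_ins  : largest argument count of any invoke inside (maxargins)
   locals_bytes : bytes of local variables (locals) */
size_t frame_bytes(size_t arg_count, size_t max_arg_ins, size_t locals_bytes)
{
    size_t slots = arg_count > 4 ? arg_count : 4;  /* at least the 4 shadow slots */
    if (max_arg_ins > slots)
        slots = max_arg_ins;
    slots = ((slots + 1) / 2) * 2;                 /* even slot count, multiple of 16 bytes */
    return slots * WORD_SIZE + locals_bytes;       /* operand of "sub xsp, ..." */
}
```

For 14 arguments, maxargins = 5 and two qword locals this gives 14*8 + 16 = 80h, the value expected in the first post. For a procedure with four or fewer arguments and the same maxargins it gives 6*8 + 16 = 40h.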
Quote from: HSE on August 19, 2023, 08:29:21 AM
Quote from: jj2007 on August 19, 2023, 08:08:57 AM
To me, this looks like the beginning of the called procedure, and sub rsp, 40 is more than enough because your arguments all sit above the stack pointer, not below.
But the saved rbp is stored above that. With "sub rsp, 40h", the arguments of a new call overwrite the saved rbp (and eventually the registers saved by USES).
Again: the sub rsp, xx affects only the local variables, not the arguments. Perhaps you could post a complete example demonstrating where you believe there is a problem.
Quote from: jj2007 on August 20, 2023, 12:29:29 AM
Again: the sub rsp, xx affects only the local variables, not the arguments.
Not for UASM. Apparently UASM reserves in the prologue all the stack space the procedure will need for the calls it makes. It's an interesting optimization.
Quote from: jj2007 on August 20, 2023, 12:29:29 AM
Perhaps you could post a complete example demonstrating where you believe there is a problem.
:biggrin: :biggrin: When I have the complete example!! It's Resguard but in 64 bits: very funny :thumbsup:
It's not a theoretical discussion, just look at your own programs using UASM with the same options. The reference function is CreateFont.
Hi HSE
Quote from: HSE on August 20, 2023, 01:33:15 AM
Resguard but in 64 bits
That's what you're working on!!! :thumbsup:
I will then also join in on that :biggrin:
Biterider
Quote from: Biterider on August 20, 2023, 03:13:21 AM
That's what you're working on!!! :thumbsup:
Not exactly work, it's more like a puzzle. Raymond's tridoku is too difficult for mere entertainment. At least if I complete the pieces, this can be useful. :biggrin:
Quote from: Biterider on August 20, 2023, 03:13:21 AM
I will then also join in on that :biggrin:
:thumbsup: Like the tricks 30-year-old games have for jumping levels!
I think I passed some levels, and I can still move some more pieces :biggrin:
Quote from: HSE on August 19, 2023, 11:42:27 PM
40h = local variables + maximum number of arguments used by any invoke inside the procedure.
...
I don't know how a prologue macro can access the maximum number of arguments used by any invoke inside the procedure.
It's a bit more complicated:
- on entry to each proc, i.e. in the prolog macro, create a maxargcounter and set it to zero
- set maxloc to the required LOCALs space
- inside the proc, your invoke macro counts (repeatedly for each invoke) the args passed, and updates maxargcounter
- your epilog macro checks if the default sub rsp, nn is sufficient for maxargcounter
- if not, it throws an error:
## line 5: insert <MAXARGS=16> behind Level2 proc ##
- you insert that, see below, and Level2 takes care of a generous sub rsp, 8*MAXARGS+maxloc
Level3 proc a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11, a12, a13, a14, a15, a16
Local loc1, loc2, loc3
nops 3
ret
Level3 endp
Level2 proc <MAXARGS=16> arg1
Local loc1
nops 2
jinvoke Level3, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
ret
Level2 endp
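The bookkeeping in the list above can be modelled in a few lines. This is only a sketch in C of the counting logic, not jj2007's macros: the struct, the three function names and the idea of passing the reserved byte count to the prolog step are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-procedure state; names are illustrative. */
struct proc_state {
    size_t max_arg_counter;  /* largest invoke seen so far in this proc */
    size_t max_loc;          /* bytes required by the LOCALs */
    size_t reserved;         /* bytes reserved by the prolog's "sub rsp, nn" */
};

/* Prolog step: reset the counter and record what was reserved. */
void on_prolog(struct proc_state *p, size_t max_loc, size_t reserved)
{
    p->max_arg_counter = 0;
    p->max_loc = max_loc;
    p->reserved = reserved;
}

/* Invoke step: remember the largest argument count seen. */
void on_invoke(struct proc_state *p, size_t n_args)
{
    if (n_args > p->max_arg_counter)
        p->max_arg_counter = n_args;
}

/* Epilog step: true when the reservation covers the locals plus the
   largest invoke; false corresponds to the "insert <MAXARGS=n>" error. */
bool on_epilog(const struct proc_state *p)
{
    return p->reserved >= 8 * p->max_arg_counter + p->max_loc;
}
```

In this model a procedure that reserved 40h and then invokes something with 16 arguments fails the epilog check, while one that reserved 8*16 + maxloc passes. The check can only run at the epilog because the prolog is expanded before any invoke has been seen, which is why the fix has to be fed back by hand as <MAXARGS=16>.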
Quote from: jj2007 on August 20, 2023, 07:30:11 AM
It's a bit more complicated:
Agree :thumbsup:
UASM makes very little available to the prologue macro:
sprintf(buffer, " (%s, 0%XH, 0%XH, 0%XH, <<%s>>, <%s>)",
CurrProc->sym.name, flags, info->parasize, info->localsize,
reglst, info->prologuearg ? info->prologuearg : "");
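To see what the prologue macro actually receives, the format string from the excerpt above can be run on its own. This is a sketch in C: the wrapper function format_prologue_args and every value passed to it are made-up samples, and only the format string is taken from the excerpt.

```c
#include <stdio.h>

/* Hypothetical wrapper around the format string quoted above.
   It builds the argument list handed to a user-defined prologue macro:
   procedure name, flags, parameter bytes, local bytes, register list
   and the optional user argument. */
int format_prologue_args(char *buffer, size_t size, const char *proc_name,
                         unsigned flags, unsigned parasize, unsigned localsize,
                         const char *reglst, const char *prologuearg)
{
    return snprintf(buffer, size, " (%s, 0%XH, 0%XH, 0%XH, <<%s>>, <%s>)",
                    proc_name, flags, parasize, localsize,
                    reglst, prologuearg ? prologuearg : "");
}
```

Called with the sample values "MyProc", 0, 70h, 10h, "rbx" and no user argument, it produces " (MyProc, 00H, 070H, 010H, <<rbx>>, <>)". None of the six fields says anything about the invokes inside the procedure, so the macro has no way to learn the largest outgoing argument count from them.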
Hi HSE
Quote from: HSE on August 20, 2023, 03:38:04 AM
I will then also join in on that :biggrin:
I have ported ResGuard to the current framework "C".
While creating a DLL, I discovered some problems in the toolchain, so I need to do a little tweaking.
I will start a new thread in the ObjAsm section of the forum as soon as I have the working 32-bit version.
The 64-bit version is another story. At the moment I can compile it, but the stack trace analysis is completely different from the previous 32-bit version. It will take a little time and effort to get it working. :wink2:
Biterider
Quote from: Biterider on August 20, 2023, 10:15:24 PM
The 64-bit version is another story.
:biggrin: :biggrin: Another player :thumbsup: