OPTION STACKBASE also broken in x64

Started by aw27, March 21, 2017, 07:58:08 PM

aw27

How to test:
ASM CODE:
;**********************************************************
option casemap:none
option frame:auto
OPTION STACKBASE:RSP

.code

sub1 proc private dest:ptr, src:ptr, val1 : qword, val2:qword
   mov dest, rcx
   mov src, rdx
   mov val1, r8
   mov val2, r9
   mov rax, qword ptr [rdx]
   add rax, val1
   add rax, val2
   mov qword ptr [rcx], rax

   ret
sub1 endp

getSum proc public dest:ptr, src:ptr, val1 : qword, val2:qword
   mov dest, rcx
   mov src, rdx
   mov val1, r8
   mov val2, r9
   INVOKE sub1, dest, src, val1, val2
   ret
getSum endp

end
;******************************************
called from a C++ program:
#include "stdafx.h"

#if defined (__cplusplus)
extern "C" {
#endif
   void getSum(size_t*dest, size_t*src, size_t val1, size_t val2);
#if defined (__cplusplus)
}
#endif

int main()
{
   size_t *src = new (size_t);
   size_t *dest = new (size_t);
   size_t val1 = 1;
   size_t val2 = 2;
   *src = 1000;

   getSum(dest, src, val1, val2);

   // %zu is the printf conversion for size_t; %d expects an int.
   printf("Result: %zu\n", *dest);

   return 0;
}

// How does getSum disassemble?

getSum:
000000013FA81825  mov         qword ptr [rsp+8],rcx 
000000013FA8182A  mov         qword ptr [rsp+10h],rdx 
000000013FA8182F  mov         qword ptr [rsp+18h],r8 
000000013FA81834  mov         qword ptr [rsp+20h],r9 
000000013FA81839  sub         rsp,20h 
000000013FA8183D  mov         rcx,qword ptr [rsp+8] 
000000013FA81842  mov         rdx,qword ptr [rsp+10h] 
000000013FA81847  mov         r8,qword ptr [rsp+18h] 
000000013FA8184C  mov         r9,qword ptr [rsp+20h] 
000000013FA81851  call        000000013FA81800 
000000013FA81856  add         rsp,20h 
000000013FA8185A  ret

Note also that the sub rsp,20h does not realign the stack pointer to a 16-byte boundary before the call!
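
For anyone following the listing, here is a rough sketch of the address arithmetic in C++. It is only a model, not assembler output. It assumes the usual Win64 entry state, where RSP is 8 (mod 16) because the caller's call pushed the return address; the names kEntryRsp, store_address, load_address and rsp_at_call are made up for illustration. It checks two things visible in the disassembly: the loads after sub rsp,20h read 20h bytes below the slots the stores wrote, and RSP is still misaligned at the inner call.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: RSP is tracked as a plain integer, no real stack is touched.
// Assumption: the caller had RSP 16-byte aligned at its CALL instruction, so
// the pushed return address leaves RSP at 8 (mod 16) on entry to getSum.
constexpr std::uint64_t kEntryRsp = 0x1000 - 8;  // arbitrary illustrative value

// Address written by `mov [rsp+8],rcx`, executed before `sub rsp,20h`.
constexpr std::uint64_t store_address(std::uint64_t entry_rsp) {
    return entry_rsp + 0x08;
}

// Address read by `mov rcx,[rsp+8]`, executed after `sub rsp,20h`.
constexpr std::uint64_t load_address(std::uint64_t entry_rsp) {
    return (entry_rsp - 0x20) + 0x08;
}

// RSP at the `call` instruction inside getSum.
constexpr std::uint64_t rsp_at_call(std::uint64_t entry_rsp) {
    return entry_rsp - 0x20;
}

// The load reads 20h bytes below the slot that was written.
static_assert(store_address(kEntryRsp) - load_address(kEntryRsp) == 0x20,
              "load does not hit the stored home slot");
// RSP is still 8 (mod 16) at the call, so it is not 16-byte aligned there.
static_assert(rsp_at_call(kEntryRsp) % 16 == 8, "RSP misaligned at the call");
```

In this model the loads would only hit the stored values if they used [rsp+28h], [rsp+30h] and so on after the sub.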

johnsa

Can you try this temporarily, to see whether it solves the issue and works for now until we fix it:

option frame:auto
option win64:11
option stackbase:rsp

aw27

Quote from: johnsa on March 21, 2017, 09:56:56 PM
Can you try this temporarily, to see whether it solves the issue and works for now until we fix it:

option frame:auto
option win64:11
option stackbase:rsp

Does it align the stack before the call according to the specification?  :eusa_naughty:

johnsa

option win64:11 aligns the stack to 16 yes

johnsa

Basically... I'd really like to discontinue ALL of the other options..

WIN64:11 + STACKBASE:RSP
should produce optimal prologue/epilogue in ALL cases, except when you want absolutely nothing to be done for you and you override the prologue/epilogue macros.. which is useful in some cases.
But its logic should remove the need for any of the other options, to be honest.

I have half a mind to just force all modes to this one. It also makes our lives much easier trying to manage one calling convention implementation instead of 16 permutations!

aw27

Quote from: johnsa on March 21, 2017, 10:41:45 PM
option win64:11 aligns the stack to 16 yes

No, it does not align the stack to 16, no.
On function entry the stack is not 16-byte aligned, and you are relying on the push rbp to align it, which does not happen when you use OPTION STACKBASE:RSP.
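
A small C++ sketch of that point, again only modelling the arithmetic and not what any assembler actually emits. The helper names (rsp_at_inner_call, aligned16, kEntry) are hypothetical, and the entry value of RSP is an arbitrary number that is 8 (mod 16), as it would be right after the caller's call.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: RSP at an inner CALL, given the entry RSP, whether the
// prologue pushes RBP (8 bytes), and how many bytes it then reserves with SUB.
constexpr std::uint64_t rsp_at_inner_call(std::uint64_t entry_rsp,
                                          bool push_rbp,
                                          std::uint64_t reserve) {
    return entry_rsp - (push_rbp ? 8 : 0) - reserve;
}

// True when the modelled RSP is on a 16-byte boundary.
constexpr bool aligned16(std::uint64_t rsp) {
    return rsp % 16 == 0;
}

// Arbitrary entry RSP, 8 (mod 16) because the return address was just pushed.
constexpr std::uint64_t kEntry = 0x2000 - 8;

// RBP-based frame: push rbp + sub rsp,20h happens to land on a 16-byte boundary.
static_assert(aligned16(rsp_at_inner_call(kEntry, true, 0x20)),
              "push rbp supplies the missing 8 bytes");
// RSP-based frame: no push, so sub rsp,20h alone leaves RSP misaligned.
static_assert(!aligned16(rsp_at_inner_call(kEntry, false, 0x20)),
              "20h alone is not enough");
// Without the push, the reservation would need an extra 8 bytes (28h).
static_assert(aligned16(rsp_at_inner_call(kEntry, false, 0x28)),
              "28h restores alignment");
```

So in this model an RSP-based frame has to add the 8 bytes of padding itself, because there is no push rbp to do it.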

And there is another bug I just found (see next message).


aw27

New bug:

option casemap:none
option frame:auto

.code

sub1 proc private dest:ptr, src:ptr, val1 : qword, val2:qword
   mov dest, rcx
   mov src, rdx
   mov val1, r8
   mov val2, r9
   mov rax, qword ptr [rdx]
   add rax, val1
   add rax, val2
   mov qword ptr [rcx], rax

   ret
sub1 endp

getSum proc public FRAME uses xmm6 dest:ptr, src:ptr, val1 : qword, val2:qword
   INVOKE sub1, rcx, rdx, r8, r9
   ret
getSum endp

end

;************************* Disassembles to:
getSum:
000000013F891824  push        rbp 
000000013F891825  mov         rbp,rsp 
000000013F891828  sub         rsp,10h 
000000013F89182C  sub         rsp,20h 
000000013F891830  call        000000013F891800 
000000013F891835  add         rsp,20h 
000000013F891839  vmovdqa     xmm6,xmmword ptr [rsp] 
000000013F89183E  add         rsp,10h 
000000013F891842  pop         rbp 
000000013F891843  ret 

It restores an unsaved XMM register.

aw27

Quote from: johnsa on March 21, 2017, 10:44:27 PM
Basically... I'd really like to discontinue ALL of the other options..

Just because you can't fix them does not mean they should be discontinued.   :shock:
Myself, I hate that option to automatically put the register contents in shadow space. For some functions I want to put the registers in shadow space, for others not. What is the sense of always putting register contents in shadow space?  :(
Assembly is not for lazy people who want everything done for them; those should use Visual Basic .NET. Sorry, VB.NET is not even for them; probably JavaScript or the like is better suited.

johnsa

Quote from: aw27 on March 22, 2017, 12:19:38 AM
Quote from: johnsa on March 21, 2017, 10:44:27 PM
Basically... I'd really like to discontinue ALL of the other options..

Just because you can't fix them does not mean they should be discontinued.   :shock:
Myself, I hate that option to automatically put the register contents in shadow space. For some functions I want to put the registers in shadow space, for others not. What is the sense of always putting register contents in shadow space?  :(
Assembly is not for lazy people who want everything done for them; those should use Visual Basic .NET. Sorry, VB.NET is not even for them; probably JavaScript or the like is better suited.

THAT is the whole point: OPTION WIN64:11 DOES NOT put registers into shadow space IF you don't refer to the arguments by name. If you use the registers (rcx, rdx, etc.), then no shadow space is reserved. It also automatically re-uses any available shadow space for locals, to reduce the total amount of stack allocated per function. There shouldn't be a case where win64:11, used properly, produces less efficient code than if you'd done it yourself, but without any of the pain.

johnsa

Quote from: aw27 on March 21, 2017, 11:58:44 PM
Quote from: johnsa on March 21, 2017, 10:41:45 PM
option win64:11 aligns the stack to 16 yes

No, it does not align the stack to 16, no  :eusa_naughty: :eusa_naughty: :eusa_naughty:
The stack is not aligned on function entry and you are relying on the push rbp to align it which does not happen when you use OPTION STACKBASE:RSP.

And there is another bug, I just found (see next message)

Can you provide an example of win64:11 not aligning the stack to 16 ?
I depend on that in well over 500k lines of production code so I'd be interested to see a case where this isn't so.

johnsa

Quote from: aw27 on March 22, 2017, 12:03:46 AM
New bug:


will check this one out now..

aw27

Quote from: johnsa on March 22, 2017, 12:36:46 AM
THAT is the whole point

If it is so good, develop it on its own merits, don't drag people away from other options because they are buggy and you can't fix them.  :eusa_boohoo:

aw27

Quote from: johnsa on March 22, 2017, 12:38:58 AM
Can you provide an example of win64:11 not aligning the stack to 16 ?
I depend on that in well over 500k lines of production code so I'd be interested to see a case where this isn't so.

Since you ask and I can't resist a challenge, take this one and enjoy another crash  :P:

option casemap:none
option frame:auto
OPTION STACKBASE:RSP
option win64:11

.code

sub1 proc private dest:ptr, src:ptr, val1 : qword, val2:qword
   mov dest, rcx
   mov src, rdx
   mov val1, r8
   mov val2, r9
   mov rax, qword ptr [rdx]
   add rax, val1
   add rax, val2
   mov qword ptr [rcx], rax

   ret
sub1 endp

getSum proc public FRAME uses xmm6 dest:ptr, src:ptr, val1 : qword, val2:qword
   INVOKE sub1, dest, rdx, r8, r9
   ret
getSum endp

end

Why does it crash after 500 K lines of production code?  :greensml: :greensml: :greensml: :greensml: :greensml:


johnsa

Did I say we can't fix them?

All I said is that this option is preferable in that it should do everything in an optimal way, so there should be no need to use the others, but they are still there if someone requires them.

I said ..

Can you try this temporarily, to see whether it solves the issue and works for now until we fix it:

option frame:auto
option win64:11
option stackbase:rsp

.. PS. The constant attitude and antagonism do not go down well with me. We try to help everyone out and get all issues resolved as quickly as possible, given this is a "hobby" for us, and I'd like to see any other dev team communicate as actively as we do on issues and ways to fix them, trying to keep the community involved.


johnsa

Quote from: aw27 on March 22, 2017, 01:12:19 AM
Quote from: johnsa on March 22, 2017, 12:38:58 AM
Can you provide an example of win64:11 not aligning the stack to 16 ?
I depend on that in well over 500k lines of production code so I'd be interested to see a case where this isn't so.

Since you ask and I can't resist a challenge, take this one and enjoy another crash  :P:


Why does it crash after 500 K lines of production code?  :greensml: :greensml: :greensml: :greensml: :greensml:

Will have this one fixed today.