How to test:
ASM CODE:
;**********************************************************
option casemap:none
option frame:auto
OPTION STACKBASE:RSP
.code
sub1 proc private dest:ptr, src:ptr, val1 : qword, val2:qword
mov dest, rcx
mov src, rdx
mov val1, r8
mov val2, r9
mov rax, qword ptr [rdx]
add rax, val1
add rax, val2
mov qword ptr [rcx], rax
ret
sub1 endp
getSum proc public dest:ptr, src:ptr, val1 : qword, val2:qword
mov dest, rcx
mov src, rdx
mov val1, r8
mov val2, r9
INVOKE sub1, dest, src, val1, val2
ret
getSum endp
end
;******************************************
called from a C++ program:
#include "stdafx.h"
#if defined (__cplusplus)
extern "C" {
#endif
void getSum(size_t*dest, size_t*src, size_t val1, size_t val2);
#if defined (__cplusplus)
}
#endif
int main()
{
size_t *src = new (size_t);
size_t *dest = new (size_t);
size_t val1 = 1;
size_t val2 = 2;
*src = 1000;
getSum(dest, src, val1, val2);
printf("Result: %zu\n", *dest);
return 0;
}
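As a sanity check on what the test should print, here is a minimal pure C++ stand-in for the two assembly routines. The names sub1_ref, getSum_ref and expected_result are made up for this sketch and are not part of the code above; it only assumes that sub1 stores *src + val1 + val2 into *dest, as the ASM does.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical reference version of sub1: *dest = *src + val1 + val2.
void sub1_ref(std::size_t* dest, const std::size_t* src,
              std::size_t val1, std::size_t val2) {
    *dest = *src + val1 + val2;
}

// Hypothetical reference version of getSum: forwards all four arguments.
void getSum_ref(std::size_t* dest, const std::size_t* src,
                std::size_t val1, std::size_t val2) {
    sub1_ref(dest, src, val1, val2);
}

// Same inputs as the test program: *src = 1000, val1 = 1, val2 = 2.
std::size_t expected_result() {
    std::size_t src = 1000;
    std::size_t dest = 0;
    getSum_ref(&dest, &src, 1, 2);
    assert(dest == 1003);
    return dest;
}
```

With the inputs from main() a correct getSum should therefore leave 1003 in *dest.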
// How does getSum disassemble?
getSum:
000000013FA81825 mov qword ptr [rsp+8],rcx
000000013FA8182A mov qword ptr [rsp+10h],rdx
000000013FA8182F mov qword ptr [rsp+18h],r8
000000013FA81834 mov qword ptr [rsp+20h],r9
000000013FA81839 sub rsp,20h
000000013FA8183D mov rcx,qword ptr [rsp+8]
000000013FA81842 mov rdx,qword ptr [rsp+10h]
000000013FA81847 mov r8,qword ptr [rsp+18h]
000000013FA8184C mov r9,qword ptr [rsp+20h]
000000013FA81851 call 000000013FA81800
000000013FA81856 add rsp,20h
000000013FA8185A ret
:icon13:
Note also that the stack pointer is not realigned to a 16-byte boundary by the sub rsp, 20h! :eusa_naughty:
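To make the alignment point concrete, here is a small sketch of the arithmetic. It assumes the usual Win64 rule that RSP is 16-byte aligned immediately before a call instruction, so the 8-byte return address leaves it at 8 mod 16 on entry to the callee. The start value 0x1000 and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// RSP after a `call` pushes the 8-byte return address.
constexpr std::uint64_t after_call(std::uint64_t rsp) { return rsp - 8; }

// RSP after `sub rsp, bytes`.
constexpr std::uint64_t after_sub(std::uint64_t rsp, std::uint64_t bytes) {
    return rsp - bytes;
}

constexpr bool aligned16(std::uint64_t rsp) { return rsp % 16 == 0; }

// Hypothetical caller RSP, 16-byte aligned just before `call getSum`.
constexpr std::uint64_t kCallerRsp = 0x1000;
constexpr std::uint64_t kEntryRsp = after_call(kCallerRsp);

static_assert(!aligned16(kEntryRsp), "entry RSP is 8 mod 16");
static_assert(!aligned16(after_sub(kEntryRsp, 0x20)),
              "sub rsp,20h keeps the misalignment");
static_assert(aligned16(after_sub(kEntryRsp, 0x28)),
              "sub rsp,28h restores 16-byte alignment");

// Runtime version of the same checks.
void check_alignment_model() {
    assert(kEntryRsp % 16 == 8);
    assert(after_sub(kEntryRsp, 0x20) % 16 == 8);
    assert(after_sub(kEntryRsp, 0x28) % 16 == 0);
}
```

Under that assumption, the call to sub1 in the listing above is made with RSP at 8 mod 16.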
Can you try this (temporarily to see if this solves the issue and works for now until we fix) :
option frame:auto
option win64:11
option stackbase:rsp
Quote from: johnsa on March 21, 2017, 09:56:56 PM
Can you try this (temporarily to see if this solves the issue and works for now until we fix) :
option frame:auto
option win64:11
option stackbase:rsp
Does it align the stack before the call according to the specification? :eusa_naughty:
option win64:11 aligns the stack to 16 yes
Basically... I'd really like to discontinue ALL of the other options..
WIN64:11 + STACKBASE:RSP
should produce optimal prologue/epilogue in ALL cases, except when you want absolutely nothing to be done for you and you override the prologue/epilogue macros.. which is useful in some cases.
But its logic should invalidate the need for any of the other options to be honest.
I have half a mind to just force all modes to this one. It also makes our lives much easier trying to manage one calling convention implementation instead of 16 permutations!
Quote from: johnsa on March 21, 2017, 10:41:45 PM
option win64:11 aligns the stack to 16 yes
No, it does not align the stack to 16, no :eusa_naughty: :eusa_naughty: :eusa_naughty:
The stack is not aligned on function entry and you are relying on the push rbp to align it which does not happen when you use OPTION STACKBASE:RSP.
And there is another bug, I just found (see next message)
New bug:
option casemap:none
option frame:auto
.code
sub1 proc private dest:ptr, src:ptr, val1 : qword, val2:qword
mov dest, rcx
mov src, rdx
mov val1, r8
mov val2, r9
mov rax, qword ptr [rdx]
add rax, val1
add rax, val2
mov qword ptr [rcx], rax
ret
sub1 endp
getSum proc public FRAME uses xmm6 dest:ptr, src:ptr, val1 : qword, val2:qword
INVOKE sub1, rcx, rdx, r8, r9
ret
getSum endp
end
;************************* Decompiles to:
getSum:
000000013F891824 push rbp
000000013F891825 mov rbp,rsp
000000013F891828 sub rsp,10h
000000013F89182C sub rsp,20h
000000013F891830 call 000000013F891800
000000013F891835 add rsp,20h
000000013F891839 vmovdqa xmm6,xmmword ptr [rsp]
000000013F89183E add rsp,10h
000000013F891842 pop rbp
000000013F891843 ret
It restores an unsaved XMM register :icon_redface:
Quote from: johnsa on March 21, 2017, 10:44:27 PM
Basically... I'd really like to discontinue ALL of the other options..
Just because you can't fix them does not mean they should be discontinued. :shock:
Myself, I hate that option to automatically put the registers contents in shadow space. For some functions I want to put the registers in shadow space for others not. What is the sense of always put registers contents in shadow space? :(
Assembly is not for lazy people that want all to be done for them, those should use Visual Basic .Net . Sorry, VB Net is not even for them, probably Javascript or the likes is better suited.
Quote from: aw27 on March 22, 2017, 12:19:38 AM
Quote from: johnsa on March 21, 2017, 10:44:27 PM
Basically... I'd really like to discontinue ALL of the other options..
Just because you can't fix them does not mean they should be discontinued. :shock:
Myself, I hate that option to automatically put the registers contents in shadow space. For some functions I want to put the registers in shadow space for others not. What is the sense of always put registers contents in shadow space? :(
Assembly is not for lazy people that want all to be done for them, those should use Visual Basic .Net . Sorry, VB Net is not even for them, probably Javascript or the likes is better suited.
THAT is the whole point, OPTION WIN64:11 DOES NOT put registers into shadow space IF you don't refer to the argument by name. If you use the registers, rcx, rdx .. etc then no shadow space is reserved. It then also automatically re-uses any available shadow space for locals to reduce the total amount of stack that is allocated per function. There shouldn't be a case where win64:11 when used properly would produce any less efficient code than if you'd done it yourself.. but without any of the pain.
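For readers following along, the shadow (home) slots being discussed can be sketched as plain offset arithmetic. This assumes the standard Win64 layout, where [rsp] at entry holds the return address and the four register arguments have home slots directly above it. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Offset from RSP at function entry of the home slot for register
// argument `index` (0 = RCX, 1 = RDX, 2 = R8, 3 = R9).
constexpr std::uint64_t home_offset_at_entry(std::uint64_t index) {
    return 8 + 8 * index;
}

// Offset of the same slot after the prologue ran `sub rsp, frame_bytes`
// (RSP-based frame, no pushes).
constexpr std::uint64_t home_offset_in_body(std::uint64_t index,
                                            std::uint64_t frame_bytes) {
    return frame_bytes + home_offset_at_entry(index);
}

static_assert(home_offset_at_entry(0) == 0x08, "RCX home is [rsp+8] at entry");
static_assert(home_offset_at_entry(3) == 0x20, "R9 home is [rsp+20h] at entry");
static_assert(home_offset_in_body(0, 0x20) == 0x28,
              "after sub rsp,20h the RCX home is [rsp+28h]");

// Runtime version of the same checks.
void check_home_slots() {
    assert(home_offset_at_entry(1) == 0x10);
    assert(home_offset_at_entry(2) == 0x18);
    assert(home_offset_in_body(3, 0x20) == 0x40);
}
```

By this model, the first getSum listing in the thread reloads RCX from [rsp+8] after sub rsp,20h, while the value it stored at entry would by then sit at [rsp+28h].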
Quote from: aw27 on March 21, 2017, 11:58:44 PM
Quote from: johnsa on March 21, 2017, 10:41:45 PM
option win64:11 aligns the stack to 16 yes
No, it does not align the stack to 16, no :eusa_naughty: :eusa_naughty: :eusa_naughty:
The stack is not aligned on function entry and you are relying on the push rbp to align it which does not happen when you use OPTION STACKBASE:RSP.
And there is another bug, I just found (see next message)
Can you provide an example of win64:11 not aligning the stack to 16 ?
I depend on that in well over 500k lines of production code so I'd be interested to see a case where this isn't so.
Quote from: aw27 on March 22, 2017, 12:03:46 AM
New bug:
It restores an unsaved XMM register :icon_redface:
will check this one out now..
Quote from: johnsa on March 22, 2017, 12:36:46 AM
THAT is the whole point
If it is so good, develop it on its own merits, don't drag people away from other options because they are buggy and you can't fix them. :eusa_boohoo:
Quote from: johnsa on March 22, 2017, 12:38:58 AM
Can you provide an example of win64:11 not aligning the stack to 16 ?
I depend on that in well over 500k lines of production code so I'd be interested to see a case where this isn't so.
Since you ask and I can't resist a challenge, take this one and enjoy another crash :P:
option casemap:none
option frame:auto
OPTION STACKBASE:RSP
option win64:11
.code
sub1 proc private dest:ptr, src:ptr, val1 : qword, val2:qword
mov dest, rcx
mov src, rdx
mov val1, r8
mov val2, r9
mov rax, qword ptr [rdx]
add rax, val1
add rax, val2
mov qword ptr [rcx], rax
ret
sub1 endp
getSum proc public FRAME uses xmm6 dest:ptr, src:ptr, val1 : qword, val2:qword
INVOKE sub1, dest, rdx, r8, r9
ret
getSum endp
end
Why does it crash after 500 K lines of production code? :greensml: :greensml: :greensml: :greensml: :greensml:
Did I say we can't fix them ?
All I said is that option is preferable in that it should do everything in a optimal way, so there should be no need to use the others, but they are still there if someone requires it.
I said ..
Can you try this (temporarily to see if this solves the issue and works for now until we fix) :
option frame:auto
option win64:11
option stackbase:rsp
.. PS. The constant attitude and antagonism do not go down well with me.. We try to help everyone out and get all issues resolved as quickly as possible, given this is a "hobby" for us, and I'd like to see any other dev teams communicate as actively as we do on issues and ways to fix them, trying to keep the community involved.
Quote from: aw27 on March 22, 2017, 01:12:19 AM
Quote from: johnsa on March 22, 2017, 12:38:58 AM
Can you provide an example of win64:11 not aligning the stack to 16 ?
I depend on that in well over 500k lines of production code so I'd be interested to see a case where this isn't so.
Since you ask and I can't resist a challenge, take this one and enjoy another crash :P:
Why does it crash after 500 K lines of production code? :greensml: :greensml: :greensml: :greensml: :greensml:
Will have this one fixed today.
Quote from: johnsa on March 22, 2017, 01:16:42 AM
.. PS. The constant attitude and being antagonistic doesn't not go down well with me.
I noticed you are doing your best. If I had not hopes on you I would have already left for better. Be patient. :t
BTW, I gave you an example where
option frame:auto
option win64:11
option stackbase:rsp
does not work.
Yep, got it.. that uses xmm6 is throwing it out.
We'll get this fixed today along with the ESP stackbase issue.
What we'll do is put everything so-far into a 2.22 beta release and share a link, then we can all re-test these issues as we go.
Quote from: johnsa on March 22, 2017, 01:28:44 AM
What we'll do is put everything so-far into a 2.22 beta release and share a link, then we can all re-test these issues as we go.
Excellent idea! :t
aw27, your 64 bit example is a mirror of your programming skills. If you don't use FRAME it is your responsibility to align the stack and allocate shadow space for parameters. So, here is your program, fixed so that it now does what it was supposed to do:
Quote
option casemap:none
option frame:auto
OPTION STACKBASE:RSP
option win64:11
getSum proto dest:ptr, src:ptr, val1 : qword, val2:qword
sub1 proto dest:ptr, src:ptr, val1 : qword, val2:qword
.data
avar dq 40
bvar dq 50
.code
start:
sub rsp, 8+32 ;8 to align stack, 32 = 4*8 for shadow space
invoke getSum, ADDR avar, ADDR bvar, 20, 30
add rsp,8+32
ret
sub1 proc private dest:ptr, src:ptr, val1 : qword, val2:qword
mov dest, rcx
mov src, rdx
mov val1, r8
mov val2, r9
mov rax, qword ptr [rdx]
add rax, val1
add rax, val2
mov qword ptr [rcx], rax
ret
sub1 endp
getSum proc public FRAME uses xmm6 dest:ptr, src:ptr, val1 : qword, val2:qword
INVOKE sub1, dest, rdx, r8, r9
ret
getSum endp
end start
Quote from: habran on March 22, 2017, 08:21:20 AM
aw27, your 64 bit example is a mirror of your programming skills.
Habran, after all this time you have still not figured out that this is to be compiled as an object file to be called from a high level language. And insist on the same argument over and over insulting the intelligence of the others in the meantime. :lol:
I will tell you a secret, high-level languages know very well how to align the stack before a call, you try to do the same.
Hi level languages do know but you have no idea whatsoever!
If someone is insulting intelligence here, I can assure it is not me!
If you don't like Hjwasm why don't you use ml64?
Hi level language called your asm source with stack misaligned to 8 but next is your responsibility to provide shadow space and align the stack before you call some function and restore the stack on exit because you did not use FRAME!
I'll not waste my time in the future with such an arrogant amateur and ungrateful individuals like you.
If I could I would ban you for using HJWasm!
Quote from: habran on March 22, 2017, 08:50:08 AM
Hi level languages do know but you have no idea whatsoever!
If someone is insulting intelligence here, I can assure it is not me!
If you don't like Hjwasm why don't you use ml64?
I have not seen any evidence so far that you know how to program. You are just unable to cope with any issue. :badgrin:
However, I have seen evidence that you have no f***en idea!
Quote from: habran on March 22, 2017, 08:58:21 AM
Hi level language called your asm source with stack misaligned to 8 but next is your responsibility to provide shadow space
I will not waste more time, you are in another universe. :badgrin:
Quote from: habran on March 22, 2017, 09:00:43 AM
If I could I would ban you for using HJWasm!
I am trying to help make HJWASM better, so far it is still more unreliable than JWASM.
Hasta la vista babe :badgrin:
Quote from: habran on March 22, 2017, 09:12:51 AM
Hasta la vista babe :badgrin:
You should moderate whatever you are sniffing :badgrin:
John,
> Basically... I'd really like to discontinue ALL of the other options..
Just do it, it will save you a mountain of grief and reduce the number of complaints by the illiterate. I made this decision early with ML64 and provide 2 options: a stack frame that writes the appropriate registers to the shadow space and the rest of the arguments to the correct stack addresses, and, for the brave or those who actually know what they are doing, no stack frame at all.
Anything much in between is a complicated mess full of potential mistakes, unnecessary clutter and a documentation nightmare. Free of the crappy end, it allows you to concentrate on the things that make the assembler better.
Quote from: habran on March 22, 2017, 08:21:20 AM
aw27, your 64 bit example is a mirror of your programming skills.
Quote from: aw27 on March 22, 2017, 09:01:04 AM
I have not seen any evidence so far that you know how to program. You are just unable to cope with any issue. :badgrin:
Folks,
You are both lightyears ahead of the average programmer (same for John & Steve, of course). You are working in an extremely tricky language, in an area that is so far badly explored. It is understandable that in certain moments the stress takes over and lets you write such things, but don't forget that you have a common goal: The best assembler in 64-bit land. You are damn close to that goal, don't spoil the party :icon14:
P.S.: I've great sympathy for Hutch' proposal to reduce the options, or at least to make them secret and officially undocumented. Frameless is pretty useless IMHO, no need to save ebp, as you have a bunch of new registers in 64-bit land. And it was never significantly faster.
HI hutch--,
now you are talking... thank you for confirming our thoughts :t
We'll do exactly that 8)
Hi jj2007 :biggrin:
Thank you for the flowers :bgrin:
Long time ago I've learned in my life that to give choice to people is a big mistake, because they will always choose the worst option,
so, you do that only if you don't care about them.
So, we will probably give 3 options. My proposal is:
win64:11 ;we will force it and you will not need to declare it
OPTION STACKBASE ;this one will go together with x64
for 32 bit we will leave that choice of OPTION STACKBASE
and if not declared it will be normal 32 bit compatible with masm
If someone wants to do it on their own, they can still use:
OPTION PROLOGUE:NONE
OPTION EPILOGUE:NONE
I'll just see with Johnsa if he agrees about it.
Quote from: habran on March 22, 2017, 02:08:40 PM
Long time ago I've learned in my life that to give choice to people is a big mistake, because they will always choose the worst option,
Rush forwarding was never a solution for anything.
The truth is, your win64:11 is broken as well as I demonstrated easily. While I recognize Johnsa is trying to save the ship you are doing exactly the inverse. :(
aw27,
Just keep in mind that these guys are doing a massive amount of work getting this assembler up and going and on the way through there will be bits and pieces that need to be fixed. Bug reports are useful to the extent that it helps in the development but that does not translate into telling them what to do. Design decisions are the choice of the authors, not the testers and while its fair to make suggestions, the final decisions must be left to them.
Quote from: hutch-- on March 22, 2017, 07:02:19 PM
aw27,
the final decisions must be left to them.
Always! And you know as well that products done against the market expectancies go down the toilet.
:biggrin:
> Always! And you know as well that products done against the market expectancies go down the toilet.
The simple solution is to write your own.
Quote from: hutch-- on March 22, 2017, 08:42:39 PM
:biggrin:
> Always! And you know as well that products done against the market expectancies go down the toilet.
The simple solution is to write your own.
Things don't work like that in real life.
Hi,
After investigating this one I can confirm that Visual (C++) is NOT aligning the stack to 16.
I've setup test C++ and ASM project and I've saved a screen-shot here:
www.terraspace.co.uk/stack.png (http://www.terraspace.co.uk/stack.png)
C++ will align RSP to 16 in its own prologue to keep locals etc. aligned, but it doesn't do this up-front accounting for the call's 8-byte return address, hence the wrong alignment once into the ASM proc.
I think it's the same with GCC. It's the library functions (particularly the WinAPI) that require stack alignment, not the main or the entire program. You should have known this long ago.
Also you guys should take into account the linker's own requirements. Some linkers do align the stack for your code while others don't.
I can confirm that using FRAME (as intended) does counter the mis-alignment by 8 and ensures the stack is aligned to 16 on entry.
I've linked to the C++ project with assembler module, simply decorate the PROCs with "FRAME" and everything will be good to go! :)
http://www.terraspace.co.uk/testapp.zip (http://www.terraspace.co.uk/testapp.zip)
option casemap : none
option frame : auto
option win64: 11
OPTION STACKBASE : RSP
.code
sub1 proc private frame dest : ptr, src : ptr, val1 : qword, val2 : qword
mov dest, rcx
mov src, rdx
mov val1, r8
mov val2, r9
mov rax, qword ptr[rdx]
add rax, val1
add rax, val2
mov qword ptr[rcx], rax
ret
sub1 endp
getSum proc public frame dest:ptr, src : ptr, val1 : qword, val2 : qword
mov dest, rcx
mov src, rdx
mov val1, r8
mov val2, r9
INVOKE sub1, dest, src, val1, val2
ret
getSum endp
end
johnsa
It's good to know that everything is turning out just fine for you guys.
Keep up the good work and get some sleep.
Thanks.. :) I think 2.22 has some great fixes and new features!
Time to take a nap/break though ;)
Quote from: johnsa on March 23, 2017, 10:00:22 PM
After investigating this one I can confirm that Visual (C++) is NOT aligning the stack to 16.
Of course not and does not have to. What is required is to align the stack "before" the call is made. How is it possible to have simultaneously the stack aligned before the call and immediately after the call?
I give up. It is too much nonsense for me.
Just for fun :biggrin:
option casemap:none
;option frame:auto
;OPTION STACKBASE:RSP
;option win64:11
.code
sub1 proc dest:ptr, src:ptr, val1 : qword, val2:qword
mov dest, rcx
mov src, rdx
mov val1, r8
mov val2, r9
mov rax, qword ptr [rdx]
add rax, val1
add rax, val2
mov qword ptr [rcx], rax
ret
sub1 endp
end
ml64.exe
sub1 PROC
push rbp ; 0000 _ 55
mov rbp, rsp ; 0001 _ 48: 8B. EC
mov qword ptr [rbp+10H], rcx ; 0004 _ 48: 89. 4D, 10
mov qword ptr [rbp+18H], rdx ; 0008 _ 48: 89. 55, 18
mov qword ptr [rbp+20H], r8 ; 000C _ 4C: 89. 45, 20
mov qword ptr [rbp+28H], r9 ; 0010 _ 4C: 89. 4D, 28
mov rax, qword ptr [rdx] ; 0014 _ 48: 8B. 02
add rax, qword ptr [rbp+20H] ; 0017 _ 48: 03. 45, 20
add rax, qword ptr [rbp+28H] ; 001B _ 48: 03. 45, 28
mov qword ptr [rcx], rax ; 001F _ 48: 89. 01
leave ; 0022 _ C9
ret ; 0023 _ C3
sub1 ENDP
HJWasm64.exe
sub1 PROC
push rbp ; 0000 _ 55
mov rbp, rsp ; 0001 _ 48: 8B. EC
mov qword ptr [rbp+10H], rcx ; 0004 _ 48: 89. 4D, 10
mov qword ptr [rbp+18H], rdx ; 0008 _ 48: 89. 55, 18
mov qword ptr [rbp+20H], r8 ; 000C _ 4C: 89. 45, 20
mov qword ptr [rbp+28H], r9 ; 0010 _ 4C: 89. 4D, 28
mov rax, qword ptr [rdx] ; 0014 _ 48: 8B. 02
add rax, qword ptr [rbp+20H] ; 0017 _ 48: 03. 45, 20
add rax, qword ptr [rbp+28H] ; 001B _ 48: 03. 45, 28
mov qword ptr [rcx], rax ; 001F _ 48: 89. 01
leave ; 0022 _ C9
ret ; 0023 _ C3
sub1 ENDP
option casemap:none
option frame:auto
OPTION STACKBASE:RSP
option win64:11
.code
sub1 proc frame dest:ptr, src:ptr, val1 : qword, val2:qword
; mov dest, rcx
; mov src, rdx
; mov val1, r8
; mov val2, r9
mov rax, qword ptr [rdx]
add rax, val1
add rax, val2
mov qword ptr [rcx], rax
ret
sub1 endp
end
HJWasm64.exe
sub1 PROC
mov qword ptr [rsp+18H], r8 ; 0000 _ 4C: 89. 44 24, 18
mov qword ptr [rsp+20H], r9 ; 0005 _ 4C: 89. 4C 24, 20
mov rax, qword ptr [rdx] ; 000A _ 48: 8B. 02
add rax, qword ptr [rsp+18H] ; 000D _ 48: 03. 44 24, 18
add rax, qword ptr [rsp+20H] ; 0012 _ 48: 03. 44 24, 20
mov qword ptr [rcx], rax ; 0017 _ 48: 89. 01
ret ; 001A _ C3
sub1 ENDP
In C
void sub1C(long long *dest, long long *scr, long long val1, long long val2)
{
register long long val3 = val1;
val3 += val2;
*dest = val3;
}
cl.exe
sub1C LABEL NEAR
mov qword ptr [rsp+20H], r9 ; 0000 _ 4C: 89. 4C 24, 20
mov qword ptr [rsp+18H], r8 ; 0005 _ 4C: 89. 44 24, 18
mov qword ptr [rsp+10H], rdx ; 000A _ 48: 89. 54 24, 10
mov qword ptr [rsp+8H], rcx ; 000F _ 48: 89. 4C 24, 08
sub rsp, 24 ; 0014 _ 48: 83. EC, 18
mov rax, qword ptr [rsp+30H] ; 0018 _ 48: 8B. 44 24, 30
mov qword ptr [rsp], rax ; 001D _ 48: 89. 04 24
mov rax, qword ptr [rsp+38H] ; 0021 _ 48: 8B. 44 24, 38
mov rcx, qword ptr [rsp] ; 0026 _ 48: 8B. 0C 24
add rcx, rax ; 002A _ 48: 03. C8
mov rax, rcx ; 002D _ 48: 8B. C1
mov qword ptr [rsp], rax ; 0030 _ 48: 89. 04 24
mov rax, qword ptr [rsp+20H] ; 0034 _ 48: 8B. 44 24, 20
mov rcx, qword ptr [rsp] ; 0039 _ 48: 8B. 0C 24
mov qword ptr [rax], rcx ; 003D _ 48: 89. 08
add rsp, 24 ; 0040 _ 48: 83. C4, 18
ret ; 0044 _ C3
cl.exe -Ox
sub1C PROC
lea rax, ptr [r8+r9] ; 0000 _ 4B: 8D. 04 08
mov qword ptr [rcx], rax ; 0004 _ 48: 89. 01
ret ; 0007 _ C3
sub1C ENDP
pocc.exe
sub1C PROC
mov qword ptr [rsp+8H], rcx ; 0000 _ 48: 89. 4C 24, 08
mov qword ptr [rsp+18H], r8 ; 0005 _ 4C: 89. 44 24, 18
mov qword ptr [rsp+20H], r9 ; 000A _ 4C: 89. 4C 24, 20
mov rax, qword ptr [rsp+18H] ; 000F _ 48: 8B. 44 24, 18
add rax, qword ptr [rsp+20H] ; 0014 _ 48: 03. 44 24, 20
mov rdx, qword ptr [rsp+8H] ; 0019 _ 48: 8B. 54 24, 08
mov qword ptr [rdx], rax ; 001E _ 48: 89. 02
ret ; 0021 _ C3
sub1C ENDP
pocc.exe -Ot
sub1C PROC
add r8, r9 ; 0000 _ 4D: 01. C8
mov qword ptr [rcx], r8 ; 0003 _ 4C: 89. 01
ret ; 0006 _ C3
sub1C ENDP
option casemap:none
option frame:auto
OPTION STACKBASE:RSP
option win64:11
sub1 proto dest:ptr, src:ptr, val1 : qword, val2:qword
.code
getSum proc public frame dest:ptr, src:ptr, val1 : qword, val2:qword
INVOKE sub1, dest, rdx, r8, r9
ret
getSum endp
end
HJWasm64.exe
getSum PROC
mov qword ptr [rsp+8H], rcx ; 0000 _ 48: 89. 4C 24, 08
sub rsp, 40 ; 0005 _ 48: 83. EC, 28
mov rcx, qword ptr [rsp+30H] ; 0009 _ 48: 8B. 4C 24, 30
call sub1 ; 000E _ E8, 00000000(rel)
add rsp, 40 ; 0013 _ 48: 83. C4, 28
ret ; 0017 _ C3
getSum ENDP
int sub1(long long *dest, long long *scr, long long val1, long long val2);
int getSumC(long long *dest, long long *scr, long long val1, long long val2)
{
return sub1(dest,scr,val1,val2);
}
cl.exe
getSumC LABEL NEAR
mov qword ptr [rsp+20H], r9 ; 0000 _ 4C: 89. 4C 24, 20
mov qword ptr [rsp+18H], r8 ; 0005 _ 4C: 89. 44 24, 18
mov qword ptr [rsp+10H], rdx ; 000A _ 48: 89. 54 24, 10
mov qword ptr [rsp+8H], rcx ; 000F _ 48: 89. 4C 24, 08
sub rsp, 40 ; 0014 _ 48: 83. EC, 28
mov r9, qword ptr [rsp+48H] ; 0018 _ 4C: 8B. 4C 24, 48
mov r8, qword ptr [rsp+40H] ; 001D _ 4C: 8B. 44 24, 40
mov rdx, qword ptr [rsp+38H] ; 0022 _ 48: 8B. 54 24, 38
mov rcx, qword ptr [rsp+30H] ; 0027 _ 48: 8B. 4C 24, 30
call sub1 ; 002C _ E8, 00000000(rel)
add rsp, 40 ; 0031 _ 48: 83. C4, 28
ret ; 0035 _ C3
cl.exe -Ox
getSumC PROC
jmp sub1 ; 0000 _ E9, 00000000(rel)
getSumC ENDP
Quote from: aw27 on March 24, 2017, 03:32:03 PM
Quote from: johnsa on March 23, 2017, 10:00:22 PM
After investigating this one I can confirm that Visual (C++) is NOT aligning the stack to 16.
Of course not and does not have to. What is required is to align the stack "before" the call is made. How is it possible to have simultaneously the stack aligned before the call and immediately after the call?
I give up. It is too much nonsense for me.
Perhaps read all the posts before replying to one previously ? :)
It was simply a message stating the current point in the investigation. The following message described that using the FRAME option on the proc ensures the alignment in both cases and adjusts RSP accordingly.
So you get sub rsp,28h instead of sub rsp,20h.
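The 28h versus 20h difference can be reproduced with a few lines of arithmetic. This is only a sketch of the rule, not taken from HJWasm's source: it assumes 32 bytes of shadow space for callees, RSP at 8 mod 16 on entry, and 8 bytes per pushed register. The names frame_adjust and round_up16 are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kShadowSpace = 32;  // home slots for RCX, RDX, R8, R9

constexpr std::uint64_t round_up16(std::uint64_t n) {
    return (n + 15) & ~std::uint64_t{15};
}

// Bytes to subtract from RSP in the prologue so that RSP is 16-byte
// aligned before any nested call. `pushes` counts 8-byte register pushes
// (for example 1 for `push rbp`, 0 for a pure RSP-based frame).
constexpr std::uint64_t frame_adjust(std::uint64_t local_bytes,
                                     std::uint64_t pushes) {
    // The return address plus the pushes leave RSP off by (8 + 8*pushes) mod 16.
    return round_up16(kShadowSpace + local_bytes) + (8 + 8 * pushes) % 16;
}

static_assert(frame_adjust(0, 0) == 0x28, "RSP-based frame: sub rsp,28h");
static_assert(frame_adjust(0, 1) == 0x20, "with push rbp: sub rsp,20h");
static_assert(frame_adjust(16, 1) == 0x30, "push rbp plus one XMM save slot");

// Runtime version of the same checks.
void check_frame_adjust() {
    assert(frame_adjust(0, 0) == 40);
    assert(frame_adjust(0, 1) == 32);
    assert(frame_adjust(8, 0) == 56);
}
```

So sub rsp,20h is only enough when an odd number of registers has been pushed first, which is why an RSP-based frame without push rbp needs the extra 8 bytes.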
Quote from: TWell on March 24, 2017, 08:39:55 PM
getSumC PROC
jmp sub1 ; 0000 _ E9, 00000000(rel)
getSumC ENDP
Here is the proof that these High-level compilers are artificial intelligence driven. :greenclp:
Hi TWell :biggrin:
Thanks for your support!
In your last example you are showing so-called tail call elimination.
John and I discussed it some time ago and concluded that it is not smart to jump out of the function: it can cause hard-to-find bugs. In our opinion it is much safer to use a call.
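For anyone unfamiliar with the term, here is a minimal C++ sketch of what "tail position" means. The function names are hypothetical; whether a given compiler actually emits a jmp depends on its optimizer settings, as with cl.exe -Ox in TWell's listing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical callee, same arithmetic as sub1 in the thread.
std::int64_t sub1_like(std::int64_t* dest, const std::int64_t* src,
                       std::int64_t val1, std::int64_t val2) {
    *dest = *src + val1 + val2;
    return *dest;
}

// The call is the last thing the function does, so it is in tail position.
// An optimizing compiler may replace `call` + `ret` with a single `jmp`.
std::int64_t forward_tail(std::int64_t* dest, const std::int64_t* src,
                          std::int64_t val1, std::int64_t val2) {
    return sub1_like(dest, src, val1, val2);
}

// Work remains after the call (the + 1), so it is not in tail position and
// a plain `jmp` to the callee would not be a valid replacement.
std::int64_t forward_then_add(std::int64_t* dest, const std::int64_t* src,
                              std::int64_t val1, std::int64_t val2) {
    return sub1_like(dest, src, val1, val2) + 1;
}

// Both forms give the same stored result; only the return path differs.
void check_tail_forms() {
    std::int64_t src = 1000;
    std::int64_t dest = 0;
    assert(forward_tail(&dest, &src, 1, 2) == 1003);
    assert(forward_then_add(&dest, &src, 1, 2) == 1004);
    assert(dest == 1003);
}
```

Using a call instead of a jmp keeps the forwarding function's own frame on the stack until the callee returns.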